@LSgeo
Last active November 17, 2021 06:58

Learning Git

Git is awesome! Git is widely used! Knowing Git WILL make you sexier!

The git book is here, and contains documentation and the official write up on using git.

Having read several tutorials and write ups before this one, you'll know Git is a Version Control System. You may also know:

  • Git and GitHub are independent, but go hand in hand. There are alternatives, such as GitLab!
  • A repository on your local machine can be synchronised with a repository on the remote, which in most cases is GitHub.
    • This occurs most commonly with the git push and git pull commands, but there are other useful ones!
  • An organised Git history is both a backup of your code, and a journal that can be used to record your entire development process.

This guide is my personal record of learning to keep an organised Git repository for my projects.


In my own words:

The command line

  • More or less, learn the command line. You're a dev, and it's simple and powerful. Knowing what is going on under the hood will make you better at using Git.
  • Optional: Set up a commit message editor in your IDE of choice. When you run a command such as git commit without specifying a message, or git rebase -i, your IDE will open a text edit session that lets you easily edit the messages, with all the power of your IDE.

IDE integration

  • As above, use your IDE to edit commit messages, and follow a commit message style guide when formatting them.
    • For VSCode, put this in your gitconfig:
      [core]
          editor = code --wait
      [merge]
          tool = vscode
      [mergetool "vscode"]
          cmd = code --wait $MERGED
      [diff]
          tool = vscode
      [difftool "vscode"]
          cmd = code --wait --d
    • Use this command to find your git config settings: git config --list --show-origin
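The same settings can also be written from the command line. This is a minimal sketch, assuming VS Code's `code` launcher is on your PATH. It writes to the local config of a scratch repository so it is safe to try; swapping `--local` for `--global` would apply it to every repository. The difftool command in the snippet above looks cut off, so the `--diff $LOCAL $REMOTE` arguments below are an assumption, not the author's exact line.

```shell
set -e
repo="$(mktemp -d)"        # scratch repository, so nothing global is touched
cd "$repo"
git init -q

git config --local core.editor "code --wait"
git config --local merge.tool vscode
git config --local mergetool.vscode.cmd 'code --wait $MERGED'
git config --local diff.tool vscode
# Assumption: the original difftool command is truncated; these arguments are a guess.
git config --local difftool.vscode.cmd 'code --wait --diff $LOCAL $REMOTE'

# Show what was written and which file it lives in.
git config --local --list --show-origin
```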

Now is a good time to mention git credential helpers - a convenient way to avoid typing your username and password all the time! The (potentially better) alternative is to use SSH keys and the ssh-agent. GitHub provides useful documentation on generating and using SSH keys.

Keep it simple stupid! .gitignore

Use the .gitignore file to filter what files are tracked by git in your repository.

In general, you should only track your text files (i.e. code, documentation) with git. Any data or external modules that are used or created by your project should be excluded from your repository history. This doesn't mean they shouldn't be made available - just that you should link the original source, rely on the project to generate them, or publish your end result and link to it.

Yes

  • Code
  • Documentation
  • Package management: requirements.txt or environment.yml
  • README.md, LICENSE, .gitignore, etc
  • Images are okay for linking in documentation

No

  • Input and output data
    • GitHub has a 100 MB limit per file
    • Link the source or publish your data elsewhere.
  • Machine generated files
    • Rely on your project to create these files when required.
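The Yes/No lists above can be sketched as a small `.gitignore`. The file and folder names (`analysis.py`, `data/`, `__pycache__/`) are hypothetical stand-ins for a typical Python project, and everything happens in a throwaway directory.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Files worth tracking: code and the dependency list.
echo 'print("hello")' > analysis.py
echo 'numpy' > requirements.txt

# Files to keep out of history: data and machine-generated output.
mkdir data __pycache__
echo '1,2,3' > data/input.csv
touch __pycache__/analysis.cpython-39.pyc

cat > .gitignore <<'EOF'
# Input and output data
data/
# Machine-generated files
__pycache__/
*.pyc
EOF

# Only .gitignore, analysis.py and requirements.txt show up as untracked.
git status --short
```

`git check-ignore -v data/input.csv` reports which pattern is responsible if a file is unexpectedly ignored.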

Adding to your commit

git add When you save your work to your computer, git identifies that the file has changed. To tell git that you want those changes included in the next commit, you stage them with git add changed_file.txt. You can use git add -u (tracked files only) or git add --all (everything, including untracked files) if you are lazy, or know what you are doing. I recommend making use of an interactive add, either through your IDE, or with git add -i (or git add -p to go straight to choosing individual hunks). This allows you to stage specific changes in your files. More on why you would want to do this in a sec. Once you have added your changes, they become staged, ready for a commit.
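A small sketch of how staging behaves, run in a scratch repository. The file names and the identity set with `git config` are placeholders.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second draft" >> notes.txt   # modify a tracked file
echo "scratch" > todo.txt          # create an untracked file

git add -u            # stages changes to tracked files only (notes.txt)
git status --short    # notes.txt is staged, todo.txt is still untracked

git add --all         # stages everything, including todo.txt
git status --short
```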

Committing to your repository

git commit When you have finished a specific change, be it a new feature, a work in progress that you want to back up, or any simple change that should be included in your repository, you commit it. This records everything that you have staged with git add. A commit should have a concise subject line, and a short description of why you made the change. A description of what changed is intrinsic to the commit, as it already records the file names and content of the change. The subject should be written in the imperative, as if you were telling the code what to do. For example, Add paragraph on committing to your repository. A more complete style guide is available from Chris Beams.
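As a rough illustration of a subject plus body, the sketch below passes `-m` twice: git uses the first as the subject line and the second as the body paragraph. The file name and the wording of the body are made up for the example.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity for the scratch repo
git config user.email "dev@example.com"

echo "Commit when a unit of work is done." > guide.md
git add guide.md

# First -m: concise, imperative subject. Second -m: the body, explaining why.
git commit -q \
  -m "Add paragraph on committing to your repository" \
  -m "Readers asked when a change is ready to be recorded, so explain it."

# Print the subject, a blank line, then the body.
git log -1 --format='%s%n%n%b'
```

Running `git commit` with no `-m` opens your configured editor instead, which is usually nicer for longer bodies.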

When to commit

There are a few stages to recording your work: saving it to disk, committing it to your local repository, and merging or pushing it to a remote. With each step, it becomes progressively harder to revert the change or remove it from the history.

  • Saving should occur all the time. Autosave whenever you run a script or change windows. File changes are recorded on disk, but not in your git history.
  • Adding and committing should occur as part of the same process. Commit when you want to record your changes in the repository's history. For example, you've progressed a single cohesive unit of work, such as a new function, a bugfix, revised typos, etc. Work in progress commits are fine, but you will likely want to squash them into a single commit before pushing or merging (see git rebase or git commit --amend).
  • Pushing and/or merging is the final chance to simplify your changes or add to a commit. Once pushed or merged, modifying the history becomes a nightmare of merges and forced commands, compounded if other developers have pulled your changes. You should write a detailed description for your merges! Pushing simply reflects your local changes on the remote.

Repository vs Remote vs Upstream

Repository is local (your current device), remote is elsewhere (e.g. GitHub, GitLab, office computer), and upstream is a repository that your project references, but doesn't affect. Often you will have push/pull (read/write) access to the first two, but lack permissions to push to upstream. You are welcome to propose changes to upstream by means of a pull request.
Something that is not immediately apparent is that when you make multiple commits/merges/changes to your local repository, none of that history reaches the remote until you git push.
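To see that local commits stay local until pushed, here is a sketch that uses a bare repository on disk as a stand-in for GitHub. The directory names `remote.git` and `local` are arbitrary, and cloning an empty repository prints a harmless warning.

```shell
set -e
work="$(mktemp -d)"
cd "$work"

git init -q --bare remote.git      # stand-in for the remote
git clone -q remote.git local      # warns that the repository is empty
cd local
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo a > a.txt; git add a.txt; git commit -q -m "Add a"
echo b > b.txt; git add b.txt; git commit -q -m "Add b"

# Two local commits exist, but the remote still has no branches.
echo "remote branches before push: $(git ls-remote --heads origin | wc -l | tr -d ' ')"

git push -q origin HEAD            # send the current branch to the remote

echo "remote branches after push: $(git ls-remote --heads origin | wc -l | tr -d ' ')"
git -C ../remote.git log --oneline --all
```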

Branch Mastery

The primary branch, typically named master (or, if your project was created more recently, main), is the public-facing side of your project. It's what we want to keep organised, and ready to run by anyone who uses your project. It comes by default when you start a new repository!

The master branch

Commits and merges to master should follow the ASD principle: Atomic, Specific, and Documented.

Atomic means when you push to the master branch, the resulting code is still a complete, ready to run project. You can also think of it as 'stateful'.
Specific means the commit/merge addresses a specific purpose. Maybe it's a new or revised feature, or maybe it's a bugfix or a documentation update. The reason for this is ease of reverting changes.
Documented refers to a well written message. This means a title and body contents, formatted appropriately. Some developers use emojis in the titles as a stamp for different commit purposes!

Other branches

git checkout -b new_branch It is free and easy to create and remove new branches. A branch is a working copy of your code that you might develop on, or preserve for different feature streams. They are also fundamental to working with other contributors.
When you want to work on a new feature, simply check out a new branch, commit interim additions to it, and when it's ready to be used in the main branch, merge the two and delete the finished branch. For more complicated development environments, it is good practice to merge master into your feature branch (i.e., git checkout feature_branch then git merge master), resolve conflicts, then fast-forward merge the result back to master. This way you can run tests on the new feature branch that includes master, without the actual master being in a potentially buggy state.
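The branch workflow described above might look like the following in a scratch repository. The file names are hypothetical, and `trunk` is just a shell variable holding whatever your git calls the default branch (master or main).

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
trunk="$(git symbolic-ref --short HEAD)"  # master or main, depending on your setup

echo "core" > core.txt; git add core.txt; git commit -q -m "Add core module"

# Start the feature on its own branch.
git checkout -q -b feature_branch
echo "feature" > feature.txt; git add feature.txt; git commit -q -m "Add feature"

# Meanwhile, the primary branch moves on.
git checkout -q "$trunk"
echo "hotfix" > fix.txt; git add fix.txt; git commit -q -m "Fix bug on primary branch"

# Bring the primary branch into the feature branch; run your tests here.
git checkout -q feature_branch
git merge -q --no-edit "$trunk"

# Fast-forward the primary branch to the tested result, then clean up.
git checkout -q "$trunk"
git merge -q --ff-only feature_branch
git branch -q -d feature_branch

git log --oneline --graph
```

Using `--ff-only` makes the final merge fail loudly if the primary branch changed again in the meantime, rather than silently creating another merge commit.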

A repository within a repository...

Subtrees, submodules and subrepos.... Still don't understand what's best. If it's version controllable elsewhere...?

Cloning, Forking and Collaborating

You interact with a git project by cloning or forking. Cloning copies the repository to your machine, and you can keep it up to date with development as it progresses using git pull. Fork if you want to contribute back at some point: a fork is your own copy on the hosting service (e.g. GitHub) that you can push to and open pull requests from.
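A sketch of the fork layout, with bare repositories on disk standing in for the hosting service: `upstream.git` plays the original project and `fork.git` plays your fork. All names here are hypothetical, and a real fork would be created through the hosting service's web interface rather than with `git clone --bare`.

```shell
set -e
work="$(mktemp -d)"
cd "$work"

# The original project, seeded with one commit.
git init -q --bare upstream.git
git clone -q upstream.git seed     # warns that the repository is empty
cd seed
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README.md; git add README.md; git commit -q -m "Add README"
git push -q origin HEAD
cd ..

# Imitate forking: a second server-side copy that belongs to you.
git clone -q --bare upstream.git fork.git

# Clone your fork to work locally; it becomes the remote named origin.
git clone -q fork.git mine
cd mine

# Add the original project as a second remote so you can follow its progress.
git remote add upstream "$work/upstream.git"
git fetch -q upstream
git remote -v
```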

Rebase

git rebase -i Ever make a bunch of work in progress commits? Rebase lets you browse your git history and rewrite it! Dangerous but powerful. My favourite use case is to squash commits. Squashing lets you combine different commits together, and tidies up the history. Note, if you have pushed to a remote, this can cause issues! It may also require some merge conflict resolutions if you reorder too many things!
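Because `git rebase -i` normally opens an editor, the sketch below uses a non-interactive variation rather than the manual squash described above: the work in progress commits are created with `--fixup`, and `--autosquash` arranges the todo list so they fold into the feature commit. Setting `GIT_SEQUENCE_EDITOR=true` accepts that list unedited. File names and messages are placeholders.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "base" > app.txt; git add app.txt; git commit -q -m "Add app skeleton"
echo "feature v1" >> app.txt; git commit -q -a -m "Add export feature"
target="$(git rev-parse HEAD)"            # the commit the fixups belong to

# Two work in progress commits, each marked as a fixup of the feature commit.
echo "feature v2" >> app.txt; git commit -q -a --fixup "$target"
echo "feature v3" >> app.txt; git commit -q -a --fixup "$target"

git log --oneline      # four commits

# Rewrite the last three commits, folding the fixups into the feature commit.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~3

git log --oneline      # two commits: the skeleton and one tidy feature commit
```

Dropping `GIT_SEQUENCE_EDITOR=true` opens the todo list in your editor, where changing `pick` to `squash` or `fixup` by hand gives the same result.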

Amend

git commit --amend Typo in your commit? Made another quick revision that should have been in the last commit? With an amend commit, you can add any changes, wrapped under the last commit! Just don't amend pushed commits.
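A short sketch of folding a quick fix into the previous commit, in a scratch repository with placeholder names. `--no-edit` keeps the existing message; leave it off to reword the message at the same time.

```shell
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "Helo world" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Fix the typo and wrap it into the last commit instead of making a new one.
echo "Hello world" > greeting.txt
git add greeting.txt
git commit -q --amend --no-edit

git log --oneline      # still a single commit
```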


I want to... use case scenarios


Great repository examples:

Great articles on using git
