abstractmachines/git-workshop.md

## git-workshop.md

      
    git Workshop

Motivation for this workshop

The majority of blogs and documentation about git are one of two things:

technically correct while being very difficult for newbies to digest,

OR

technically incorrect "guides" which help people learn, but miss many CS fundamentals and details and/or advise learners to "avoid/ignore" "advanced" features which software engineers use daily on the job.

This workshop serves to instruct on the industry usage of git using methods and theory that are agreed-upon by most senior engineers, and to do so while covering CS and SWE fundamentals. As such, this workshop will address the features of git which many blogs state are "too hard to use" such as rebasing, as these git workflows are considered a fundamental practice of intermediate git operation when collaborating on teams.
git vs. GitHub

git is a command line tool for version tracking created by Linus Torvalds. GitHub is not git. GitHub is a website that provides a GUI (Graphical User Interface) for git features and functionality, but it is not necessary to use GitHub in order to use git.
git is the most popular version control tool used by developers today. In previous eras, SVN (Subversion) was a widely used version control tool.
Always read the docs

man git or git help

An online version of the git man page: https://git.github.io/htmldocs/git.html
git scm

A great instructional site with awesome graphics. (SCM stands for Source Code Management).
https://git-scm.com/
git workflows

https://git.github.io/htmldocs/gitworkflows.html
git howtos

https://git.github.io/htmldocs/howto-index.html
https://news.ycombinator.com/item?id=3762710
git user guide

https://www.kernel.org/pub/software/scm/git/docs/user-manual.html
Software Engineering tips


Avoid merge commits, as they are noisy.
Don't rebase history that other engineers are working on, or you may make them sad, angry, or worse.

Software Engineering and CS Fundamentals : what is git?

git's commit history is a directed acyclic graph (DAG)

Remember Google Maps? Dijkstra's "shortest path" algorithms continue to serve us.
Princeton Algorithms class (available as a MOOC on Coursera with Sedgewick, highly recommended):

https://algs4.cs.princeton.edu/lectures/42DirectedGraphs.pdf
Decent blog posts : what is git?

Blog posts from industry engineers that I've collected for this workshop. These blog posts don't seem to say anything outwardly incorrect (a few blogs out there try to make git seem really simple, and by doing so, summarize the features improperly).


git is a purely functional data structure


git is a Directed Acyclic Graph (with great graphics)


what is detached head state and how do you get there


From the Jayway post: functional programming helps with concurrent resource access (and multithreading) .... and collab

"If ... someone was in the process of iterating through that [old, now-mutated] list, they now get a nice exception."


So, it's pretty clear that functional data structures are great for concurrent access to the same memory!
It's also really clear that functional data structures are great for multithreaded applications!
They fit use cases where you want to update data without affecting what "other people" (and things) are doing.
That applies directly to collaborating with multiple people on a code repository.
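
You can watch this "never mutate, always create" behavior in git itself. The sketch below is a minimal illustration, assuming a throwaway repository made with mktemp; the identity, file name, and commit messages are placeholders invented for the example. Save it as a script and run it with sh.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repo; the path is arbitrary
cd "$repo"
git init -q
git config user.name "Workshop"        # placeholder identity for this repo only
git config user.email "workshop@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "first draft"
old=$(git rev-parse HEAD)

# "Edit" the commit. git writes a NEW commit object; the old one is not mutated.
git commit -q --amend -m "first draft, reworded"
new=$(git rev-parse HEAD)

echo "old: $old"
echo "new: $new"

# The old object is still in the object database, exactly as it was.
old_type=$(git cat-file -t "$old")
old_subject=$(git log -1 --format=%s "$old")
echo "old object is still a $old_type with subject: $old_subject"
```

Anyone who still holds the old hash keeps seeing the old commit, which is the same property the quote describes for immutable lists.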

Commits

A commit is a snapshot

That snapshot records the current state of your files plus metadata (commit hash, commit message, author, time, pointer to parent...)
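
To see the snapshot and its metadata for yourself, you can print a raw commit object. This is a sketch in a temporary repository; the identity, file name, and messages are made up for illustration.

```shell
set -e
repo=$(mktemp -d)                      # throwaway repo
cd "$repo"
git init -q
git config user.name "Workshop"        # placeholder identity
git config user.email "workshop@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "A"
echo "two" >> file.txt
git commit -q -am "B"

# A commit object lists: the tree (the file snapshot), the parent commit,
# author and committer lines with timestamps, and then the message.
commit_object=$(git cat-file -p HEAD)
echo "$commit_object"
```

The hash itself is not stored inside the object; git computes it from the object's content.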
Visual commit workflow:

C is our current commit, so it's where our branch pointer currently points

C +---> B +---> A
C is master
We add a new commit, D.

git does the following:

moves current branch pointer
makes master point to D, the new / current commit
C, the tip of the history C -> B -> A, is the "parent" of our new commit D. That parent is master^, and D is master.

D +--->   C +--->   B +--->   A
D is master
C is master^ and C is the parent of D
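
The same A, B, C, D story can be replayed in a scratch repository. This is a sketch, not part of the workshop repo: the temp directory, identity, and file name are assumptions, and git symbolic-ref is used only to make sure the first branch is called master whatever your local defaults are.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

for name in A B C; do
  echo "$name" > file.txt
  git add file.txt
  git commit -q -m "$name"
done
c_hash=$(git rev-parse master)             # master currently labels C

echo "D" > file.txt
git commit -q -am "D"                      # master "scoots along" to D

d_parent=$(git rev-parse 'master^')        # the parent of whatever master labels now
tip_subject=$(git log -1 --format=%s master)
git log --oneline                          # newest first: D, C, B, A
```

Here d_parent is the hash that master held before the commit, which is the "C is the parent of D" statement in code form.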
Drawing the "flow" of commits: arrows point at prev, not next!


Make the drawings correct: arrows point "back in time," at the previous commit. Check out this helpful graphic.

Previous commit is the parent of current commit


The git workflow will look at the "previous commit" as the "parent" of current commit. Check this out. You'll notice in one of the drawings that "the current commit is master and the previous commit is master^, the parent of master." That really makes sense!

Commit labels


A commit can have many labels, including its hash, pointer to previous commit (its parent), author, time, etc.
HEAD is a label for "currently active commit".

Checking out a branch name: a branch name is a label for a commit


A branch name is just a label for a commit. You can git checkout 0289789c or git checkout branch-name, right? Try it.
So you can check out either a branch-name, OR a commit by its hash.
So it makes sense to conclude that a branch-name is related to a commit.
Hence, a branch-name is pretty much a label for a commit.
Checking out a branch name or a commit makes the label HEAD point at that commit.

Digging deeper: the .git dir and refs/heads branch

The .git directory in your repo root will have a text file called HEAD that shows you what branch you're checked out on.

git checkout master
Navigate to .git directory and cat HEAD. You'll see something like:

$ cd ~/arepo
$ ls -al
$ cd .git
$ cat HEAD 
ref: refs/heads/master
$


Looks like you're on master!
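
You can follow the trail one step further: HEAD names a ref, and the ref file holds a commit hash. A small sketch, assuming a fresh temporary repository (the identity and file name are placeholders) and git's default file-based ref storage.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

echo "hello" > file.txt
git add file.txt
git commit -q -m "first commit"

head_file=$(cat .git/HEAD)                 # which branch HEAD refers to
ref_file=$(cat .git/refs/heads/master)     # the commit hash that master labels
resolved=$(git rev-parse HEAD)             # git following the same trail for us

echo "HEAD file:   $head_file"
echo "master file: $ref_file"
echo "rev-parse:   $resolved"
```

In an older repository the ref may live in .git/packed-refs instead of its own file, so git rev-parse is the dependable way to ask.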

"Normal workflow": Checkout a branch

Use when: You want to write code and make commits that you can save.


Checkout a branch
git "scoots" the HEAD pointer along. Now, "you're at HEAD."
The details:
In another terminal window, git checkout -b new-branch.
In first terminal window (where you're in the .git directory), run the cat HEAD again. You'll see something like:

$ cat HEAD 
ref: refs/heads/new-branch


Clearly your HEAD is at new-branch. Recall that as you make new commits, HEAD and the checked-out branch (now new-branch, no longer master) will "scoot along." Recall also that HEAD and branch-name are just labels for commits.
Does "normal workflow" include "going to .git directory and doing a cat on HEAD text file to see refs?" Nope. I'm just providing an in-depth way to examine "what is going on under the hood with git." Getting into the .git directory is a part of understanding git like an engineer.

"Detached Head State workflow": Checkout a specific commit by HASH

Use when: You want to explore and do stuff that you're NOT going to save.


Checkout a commit by its hash
git does NOT "scoot" any branch pointer along. HEAD points straight at the commit, so "you're NOT on a branch."
Repeat the process above, but check out a specific commit by HASH. (Use git log to grab a hash).
Once you checkout that commit, git will warn you that you're in detached head state, and this:

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

... amongst other things, is what's going on.
Why can we make commits without affecting other branches? Well, because in Detached Head State, HEAD points straight at a commit rather than at a branch name, so no branch pointer moves when you commit.
You're not on a branch, so the commits that you make while in Detached Head State don't get saved to any branch unless you create one for them (for example with git checkout -b new-branch-name). But committing while in Detached Head State isn't really a common workflow.
If you cat the HEAD text file, you'll see a commit hash instead of a ref (shortened here; the real file holds the full hash):
$ cat HEAD
987654321
~/arepo/.git ((HEAD detached at 987654321)*) $ 

Note the listing of a commit hash rather than a branch is another indication that you're in Detached Head State.
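
Here is a sketch of the whole detached-HEAD round trip in a scratch repository. The temp directory, identity, file names, and the branch name keep-experiment are all made up for the example.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

echo "A" > file.txt
git add file.txt
git commit -q -m "A"
echo "B" > file.txt
git commit -q -am "B"

first=$(git rev-parse 'master^')
master_before=$(git rev-parse master)

git checkout -q "$first"                   # check out a hash: detached HEAD
head_file=$(cat .git/HEAD)                 # a bare hash, not "ref: refs/heads/..."

echo "scratch" > scratch.txt
git add scratch.txt
git commit -q -m "experiment"              # allowed, but no branch label moves
experiment=$(git rev-parse HEAD)
master_after=$(git rev-parse master)

# To keep the work, give it a branch name before checking out something else.
git checkout -q -b keep-experiment
kept=$(git rev-parse keep-experiment)
echo "master unchanged: $master_before = $master_after"
```

Without that last checkout -b, the experiment commit would have no label pointing at it once you moved away.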
Checking out new commits: what happens to branch name and HEAD?


These "scoot along" as you make new commits (see the helpful graphics at https://medium.com/girl-writes-code/git-is-a-directed-acyclic-graph-and-what-the-heck-does-that-mean-b6c8dec65059).

Simple Collaboration Workflow : Branching and Merging

See the git SCM site for more details on branching and merging


git branch
The concept of branching is fairly simple and has been covered, see the above link for details.


git merge:
We're going to discuss how merging works on a team using GitHub.
There are indeed CLI merge strategies and strategy options (ours, theirs, et al.). But you probably won't use those on a team.


Instead, you'll be following this kind of workflow.

Workshop team activity: Pull, commit, push, Github PR, merge
Pull down master from remote (more on git pull later and why it can be a problem)
Install packages (think of this as static and/or dynamic linking and loading, if you're a compiler person)
create a new branch off of master
now your HEAD is at the new branch
Write code locally, save it, and commit it.
Push your branch up to the remote repo.
Request a PR Review from your teammates.
Address issues. Push up new commits as needed.
After approval, merge the branch.
That merge will create what is called a "merge commit": a commit with two parents that records the joining of two lines of history. Merge commits create extra "noise" in the commit history of your repository. git pull (a fetch followed by a merge, unless configured otherwise) can create them too. That's why a lot of engineers use git fetch followed by an explicit git rebase or git merge instead of using git pull. More on Stack Overflow about git pull
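
A merge commit is easy to produce locally. In this sketch a plain git merge stands in for the GitHub merge button, and the temp directory, identity, branch name, and file names are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "base"

git checkout -q -b feature                 # your branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "feature work"

git checkout -q master                     # meanwhile, master moves on
echo "team" > team.txt
git add team.txt
git commit -q -m "teammate work"

# The two histories diverged, so git has to record a commit with two parents.
git merge -q --no-edit feature
parents=$(git log -1 --format=%P master | wc -w)
echo "parents of the tip of master: $parents"
git log --oneline --graph
```

The extra commit and the fork in the graph are the "noise" people mean.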

Great, that was pretty easy.
Use cases in which that may not suffice:

What about if you have a branch for longer than a day? (a long term feature development branch, for example).
Tons of commits will happen to master by the team between the time you pull down changes, and the time you push up your branch for a PR.
What about if your teammate merges some code into master that you need for your local branch/work?
In that case, you want to "update the master/commit" that YOUR branch, branches off of.
You need to "update" your local working repository and "sync" it with the latest version of master.

Remember how I mentioned all those blog posts by developers that tell you to "just ignore git rebase because it's hard"?
Yeah, that's what we are going to do: git rebase.

From your branch:
git pull -r origin master
What's this do? A rebase pull, instead of a fetch && merge pull like above.
-r is short for --rebase.
We are syncing up with the master branch on the remote named origin.
This will pull in the latest changes from master and then "replay" YOUR local commits "on top of" the master branch commits.
Like this: (note that this is not syntactically correct, just conceptual):

$ git log

your commit yesterday 982374328
comment rad stuff

your commit the day before yesterday 47328472831798
comment even radder stuff

your teammate's commit today
comment super awesome stuff

your teammate's commit yesterday
comment super super great stuff

... See? The log is no longer in chronological order: your commits sit "on top of" your teammate's, even the ones you wrote earlier.
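
The rebase pull can be tried end to end without a real server. In this sketch a second local repository named upstream plays the part of origin; the directory names, identities, branch name my-feature, and file names are all invented for the example.

```shell
set -e
work=$(mktemp -d)                          # scratch area for both repos
cd "$work"

# "upstream" stands in for the shared remote.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Teammate"            # placeholder identity
git config user.email "teammate@example.com"
echo "base" > base.txt
git add base.txt
git commit -q -m "base"
cd ..

# Your clone; "origin" now points at upstream.
git clone -q upstream mine
cd mine
git config user.name "You"                 # placeholder identity
git config user.email "you@example.com"
git checkout -q -b my-feature
echo "1" > mine.txt
git add mine.txt
git commit -q -m "my commit 1"
echo "2" >> mine.txt
git commit -q -am "my commit 2"

# Meanwhile a teammate lands work on master.
cd ../upstream
echo "team" > team.txt
git add team.txt
git commit -q -m "teammate commit"

# Fetch master, then replay your two commits on top of it.
cd ../mine
git pull -q -r origin master
history=$(git log --format=%s)
echo "$history"
```

The log lists your two commits first, then the teammate commit, then base: a straight line with no merge commit.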
Long Term Feature Development Branch Collaboration Workflow : Interactive Rebase and Conflicts


Workshop team activity: Interactive rebase, resolve merge conflicts
Do not do this on a public or shared branch. Avoid making people sad, angry, or worse.
You may also need to create a fresh branch based on upstream.
You may also run into a conflict, which you'll need to resolve using git rebase --continue, git rebase --skip, and commits, all intertwined together, as covered here in git scm and here on Hacker News.
See also: interactive rebase
Don't be scared of git rebase
Make sure you force push your branch up, and remember, other people can't be working on it, or they'll be sad, angry, or worse after you rewrite history!
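
A rebase conflict is less scary once you have walked through one on purpose. This sketch forces a conflict in a scratch repository and resolves it; the temp directory, identity, branch name, and file contents are assumptions. It uses a plain rebase rather than an interactive one so that it can run unattended.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

echo "base version" > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b feature
echo "feature version" > file.txt
git commit -q -am "feature edit"

git checkout -q master
echo "master version" > file.txt
git commit -q -am "master edit"

git checkout -q feature
# Both sides changed the same line, so the rebase stops with a conflict.
if git rebase master >/dev/null 2>&1; then
  echo "expected a conflict but the rebase succeeded"
  exit 1
fi
markers=$(grep -c '^<<<<<<<' file.txt)     # conflict markers are now in the file

# Resolve by hand (here: just write the content we want), stage, continue.
# git rebase --skip would drop the conflicting commit instead.
echo "feature version, on top of master version" > file.txt
git add file.txt
GIT_EDITOR=true git rebase --continue >/dev/null   # accept the commit message as is

history=$(git log --format=%s)
echo "$history"
```

After the rebase, feature has new commit hashes, which is why a force push is needed if the branch was already pushed.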

Production Workflow : Hotfix


Workshop team activity: Do a hotfix


from git scm: "At this stage, you’ll receive a call that another issue is critical and you need a hotfix.
You’ll do the following:

Switch to your production branch.
Create a branch to add the hotfix.
After it’s tested, merge the hotfix branch, and push to production.
Switch back to your original story and continue working."
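
Those four steps map onto git commands roughly as follows. This is a sketch under assumptions: master plays the production branch, the branch names story and hotfix are made up, and the push step is left as a comment because the scratch repository has no remote.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # name the first branch master
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "base"

git checkout -q -b story                   # the work you are in the middle of
echo "wip" > story.txt
git add story.txt
git commit -q -m "story work in progress"

git checkout -q master                     # 1. switch to the production branch
git checkout -q -b hotfix                  # 2. create a branch for the hotfix
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "hotfix: critical issue"

git checkout -q master                     # 3. after testing, merge the hotfix
git merge -q hotfix                        #    (fast-forward: master had not moved)
git branch -q -d hotfix
# git push origin master                   #    would publish it; no remote here

git checkout -q story                      # 4. back to the original story
master_subject=$(git log -1 --format=%s master)
current_branch=$(git symbolic-ref --short HEAD)
echo "master is at: $master_subject"
echo "you are on:   $current_branch"
```

The story branch does not contain the fix yet; rebasing it onto master later brings the fix in.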


Software releases: semver, tagging, CI


Semantic versioning semver.org
npm run release:patch / major / minor et al
Verify that the semver bump worked. Look at the Changelog if one exists (hopefully you made one!)
git tags to deploy to CI (git push && git push --tags is one option)
Build artifacts covered by CI
What is continuous integration
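
Tagging a release by hand looks something like this. It is a sketch only: the version numbers, identity, and file names are invented, and your project's npm release scripts may create the tags for you. The push commands are comments because the scratch repository has no remote.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repo
cd "$repo"
git init -q
git config user.name "Workshop"            # placeholder identity
git config user.email "workshop@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "initial release"
git tag -a v1.0.0 -m "release 1.0.0"       # annotated tag on the current commit

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "fix: patch a bug"
git tag -a v1.0.1 -m "release 1.0.1"       # patch bump under semver

tags=$(git tag --list 'v*' | tr '\n' ' ')
latest=$(git describe --tags --abbrev=0)   # nearest tag reachable from HEAD
echo "tags:   $tags"
echo "latest: $latest"

# git push && git push --tags              # would publish commits and tags
```

A tag is one more label for a commit, like a branch name, except that it does not "scoot along" when you commit.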

Other materials

Distributed version control : some graphic representations

http://ericsink.com/vcbe/html/basics_clone.html
PSU tutorials on git

PSAS git tutorial

https://github.com/psas/psas-git-workshop
Industry articles and repos

Most commonly used tips and tricks

https://github.com/git-tips
https://github.com/git-tips/tips#track-upstream-branch
Why you shouldn't use git pull:

https://longair.net/blog/2009/04/16/git-fetch-and-merge/
Difference between git pull and git fetch

https://stackoverflow.com/questions/292357/what-is-the-difference-between-git-pull-and-git-fetch
git fu : intermediate git

https://www.raywenderlich.com/74258/git-tutorial-intermediate
Basic exercises

https://www.learnenough.com/git-tutorial
Note: this site says to ignore git rebase. We will most definitely not be following that advice. Merge commits are noisy and to be avoided.