@simoncollins
Last active December 19, 2015 23:19
An introduction to the internals of a Git repository by building up a basic initial commit using Git's lower level plumbing commands.
# References:
# http://git-scm.com/book/ch1-3.html
# http://git-scm.com/book/en/Git-Internals
# Create a directory for our repository and move into it
mkdir test
cd test
# Initialise an empty Git repository in the directory
git init
# Create our first file
echo 'Hello World!' > helloworld.txt
# Compute the SHA-1 hash Git would give this file as a blob. It will be the same on any computer
# as long as the file content is identical. Nothing is stored in the repository yet
# Should return 980a0d5f19a64b4b30a87d4206aade58726b60e3
git hash-object helloworld.txt
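# A minimal sketch of where that hash comes from, assuming the default SHA-1 object format:
# Git hashes a short header ("blob", a space, the content length in bytes, then a NUL byte)
# followed by the file content. 'Hello World!' plus the newline added by echo is 13 bytes.
# sha1sum is assumed to be installed (on macOS, shasum prints the same digest)
manual_hash=$(printf 'blob 13\0Hello World!\n' | sha1sum | cut -d' ' -f1)
git_hash=$(printf 'Hello World!\n' | git hash-object --stdin)
# Both lines should print the same 40-character hash
echo "$manual_hash"
echo "$git_hash"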
# Git stores its objects under .git/objects
# Let's check if there are any objects already in our repository ... nope
find .git/objects -type f | sort
# Let's actually store this file as a blob in the repository
git hash-object -w helloworld.txt
# Now let's check for objects again
find .git/objects -type f | sort
# There's the object we added. Git stores objects in subfolders named after the first two characters in the object hash
# Note we've simply stored our file as a blob. We don't have any history
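# A small sketch of that layout: the first two characters of the hash name the subfolder and the
# remaining 38 name the file inside it. This applies to loose objects like ours. Git may later
# pack objects into .git/objects/pack instead. The variable names are just for illustration
blob_hash=980a0d5f19a64b4b30a87d4206aade58726b60e3
object_dir=$(printf '%s' "$blob_hash" | cut -c1-2)
object_file=$(printf '%s' "$blob_hash" | cut -c3-)
object_path=".git/objects/$object_dir/$object_file"
# Prints .git/objects/98/0a0d5f19a64b4b30a87d4206aade58726b60e3
echo "$object_path"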
# Verify that the file is stored in the repository using the cat-file command
git cat-file -t 980a0d
# That tells us it's a blob. Now we can show the contents with cat-file
git cat-file blob 980a0d
# If we want to actually make a commit object to store the current state of our work we'll need to do the following:
# 1. Add our file to Git's index so that it can be part of the snapshot tree Git associates with the commit
# 2. Write out the contents of the index into a tree object in the repository
# 3. Create a commit object associated with that snapshot tree
# 4. Point the head of our master branch at that commit object
# The index is currently empty. Use the ls-files command to check. The --stage option shows files staged in the index
git ls-files --stage
# Add the file to the index with the update-index command. The --add flag tells Git to add the file to the index
git update-index --add helloworld.txt
# Running git ls-files --stage again now shows our file is in the index. Next let's write a snapshot
# of the index as it currently stands into the repository as a tree object
git write-tree
# write-tree prints the hash of our new tree: a52ced62461ab18f21578ed3b154283f8305efa3
# We can see we now have two objects in our repository, the blob and the tree
find .git/objects -type f | sort
# cat-file -t tells us the SHA is for a tree object. We can also use ls-tree to show the contents of the tree
# which will be one or more blob objects or nested tree objects
git cat-file -t a52ced
git ls-tree a52ced
# Next we create a commit object that references our snapshot tree. It prints our commit hash
# c4f6ecae50a19b181994454c0713b8d495f6ccc3 (note that, unlike the other hashes, this will differ from
# what you get, as it depends on information in the commit such as the timestamp and author name).
# Use your own commit hash wherever c4f6ec appears in the commands that follow.
# We can also see that our objects directory now has three objects
git commit-tree a52ced -m "Our first file"
find .git/objects -type f | sort
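# A minimal sketch of why the commit hash differs from machine to machine, run in a throwaway
# repository so our test repository is left untouched. The author name, email and timestamps
# below are made-up placeholders, and make_commit is a helper defined only for this illustration
scratch=$(mktemp -d)
git init -q "$scratch"
printf 'Hello World!\n' > "$scratch/helloworld.txt"
git -C "$scratch" update-index --add helloworld.txt
tree=$(git -C "$scratch" write-tree)
make_commit() {
    # $1 is used as both the author date and the committer date ("<unix seconds> <timezone>")
    GIT_AUTHOR_NAME='Example Author' GIT_AUTHOR_EMAIL='author@example.com' \
    GIT_COMMITTER_NAME='Example Author' GIT_COMMITTER_EMAIL='author@example.com' \
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" \
    git -C "$scratch" commit-tree "$tree" -m 'Our first file'
}
same_a=$(make_commit '1374192000 +0000')
same_b=$(make_commit '1374192000 +0000')
other=$(make_commit '1374192001 +0000')
# Identical tree, message, identity and timestamps give an identical commit hash.
# Moving the timestamp by one second gives a different one
echo "$same_a"
echo "$same_b"
echo "$other"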
# We can use cat-file to show the details of our commit. Note that it references the hash of the snapshot tree object we
# added earlier
git cat-file commit c4f6e
# Finally we need to link the head of our master branch to this commit. A branch is simply a pointer
# to a commit object. That pointer and the chain of references from each commit object to its parent fully define a branch
# Git stores these references as files in the .git/refs/heads/ folder
ls .git/refs/heads/
# We have no reference for master yet, so we use the update-ref command to add one (use your own commit hash here)
git update-ref refs/heads/master c4f6ec
# We now have a file called 'master' in .git/refs/heads which contains the hash of our commit object.
cat .git/refs/heads/master
# If we run git log we now have a commit log showing our initial commit
git log
# Git also records what is currently checked out. This is known as HEAD. When you switch
# branches Git changes HEAD to point to the new branch. Rather than storing a commit hash, HEAD is normally
# a symbolic reference stored in the file .git/HEAD
cat .git/HEAD
# E.g. if you are on the master branch, HEAD will contain 'ref: refs/heads/master', referring to the reference 'master'
# stored in .git/refs/heads/master
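# A minimal sketch showing that HEAD names a branch even before that branch exists, which is why
# .git/refs/heads/ was empty earlier. It uses a second throwaway repository, and the variable names
# are just for illustration. If your Git has init.defaultBranch set (for example to main),
# a different branch name is printed, and the update-ref step above would need to target that
# branch for git log to find the commit
fresh=$(mktemp -d)
git init -q "$fresh"
head_ref=$(git -C "$fresh" symbolic-ref HEAD)
echo "HEAD points at $head_ref"
# rev-parse --verify fails quietly while the branch has no commit to resolve to
if git -C "$fresh" rev-parse -q --verify HEAD >/dev/null; then
    echo "HEAD resolves to a commit"
else
    echo "HEAD is unborn: $head_ref does not exist yet"
fi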