@mhimanshu0101
Last active February 9, 2022 20:19
Internal workings of git pull and how to speed up the process

Why is cloning a new Git repo so slow?

Cloning a 100 MB repo can take around 5-10 minutes, even on a 100 MB/s connection.

To understand why, we need to look at how Git stores file changes and how it fetches a repo over the network.

Git stores a snapshot of every file you change in each commit.

Suppose your repo has 2,000 commits and 20 files change in each commit: that gives 40,000 snapshots (plus the number of files in the repo).
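The estimate above is plain multiplication. A rough shell sketch of it follows; the variable names are only illustrative, and the number of files already in the repo is an assumed figure, since the text does not give one.

```shell
# Rough estimate of how many file snapshots (blobs) a history holds.
commits=2000            # number of commits, from the example above
files_per_commit=20     # files changed in each commit, from the example above
base_files=10000        # ASSUMPTION: files already in the repo (not given in the text)

changed_snapshots=$((commits * files_per_commit))
total_snapshots=$((changed_snapshots + base_files))

echo "snapshots from changes: $changed_snapshots"
echo "total snapshots:        $total_snapshots"
```

This is an upper bound on distinct snapshots: files whose content is identical share one stored object.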

When you run git clone, it uses git fetch internally.

Remote URLs are generally of two types, HTTP(S) or SSH, and both run over TCP.

TCP opens every connection with a three-way handshake (SYN, SYN-ACK, ACK): your device has to ask the server whether it can connect before any data flows.

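Suppose the one-way latency to the Git server is 100 ms: the three handshake messages then take about 300 ms before the connection is fully established, and HTTPS or SSH add further round trips on top to set up encryption.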

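Git does not open one connection per file, though. For a repo with 50,000 files, it negotiates over a single connection (or a handful of HTTP requests) and the server sends all the objects as one packfile, so the connection cost is paid only once and most of the time goes into building, transferring, and unpacking that pack.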

Git also compresses on the server side: it enumerates the objects to send and delta-compresses them into the pack, which adds extra time to the fetch.

How can you speed up a Git clone?

If your repo has a long history:

git clone --depth=1 <repo-url>
then, when you need the full history,
git fetch --unshallow
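To see what the two commands do, here is a minimal sketch that runs entirely on a local throwaway repo. The directory names, the five demo commits, and the demo identity are all made up for illustration; a real clone would point at your remote URL instead of the file:// path.

```shell
# Build a throwaway repo with 5 commits, then compare shallow and full history.
# All paths and names below are illustrative.
work=$(mktemp -d)
git init -q "$work/src"
for i in 1 2 3 4 5; do
  echo "change $i" > "$work/src/file.txt"
  git -C "$work/src" add file.txt
  git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# file:// forces the normal transport; --depth is ignored for plain local paths.
git clone -q --depth=1 "file://$work/src" "$work/shallow"
shallow_count=$(git -C "$work/shallow" rev-list --count HEAD)

# Later, pull in the rest of the history.
git -C "$work/shallow" fetch -q --unshallow
full_count=$(git -C "$work/shallow" rev-list --count HEAD)

echo "commits after shallow clone: $shallow_count"
echo "commits after unshallow:     $full_count"
```

The shallow clone only downloads the objects needed for the latest commit, so the initial wait is short; the remaining history arrives when you unshallow.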

Otherwise, create a Git bundle on a machine that already has the repo, copy that single file over, and clone from it:

git bundle create <bundle-file> --all
then
git clone <bundle-file> <directory>
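A minimal local sketch of the bundle route, assuming a throwaway source repo; the file name repo.bundle, the directory names, and the demo identity are hypothetical. In practice you would copy the bundle file to the target machine between the two steps.

```shell
# Create a small source repo (illustrative paths and names).
work=$(mktemp -d)
git init -q "$work/src"
for i in 1 2 3; do
  echo "change $i" > "$work/src/file.txt"
  git -C "$work/src" add file.txt
  git -C "$work/src" -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "commit $i"
done

# Pack every ref and its full history into a single file.
git -C "$work/src" bundle create "$work/repo.bundle" --all

# Clone from the bundle file as if it were a remote.
git clone -q "$work/repo.bundle" "$work/from-bundle"
bundle_count=$(git -C "$work/from-bundle" rev-list --count HEAD)

echo "commits cloned from bundle: $bundle_count"
```

After cloning, origin points at the bundle file, so you would normally run git remote set-url origin with the real server URL before fetching new work.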

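If you run the Git server yourself, take a look at the pack compression settings in the git-config documentation (for example pack.compression, pack.window, and pack.depth).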
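As a starting point, the pack settings can be set per repository with git config. The sketch below uses a throwaway bare repo, and the values are illustrative only (the window and depth shown are Git's documented defaults); measure on your own server before settling on anything.

```shell
# Tune packing in one repository; values are illustrative, not recommendations.
repo=$(mktemp -d)
git init -q --bare "$repo"

git -C "$repo" config pack.compression 1   # zlib level for packs: lower is faster but larger
git -C "$repo" config pack.window 10       # objects considered per delta search
git -C "$repo" config pack.depth 50        # maximum delta chain length
git -C "$repo" config pack.threads 0       # 0 lets Git auto-detect the CPU count

# Show what is now configured.
pack_settings=$(git -C "$repo" config --get-regexp '^pack\.')
echo "$pack_settings"
```

Lower compression and smaller windows cost less CPU on the server at the price of larger packs, so the right trade-off depends on whether your clones are limited by CPU or by bandwidth.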
