@marc-h38
Last active April 25, 2023 22:33
Shallow cloning creates a pull request with 1 million commits
#!/bin/sh
# Marc Herbert @ gmail.com
set -e
set -x
# Shallow cloning with "git clone --depth" can save continuous
# integration a lot of time. This demo shows how shallow cloning can
# also turn a single-commit pull request into a pull request with
# 1 million commits, reaching all the way back to the initial commit!
#
# Tools like checkpatch unsurprisingly struggle to scan 1 million
# commits :-(
#
# Real-world examples:
# https://travis-ci.org/github/thesofproject/linux/jobs/740309392
# https://travis-ci.org/github/thesofproject/linux/jobs/739977895
# This script must be run from a recent kernel git clone.
# It creates four git tags and another ../shallow/ clone.
# After git clone --depth 20 target_branch :
#
#      I----------------------------------> upstream
#   grafted           \
#                      \
#                       \
#                        \   frequent "back merges"
#                         \
#                          \
#                           \
#                            v
#                  I-------------> target branch
#               grafted
# After git fetch pull_request :
#
#                  Spurious
#                pull request
#                 merge base
#            --------- I----------------------------------> upstream
#             \     grafted     \                 \
#              \                 \                 \
#               \                 \                 \
#   1 million    \                 \                 \
#   commits!      \                 \                 \
#                  \                 \     Actual      \
#                   \                 \  pull request   \
#                    v                 v  merge base     v
# ---------------------------------------------- I-------------> target branch
#                                         \   grafted
#                                          \
#                                           \
#                                            - pull_request
# ASCII art made with the excellent textik.com
# Fails early (because of set -e) unless this looks like a recent
# kernel clone.
validate_starting_point()
{
    git rev-parse v4.20
    grep -q KBUILD_VERBOSE ./Makefile
}
# Recreates the real-world pull request as four local tags.
prepare_source_repo()
{
    git remote add sof_remote_demo https://github.com/thesofproject/linux
    git fetch sof_remote_demo

    # The commit the pull request is really based on
    git tag _real_PR_base b150588d227ac0
    git checkout _real_PR_base

    # A dummy single-commit pull request on top of it
    touch dummyfile; git add dummyfile
    git commit -m 'pull request' dummyfile
    git tag _pull_request

    git tag _target_branch 70fe32e776dafb
    git tag _spurious_base 86f29c7442ac4b
}
main()
{
    # Make sure we start from a reasonably recent kernel repo
    validate_starting_point

    # We need this only once.
    git rev-parse --verify --quiet _spurious_base > /dev/null ||
        prepare_source_repo

    # This pull request has a single commit. Its _real_PR_base is more
    # than 20 commits back, so the git clone below does not fetch it.
    git log --oneline ^_target_branch _pull_request

    local src_repo; src_repo="$(pwd)"
    cd ..

    # localhost:path goes through ssh; git ignores --depth when cloning
    # from a plain local path.
    time git clone --bare --depth 20 --branch _target_branch \
        localhost:"$src_repo" shallow/
    cd shallow

    # Fetches all the way to the initial commit, which takes forever and
    # totally negates the shallowness optimization above.
    #
    # Even though it takes forever, it stops at every commit "grafted"
    # by the git clone --depth above to "respect" the previous
    # optimization! So it will miss crucial connecting commits.
    time git fetch origin _pull_request
    git tag _pull_request FETCH_HEAD

    # Not the real merge base
    git merge-base -a _target_branch _pull_request

    # In the shallow clone, the same pull request now has 1 million
    # commits! Count them.
    git rev-list ^_target_branch _pull_request | wc -l
}
main "$@"
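
The script above needs a full kernel clone and ssh access to localhost. The effect can probably be reproduced in miniature on a throwaway repository; the sketch below is one way to do so, not part of the original demo. The branch names `target` and `pr`, the 30-commit history, the depth of 5 and the `file://` transport are all assumptions chosen for illustration. It has no back merges, so it only shows the inflated commit count, not the spurious merge base.

```shell
#!/bin/sh
# Miniature reproduction on a throwaway repository.
# All names and numbers here are hypothetical, chosen for illustration.
set -e

work="$(mktemp -d)"
trap 'rm -rf "$work"' EXIT

# Identity for the throwaway commits.
GIT_AUTHOR_NAME=demo; GIT_AUTHOR_EMAIL=demo@example.com
GIT_COMMITTER_NAME=demo; GIT_COMMITTER_EMAIL=demo@example.com
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL

# Source repo: 30 empty commits on a branch named "target".
git init -q "$work/src"
cd "$work/src"
git symbolic-ref HEAD refs/heads/target
i=1
while [ "$i" -le 30 ]; do
    git commit -q --allow-empty -m "c$i"
    i=$((i + 1))
done

# A single-commit pull request based 25 commits behind the tip.
git checkout -q -b pr target~25
git commit -q --allow-empty -m 'pull request'

# With full history the pull request contains exactly one commit.
full_count="$(git rev-list --count ^target pr)"

# Shallow clone of the target branch. file:// is used because git
# ignores --depth when cloning from a plain local path.
git clone -q --bare --depth 5 --branch target \
    "file://$work/src" "$work/shallow"
cd "$work/shallow"

# The pull request base lies beyond the shallow boundary, so this fetch
# brings in history that "target" no longer appears to contain.
git fetch -q origin pr

shallow_count="$(git rev-list --count ^target FETCH_HEAD)"

echo "commits in pull request, full clone:    $full_count"
echo "commits in pull request, shallow clone: $shallow_count"
```

The first count should be 1, and the second should be larger because the commits older than the shallow boundary are no longer recognized as part of the target branch.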