@infotroph
Last active October 24, 2018 13:52
How to split a subdirectory of a Git repository off into its own repository, *without* losing history of files that have been moved from parent into subdirectory?
# The overarching problem: I'm an indecisive mofo.
# The solvable problem:
# I started a repo, later decided to move some things to a subdirectory,
# and later still decided to move that subdirectory to its own repo.
# I want the new repo to contain the history of only the files that
# currently live in the subdirectory... *including* their history
# from before I moved them into the subdirectory.
# Note that I'm more interested in preserving all history in subdir
# than I am in removing evidence of the original parent repo...
# the parent isn't secret, just large.
# example repo:
mkdir foo && cd foo
for i in 1 2 3; do echo "file $i, this line written before Git" >> file"$i".txt; done
git init && git add . && git commit -m 'Oh my, this project needs version control'
echo 'lol' >> file1.txt; echo 'SYNTAXERROR' >> file2.txt; echo 'no speling errors heare' >> file3.txt
git commit -am 'massive improvements'
mkdir subdir
git mv file1.txt subdir/file1.txt
git mv file2.txt subdir/file2.txt
git commit -m 'put files 1&2 in their own dir'
# note that we now need to use log --follow to see
# the 'massive improvements' commit in file1 or file2's history
echo 'This line added after moving to subdir' >> subdir/file1.txt
git commit -am 'more work on file1'
mkdir sibdir && echo "unrelated content" >> sibdir/sib1.txt
git add sibdir && git commit -m "stuff unrelated to subdir"
# "...Wait, subdir/ really ought to be its own repository!"
# "Let's pull it out by cloning and removing everything that's not subdir."
cd ../
git clone --no-hardlinks foo subdir-standalone
cd subdir-standalone
git remote rm origin
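# Everything at the top level of HEAD except subdir/.
# (The bare ^subdir glob only works in zsh with EXTENDED_GLOB; this is the bash equivalent.)
shopt -s extglob
DEADFILES=( !(subdir) )
# (check that list before trusting it!)
for i in "${DEADFILES[@]}"; do
  git filter-branch -f --prune-empty --index-filter "git rm -rf --cached --ignore-unmatch \"$i\"" HEAD
done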
cd ..
# remove now-dead refs.
# Could accomplish the same thing by rewriting refs, expiring reflog, gc prune, but this is easier and leaves a backup.
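# (file:// needs an absolute path; it also forces a real transfer, so unreachable objects are left behind.)
git clone "file://$PWD/subdir-standalone" subdir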
# Result:
# file3 and sibdir are, correctly, gone
# commits that only touch file3 or sibdir are gone from git log
# commits to file1 and file2 from before they got moved to subdir are,
# correctly, available (still need --follow).
# In short, this toy example seems to work as I wanted.
# What gotchas am I missing?
@aredridel

That should do it. You might need to iterate over all branches instead of HEAD.

There are tools for this -- http://makingsoftware.wordpress.com/2013/02/16/using-git-subtrees-for-repository-separation/
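
On iterating over all branches: a minimal sketch of what that could look like, assuming a throwaway repo with a hypothetical `feature` branch. Passing `-- --all` instead of `HEAD` hands every ref to the rewrite. The repo layout, file names, and the `leftover` counter are made up for illustration; `FILTER_BRANCH_SQUELCH_WARNING` only matters on newer Git versions that pause with a warning before running filter-branch.

```shell
#!/usr/bin/env bash
# Sketch only: build a toy repo with two branches, then rewrite all of them.
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir subdir
echo keep > subdir/file1.txt
echo drop > file3.txt
git add . && git commit -q -m 'initial'

# A second branch that also carries the unwanted file
git checkout -q -b feature
echo more >> file3.txt
git commit -q -am 'feature work on file3'
git checkout -q -

# '-- --all' passes every ref to the rewrite, not just the current branch
git filter-branch -f --prune-empty \
  --index-filter 'git rm -rf --cached --ignore-unmatch file3.txt' \
  -- --all >/dev/null 2>&1

# Count branches whose tip still contains file3.txt (expect 0)
leftover=0
for b in $(git for-each-ref --format='%(refname:short)' refs/heads); do
  if [ -n "$(git ls-tree -r --name-only "$b" -- file3.txt)" ]; then
    leftover=$((leftover + 1))
  fi
done
echo "branches still containing file3.txt: $leftover"
```

In the gist's loop this would mean swapping the trailing `HEAD` for `-- --all`. Tags get rewritten only if you also pass `--tag-name-filter cat`.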

@infotroph
Author

Known gotchas:

  • Won't remove history for files previously removed from repository.
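
The reason for this gotcha is that the list of files to kill is built from what exists at HEAD, so anything deleted earlier never makes the list. One possible way to find those paths, sketched here on a throwaway repo with a hypothetical `longgone.txt`: list every path ever added in history and subtract what is present at HEAD. The variable names (`ever`, `now`, `ghosts`) are just for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: find paths that exist somewhere in history but not at HEAD.
set -eu
export LC_ALL=C   # keep sort and comm in agreement on ordering

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir subdir
echo keep > subdir/file1.txt
echo old > longgone.txt
git add . && git commit -q -m 'initial'
git rm -q longgone.txt
git commit -q -m 'delete longgone.txt'

ever=$(mktemp)
now=$(mktemp)

# Every path that was ever added, on any ref (renames counted as add + delete)
git log --all --no-renames --pretty=format: --name-only --diff-filter=A \
  | sed '/^$/d' | sort -u > "$ever"

# Paths present at HEAD right now
git ls-tree -r --name-only HEAD | sort -u > "$now"

# Lines only in the first list: in history, but gone from HEAD
ghosts=$(comm -23 "$ever" "$now")
echo "$ghosts"
```

Check the result by hand before feeding it to the same `git rm --cached --ignore-unmatch` filter. In the gist's example repo, the pre-move paths `file1.txt` and `file2.txt` would show up in this list too, and those are exactly the ones whose history should stay.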
