git-cache-meta
#!/bin/sh -e
#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
# - save all files metadata not only from other users
# - save numeric uid and gid
# 2012-03-05 - added filetime, andris9
: ${GIT_CACHE_META_FILE=.git_cache_meta}
case $@ in
--store|--stdout)
case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
find $(git ls-files)\
\( -printf 'chown %U %p\n' \) \
\( -printf 'chgrp %G %p\n' \) \
\( -printf 'touch -c -d "%AY-%Am-%Ad %AH:%AM:%AS" %p\n' \) \
\( -printf 'chmod %#m %p\n' \) ;;
--apply) sh -e $GIT_CACHE_META_FILE;;
*) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac

source:

git-cache-meta --store

destination:

git-cache-meta --apply

Download jgit.sh

Config

cat > ~/.jgit
accesskey: aws access key
secretkey: aws secret access key
<Ctrl-D>

Setup repo

git remote add origin amazon-s3://.jgit@bucket.name/repo-name.git

Push

jgit push origin master

Clone

jgit clone amazon-s3://.jgit@bucket.name/repo-name.git

Pull

jgit fetch
git merge origin/master

This git-cache-meta.sh has a bug:
When storing, the script records each file's ACCESS time. When applying, touch -d (given without -a or -m) sets that value as both the access time and the MODIFICATION time. Most file archiving programs instead store the modification time and apply it as the modification time.

The script should store and apply the access time and the modification time separately.

Also, the touch command should be the LAST of the metadata commands, because the other commands may affect the access time of the files.
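
A minimal sketch of the fix described above, assuming GNU find and GNU touch. The names demo.txt and meta.sh and the scratch directory are made up for illustration. It records the modification and access times as two separate touch commands and replays them with -m and -a:

```shell
#!/bin/sh -e
# Sketch only: assumes GNU find (-printf) and GNU touch (-d).
# demo.txt and meta.sh are hypothetical names used for illustration.
tmp=$(mktemp -d)
cd "$tmp"
: > demo.txt
touch -m -d '2001-02-03 04:05:06' demo.txt   # known modification time
touch -a -d '2010-11-12 13:14:15' demo.txt   # known access time

# Store: %T... is the modification time, %A... is the access time.
find demo.txt \
    -printf 'touch -c -m -d "%TY-%Tm-%Td %TH:%TM:%TS" "%p"\n' \
    -printf 'touch -c -a -d "%AY-%Am-%Ad %AH:%AM:%AS" "%p"\n' > meta.sh

touch demo.txt        # clobber both timestamps
sh -e meta.sh         # apply: each time goes back into its own field

mtime=$(find demo.txt -printf '%TY-%Tm-%Td %TH:%TM\n')
atime=$(find demo.txt -printf '%AY-%Am-%Ad %AH:%AM\n')
echo "mtime=$mtime atime=$atime"
```

Both values are written and read in local time here, so no time zone offset is stored; the later versions in this thread address that.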

kickiss commented Jul 27, 2013

Apparent bug: it fails if file names contain spaces.

The snippet below fixes file names that contain spaces:

#!/bin/sh -e

#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
# - save all files metadata not only from other users
# - save numeric uid and gid

#2012-03-05 - added filetime, andris9
pIFS=$IFS
IFS=$'\n'

: ${GIT_CACHE_META_FILE=.git_cache_meta}
case $@ in
--store|--stdout)
case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
find $(IFS=$'\n' ; git ls-files)\
\( -printf 'chown %U "%p"\n' \) \
\( -printf 'chgrp %G "%p"\n' \) \
\( -printf 'touch -c -d "%AY-%Am-%Ad %AH:%AM:%AS" "%p"\n' \) \
\( -printf 'chmod %#m "%p"\n' \) ;;
--apply) sh -e $GIT_CACHE_META_FILE;;
*) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac

IFS=$pIFS

I had problems with the line break in IFS, e.g.:
pi@Legion ~/.ssh $ ../scripts/git-cache-meta.sh --store
find: `.git_cache_meta\nauthorized_keys\nco': No such file or directory
find: `fig\nid_dsa\nid_dsa.pub\nid_rsa\nid_rsa.pub\nide': No such file or directory

This seems to work well.

As suggested earlier, moved touch to be the final command.

#!/bin/sh -e

#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
# - save all files metadata not only from other users
# - save numeric uid and gid

#2012-03-05 - added filetime, andris9
pIFS=$IFS

IFS='
'

: ${GIT_CACHE_META_FILE=.git_cache_meta}
case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    find $(git ls-files)\
    \( -printf 'chown %U "%p"\n' \) \
    \( -printf 'chgrp %G "%p"\n' \) \
    \( -printf 'chmod %#m "%p"\n' \) \
    \( -printf 'touch -c -d "%AY-%Am-%Ad %AH:%AM:%AS" "%p"\n' \) ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac

IFS=$pIFS

The above version seems to result in "No such file or directory" if there are non-ASCII letters in the path.
It also seems likely to fail if the file list is "too long", that is, larger than ~120 KB (on Linux).

Furthermore, chown can usually set the group as well, which makes the output more compact.

#!/bin/sh -e

#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
# - save all files metadata not only from other users
# - save numeric uid and gid
#2012-03-05 - added filetime, andris9
#2012-05-22 - added fix for non ASCII characters and list size, merge chgrp into chown command

IFS='
'

: ${GIT_CACHE_META_FILE=.git_cache_meta}
case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    git ls-files -z | xargs -0r -I '{}' find '{}' \
    \( -printf 'chown %U:%G "%p"\n' \) \
    \( -printf 'chmod %#m "%p"\n' \) \
    \( -printf 'touch -c -d "%AY-%Am-%Ad %AH:%AM:%AS" "%p"\n' \) ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac
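
To illustrate why the NUL-delimited pipeline helps, here is a small sketch, assuming GNU find and xargs. It uses find -print0 in a scratch directory as a stand-in for git ls-files -z, and the file names are hypothetical:

```shell
#!/bin/sh -e
# Sketch only: GNU find/xargs assumed; find -print0 stands in for
# `git ls-files -z`, and the file names are made up for illustration.
tmp=$(mktemp -d)
cd "$tmp"
: > 'plain.txt'
: > 'with space.txt'
: > 'naïve.txt'

# Names travel NUL-delimited through the pipe, so they are not
# word-split and never have to fit on a single command line.
count=$(find . -type f -print0 |
    xargs -0r -I '{}' find '{}' -maxdepth 0 -printf 'chown %U:%G "%p"\n' |
    wc -l)
echo "$count entries"
```

With -I, xargs starts one find per name, which is slow for large trees but keeps each name intact.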

I edited the version above; I think this is the best way to handle unusual filenames in a git source tree.
You need the GNU versions of "find", "xargs", and "ls", of course.
Other changes I made:

  • 'touch' commands are moved to the bottom.
  • File modification time and access time are stored separately.
  • Timezone offsets are stored. (Strictly speaking, %Tz and %Az are not documented in GNU find, but they work as long as you have a C99-compliant C library.)
  • Added the '-h' switch to chown and chgrp. This allows the script to handle symlinks.
  • 'chmod' is emitted only if the file is not a symlink.
  • All unusual filenames are properly escaped, thanks to '-exec ls --quoting-style=shell'. Note that '--quoting-style=c' does not work as expected when filenames contain newlines.
: ${GIT_CACHE_META_FILE=.git_cache_meta}

if [ -n "$(find -prune -printf '%Tz %Az\n' | tr -d ' 0-9+-')" ]; then
echo "%z not supported in 'strftime' in C library." >&2
exit 1
fi

case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    git ls-files -z | xargs -0 -I NAME find NAME \
        \( -printf 'chown -h %U ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( -printf 'chgrp -h %G ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( \! -type l -printf 'chmod %#m ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( -printf 'touch -c -h -m -d "%TY-%Tm-%Td %TH:%TM:%TS %Tz" ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( -printf 'touch -c -h -a -d "%AY-%Am-%Ad %AH:%AM:%AS %Az" ' -exec ls --quoting-style=shell '{}' \; \) ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac
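
For comparison, the kind of quoting that ls --quoting-style=shell provides can be approximated in plain sh. This is only a sketch: shell_quote is a hypothetical helper, not part of the script above, and it loses trailing newlines in a name because of command substitution.

```shell
#!/bin/sh -e
# Sketch only: shell_quote is a hypothetical helper for illustration.
# It wraps its argument in single quotes and rewrites each embedded
# single quote as '\'' so that sh reads the result back as one word.
shell_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

quoted=$(shell_quote "it's a \$weird \"name\"")
echo "chmod 0644 $quoted"

# Round trip: the shell should see exactly one argument again.
eval "set -- $quoted"
echo "$1"
```

Single quotes are used because nothing inside them is expanded, so $, backticks, and double quotes in a file name need no further treatment.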

I edited the version above to store directory metadata too. This is useful if, for example, you want to add some directories that are writable by www-data.

#!/bin/sh -e

#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
#modified by the-mars
# - save all files metadata not only from other users
# - save numeric uid and gid
#2012-03-05 - added filetime, andris9
#2012-05-22 - added fix for non ASCII characters and list size, merge chgrp into chown command
#2014-03-18 - the-mars: store properties for dirs too

pIFS=$IFS
IFS='
'

: ${GIT_CACHE_META_FILE=.git_cache_meta}

if [ -n "$(find -prune -printf '%Tz %Az\n' | tr -d ' 0-9+-')" ]; then
echo "%z not supported in 'strftime' in C library." >&2
exit 1
fi

case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    git ls-files -z | sed -z -r 's~/[^/]+$~~' | uniq -z | xargs -0 -I NAME find NAME \
        \( -printf 'chown -h %U:%G ' -exec ls -d --quoting-style=shell '{}' \; \) , \
        \( \! -type l -printf 'chmod %#m ' -exec ls -d --quoting-style=shell '{}' \; \) , \
        \( -printf 'touch -c -h -m -d "%TY-%Tm-%Td %TH:%TM:%TS %Tz" ' -exec ls -d --quoting-style=shell '{}' \; \) , \
        \( -printf 'touch -c -h -a -d "%AY-%Am-%Ad %AH:%AM:%AS %Az" ' -exec ls -d --quoting-style=shell '{}' \; \)
    git ls-files -z | xargs -0 -I NAME find NAME \
        \( -printf 'chown -h %U:%G ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( \! -type l -printf 'chmod %#m ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( -printf 'touch -c -h -m -d "%TY-%Tm-%Td %TH:%TM:%TS %Tz" ' -exec ls --quoting-style=shell '{}' \; \) , \
        \( -printf 'touch -c -h -a -d "%AY-%Am-%Ad %AH:%AM:%AS %Az" ' -exec ls --quoting-style=shell '{}' \; \) ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac

IFS=$pIFS

With regard to the version by the-mars above: the -z option for sed is not available in e.g. Debian 7 Wheezy... (sed 4.2.1) :-(
It is in Ubuntu 14.04 (sed 4.2.2).

Great script.
I needed to add a space before the \ on this line though:
find $(git ls-files)\

JPT77 commented Nov 30, 2014

Well. It's getting complex. For me it did not really work.
If anyone has got trouble too, I recommend using: https://github.com/przemoc/metastore
Metastore seems to have some plans for the future.

btw. would be nice to put this script into a repo.

I also don't have the -z option available in sed (I am using MinGW for Windows) :-(
Any workaround?

@Barzi2001,
Yes: use MSYS2 with its up-to-date version of mingw(64) and sed (currently at 4.2.2, so -z is supported)

Hope that helps...

Here's a quick fix for the-mars' version:

  • Fix time zone detection. Use date +%z as a fallback if find -printf %Tz returns an empty result.
  • Add a ./ prefix to file names to prevent problems with names that start with a dash (rare, but just in case).
  • Add -maxdepth 0 to avoid a deeper find if a file or directory happens not to exist.
  • Quote file names in a single awk post-processing step to improve performance (this avoids a large number of ls calls).
  • Use git ls-tree to list all git-versioned directories, to improve performance and to avoid potential errors ("some.file" being added as a directory; "aaa/bbb/ddd.txt" not causing "aaa" to be added). This also eliminates the "No sed -z in MsysGit" issue, since sed -z is no longer used.
  • Merge short options.
#!/bin/sh -e

#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
#modified by the-mars
# - save all files metadata not only from other users
# - save numeric uid and gid
#2012-03-05 - added filetime, andris9
#2012-05-22 - added fix for non ASCII characters and list size, merge chgrp into chown command
#2014-03-18 - the-mars: store properties for dirs too
#2015-04-17 - time zone offset fallback; fix leading-dash-name error; avoid deeper find;
#              better quote file names; better directory listing; merge short opts; by Danny Lin

: ${GIT_CACHE_META_FILE=.git_cache_meta}
: ${Tz:=$(find -prune -printf '%Tz')}
: ${Tz:=$(date +%z)}
if ! [ "$Tz" ]; then
    echo "%z not supported in 'strftime' in C library." >&2
    exit 1
fi

case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    { git ls-tree --name-only -rdz $(git write-tree) | xargs -0 -I NAME find ./NAME -maxdepth 0 \
        \( -printf 'chown -h %U:%G \0%p\n' \) , \
        \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
        \( -printf 'touch -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
        \( -printf 'touch -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
      git ls-files -z | xargs -0 -I NAME find ./NAME -maxdepth 0 \
        \( -printf 'chown -h %U:%G \0%p\n' \) , \
        \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
        \( -printf 'touch -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
        \( -printf 'touch -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
    } | awk 'BEGIN {FS="\0"}; {print $1 "'\''" gensub(/'\''/, "'\''\\\\'\'''\''", "g", $2) "'\''" }' ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac
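
The ./ prefix and -maxdepth 0 from the list above can be seen in isolation in this sketch (GNU find and xargs assumed; the file named -n and the directory dir are created only for the demonstration):

```shell
#!/bin/sh -e
# Sketch only: GNU find/xargs assumed; all names are hypothetical.
tmp=$(mktemp -d)
cd "$tmp"
mkdir dir
: > dir/inner.txt
: > ./-n          # a name that find would otherwise read as an expression

# ./NAME keeps a leading dash from being parsed by find, and
# -maxdepth 0 makes find report only the named entry, not its children.
out=$(printf '%s\0' '-n' 'dir' |
    xargs -0 -I NAME find ./NAME -maxdepth 0 -printf 'chmod %#m %p\n')
echo "$out"
```

Without -maxdepth 0, the entry for dir would also produce a line for dir/inner.txt, which the second pipeline already covers.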

MsysGit (1.9.5) doesn't seem to support chown, chgrp, or touch -h. Just remove them to be compatible, e.g.:

@@ -25,15 +25,13 @@
     --store|--stdout)
     case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
     { git ls-tree --name-only -rdz $(git write-tree) | xargs -0 -I NAME find ./NAME -maxdepth 0 \
-        \( -printf 'chown -h %U:%G \0%p\n' \) , \
         \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
-        \( -printf 'touch -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
-        \( -printf 'touch -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
+        \( -printf 'touch -cmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
+        \( -printf 'touch -cad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
       git ls-files -z | xargs -0 -I NAME find ./NAME -maxdepth 0 \
-        \( -printf 'chown -h %U:%G \0%p\n' \) , \
         \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
-        \( -printf 'touch -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
-        \( -printf 'touch -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
+        \( -printf 'touch -cmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
+        \( -printf 'touch -cad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
     } | awk 'BEGIN {FS="\0"}; {print $1 "'\''" gensub(/'\''/, "'\''\\\\'\'''\''", "g", $2) "'\''" }' ;;
     --apply) sh -e $GIT_CACHE_META_FILE;;
     *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;

MsysGit doesn't support find -printf %Tz, either. However, since it does support date +%z, my patch with this fallback works.
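
The fallback relies on the ${var:=value} expansion, which assigns only when the variable is unset or empty. A minimal sketch, assuming a POSIX sh; the 2>/dev/null and || true are additions that are not in the patch, so that a find without -printf does not abort a script running under -e:

```shell
#!/bin/sh -e
# Sketch only: try find's %Tz first, then fall back to date +%z.
Tz=$(find . -maxdepth 0 -printf '%Tz' 2>/dev/null || true)
: "${Tz:=$(date +%z)}"     # date only runs when Tz is still empty
if [ -z "$Tz" ]; then
    echo "no time zone offset available" >&2
    exit 1
fi
echo "offset: $Tz"
```

One caveat, as far as I can tell: %Tz reports the offset in effect at the file's own timestamp, while date +%z reports the current offset, so the two can differ across a daylight-saving change.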

Another issue is that MsysGit's touch doesn't accept a timestamp with fractional seconds. If the repo is only ever used with MsysGit, this is fine, since MsysGit's %TS and %AS write no fractional seconds. However, if .git_cache_meta was created on a system that writes fractional seconds, an error occurs when it is applied on MsysGit.

Many platforms and programs simply ignore fractional seconds. To make the script more portable, we could add a replace command that strips the fractional seconds in advance. For example:

@@ -34,7 +34,8 @@
         \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
         \( -printf 'touch -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
         \( -printf 'touch -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
-    } | awk 'BEGIN {FS="\0"}; {print $1 "'\''" gensub(/'\''/, "'\''\\\\'\'''\''", "g", $2) "'\''" }' ;;
+    } | awk 'BEGIN {FS="\0"}; {print $1 "'\''" gensub(/'\''/, "'\''\\\\'\'''\''", "g", $2) "'\''" }' |
+        sed -r 's!^(touch -[a-z]* "[0-9 :+\-]+)(\.[0-9]+)? !\1 !';;
     --apply) sh -e $GIT_CACHE_META_FILE;;
     *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
 esac
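
The effect of that sed step on a single stored line can be checked in isolation. A sketch, assuming a sed that supports -E (GNU and BSD sed both do); strip_fraction and the sample line are hypothetical, and the pattern is a simplified variant of the one in the patch:

```shell
#!/bin/sh -e
# Sketch only: drop the fractional part of the seconds in a stored
# touch command; every other line passes through unchanged.
strip_fraction() {
    sed -E 's/^(touch -[a-z]* "[0-9 :+-]+)\.[0-9]+ /\1 /'
}

line="touch -hcmd \"2015-04-17 12:34:56.1234567890 +0800\" './a.txt'"
printf '%s\n' "$line" | strip_fraction
# prints: touch -hcmd "2015-04-17 12:34:56 +0800" './a.txt'
```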

bizonix commented May 7, 2015

For Mac OS X, run brew install findutils gawk coreutils first:

#!/bin/bash -e
# (bash rather than sh: the tool detection below uses [[ ]] and ${!BIN})

#git-cache-meta -- simple file meta data caching and applying.
#Simpler than etckeeper, metastore, setgitperms, etc.
#from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
#modified by n1k
#modified by the-mars
#modified by bizonix
# - save all files metadata not only from other users
# - save numeric uid and gid
#2012-03-05 - added filetime, andris9
#2012-05-22 - added fix for non ASCII characters and list size, merge chgrp into chown command
#2014-03-18 - the-mars: store properties for dirs too
#2015-04-17 - time zone offset fallback; fix leading-dash-name error; avoid deeper find;
#              better quote file names; better directory listing; merge short opts; by Danny Lin
#2015-05-07 - for Mac OS X, `brew install findutils gawk coreutils`

: ${GIT_CACHE_META_FILE=.git_cache_meta}

if [[ "$OSTYPE" == "darwin"* ]]; then
    GNU='g'
fi
for bin in find touch awk ; do
    BIN=$( echo $bin | tr '[:lower:]' '[:upper:]')
    eval ': ${'$BIN':=$(which $GNU$bin)}'
    if [ "$GNU" == 'g' ] && ! [[ "${!BIN}" =~ /$GNU$bin ]]  ; then
        echo "gnu version of '$bin' file not found." >&2
        exit 1
    fi
done

: ${Tz:=$($FIND -prune -printf '%Tz')}
: ${Tz:=$(date +%z)}
if ! [ "$Tz" ]; then
    echo "%z not supported in 'strftime' in C library." >&2
    exit 1
fi

case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    { git ls-tree --name-only -rdz $(git write-tree) | xargs -0 -I NAME $FIND ./NAME -maxdepth 0 \
        \( -printf 'chown -h %U:%G \0%p\n' \) , \
        \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
        \( -printf $TOUCH' -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
        \( -printf $TOUCH' -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
      git ls-files -z | xargs -0 -I NAME $FIND ./NAME -maxdepth 0 \
        \( -printf 'chown -h %U:%G \0%p\n' \) , \
        \( \! -type l -printf 'chmod %#m \0%p\n' \) , \
        \( -printf $TOUCH' -hcmd "%TY-%Tm-%Td %TH:%TM:%TS '$Tz'" \0%p\n' \) , \
        \( -printf $TOUCH' -hcad "%AY-%Am-%Ad %AH:%AM:%AS '$Tz'" \0%p\n' \)
    } | $AWK 'BEGIN {FS="\0"}; {print $1 "'\''" gensub(/'\''/, "'\''\\\\'\'''\''", "g", $2) "'\''" }' ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac
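
The tool detection above depends on bash features ([[ ]] and ${!BIN}). A portable sketch of the same idea using command -v follows. pick_tool is a hypothetical helper, and unlike the script above it prefers a g-prefixed tool on any system, not only on macOS:

```shell
#!/bin/sh -e
# Sketch only: pick_tool is a hypothetical helper for illustration.
# It prints the g-prefixed name if that command exists, otherwise the
# plain name, and fails if neither is found in PATH.
pick_tool() {
    if command -v "g$1" >/dev/null 2>&1; then
        printf '%s\n' "g$1"
    elif command -v "$1" >/dev/null 2>&1; then
        printf '%s\n' "$1"
    else
        echo "no '$1' found in PATH" >&2
        return 1
    fi
}

FIND=$(pick_tool find)
TOUCH=$(pick_tool touch)
AWK=$(pick_tool awk)
echo "using: $FIND $TOUCH $AWK"
```

This only selects a name; it does not verify that the chosen tool is actually the GNU implementation.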

arno01 commented Oct 19, 2015

Hi all,
Thanks to everyone for great additions above!
I decided to share my version of this script. The idea is to keep it as simple as possible.

#!/bin/sh -e

# git-cache-meta -- simple file meta data caching and applying.
# Simpler than etckeeper, metastore, setgitperms, etc.
# from http://www.kerneltrap.org/mailarchive/git/2009/1/9/4654694
# modified by n1k
#  - save all files metadata not only from other users
#  - save numeric uid and gid

# Changes by: Andrey Arapov <andrey.arapov@nixaid.com>
#   2015-10-16 - add '-h' flag to chown, chgrp and touch so that symlink is
#                NOT followed
#              - chmod cannot be applied to symlink
#              - add "--" to stop processing arguments (e.g when file name has
#                leading "-")
#   2015-10-14 - added quotes around path %p

# Initial release by andris9
#   2012-03-05 - added filetime, andris9

: ${GIT_CACHE_META_FILE=.git_cache_meta}
case $@ in
    --store|--stdout)
    case $1 in --store) exec > $GIT_CACHE_META_FILE; esac
    find $(git ls-files) \
        \( -printf 'chown -h %U -- "%p"\n' \) \
        \( -printf 'chgrp -h %G -- "%p"\n' \) \
        \( -printf 'touch -h -c -d "%AY-%Am-%Ad %AH:%AM:%AS" -- "%p"\n' \) \
        ! -type l \( -printf 'chmod %#m -- "%p"\n' \) ;;
    --apply) sh -e $GIT_CACHE_META_FILE;;
    *) 1>&2 echo "Usage: $0 --store|--stdout|--apply"; exit 1;;
esac

This is failing for me on a mac, since mac does not seem to support the -printf parameter.

cmw commented Jan 5, 2016

@heaversm: arno01's version doesn't work for me either, but bizonix' does.

danny0838 commented Jun 8, 2016 edited

I created another project, git-store-meta, which is written in Perl. It is a bit more complicated, but it has better performance, flexibility, security, and cross-platform compatibility, while still keeping its dependencies very light.
