@benmccallum
Last active August 26, 2023 14:36
git pre-commit hook preventing large files

Usage

You can use this hook in two ways:

  1. Directly as the pre-commit hook in your .git/hooks folder.

  2. With Husky by updating your package.json with:

"husky": {
    "hooks": {
      "pre-commit": "sh ./some-path/pre-commit-prevent-large-files.sh"
    }
}
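
For the first option, a minimal sketch of the install step might look like the following. The `install_hook` function name and its arguments are illustrative assumptions, not part of the gist, and it assumes a regular clone where `.git` is a directory. It relies only on git running the executable found at `.git/hooks/pre-commit`.

```shell
# Hypothetical helper (not part of the gist): copy a hook script into a
# repository's .git/hooks folder under the name git looks for, and make
# it executable.
install_hook() {
    local script="$1"   # path to pre-commit-prevent-large-files.sh
    local repo="$2"     # path to the root of the target repository
    local hooks_dir="$repo/.git/hooks"

    [ -f "$script" ] || { echo "No such script: $script" >&2; return 1; }
    [ -d "$repo/.git" ] || { echo "Not a git repository: $repo" >&2; return 1; }

    mkdir -p "$hooks_dir"
    cp "$script" "$hooks_dir/pre-commit"
    chmod +x "$hooks_dir/pre-commit"
}
```

From the repo root this could be called as `install_hook ./some-path/pre-commit-prevent-large-files.sh .`, reusing the placeholder path from the Husky example.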

Installation

@guysmoilov wrote an awesome installer script (https://gist.github.com/guysmoilov/ddb3329e31b001c1e990e08394a08dc4) that adds this hook to all future repos you clone; how cool!

Credits

Based on @kiwidamien's original gist.

Alternatives

pre-commit is "a framework for managing and maintaining multi-language pre-commit hooks" and has a hook you can plug in called check-added-large-files. pre-commit is built with Python though, so you'll need Python installed.
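
As a rough sketch of that alternative, the snippet below writes a `.pre-commit-config.yaml` that enables `check-added-large-files` with a 5 MB limit, to match the hook in this gist. The `write_precommit_config` function name is hypothetical, and the `rev` value is an assumption; pin it to whichever release of pre-commit-hooks you actually want.

```shell
# Hypothetical helper: write a minimal pre-commit configuration file.
# The target path defaults to .pre-commit-config.yaml in the current directory.
write_precommit_config() {
    local target="${1:-.pre-commit-config.yaml}"
    cat > "$target" <<'EOF'
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0  # assumption: replace with the release you want to pin
    hooks:
      - id: check-added-large-files
        args: ['--maxkb=5120']  # 5 MB expressed in KB
EOF
}
```

After writing the file at the repo root, running `pre-commit install` should register the hook for that clone.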

Use a GitHub Action (see the .yml file).

#!/bin/bash
# This is a pre-commit hook that ensures attempts to commit files that are
# larger than $limit to your _local_ repo fail, with a helpful error message.

# Maximum file size limit in bytes
limit=$(( 5 * 2**20 )) # 5MB
limitInMB=$(( limit / 2**20 ))

# Move to the repo root so git file paths make sense
repo_root=$( git rev-parse --show-toplevel )
cd "$repo_root" || exit 1

# Compare against HEAD, or against the empty tree if this is the first commit
empty_tree=$( git hash-object -t tree /dev/null )
if git rev-parse --verify HEAD > /dev/null 2>&1
then
    against=HEAD
else
    against="$empty_tree"
fi

# Set split so that for loop below can handle spaces in file names by splitting on line breaks
IFS='
'

echo "Checking staged file sizes"
shouldFail=false
# `--diff-filter=d` -> skip deletions
for file in $( git diff-index --cached --diff-filter=d --name-only "$against" ); do
    # Skip for directories (git submodules)
    if [[ -f "$file" ]]; then
        # `ls -n` prints numeric user/group IDs, so names with spaces can't shift the size column
        file_size=$( ls -lan "$file" | awk '{ print $5 }' )
        if [ "$file_size" -gt "$limit" ]; then
            echo "File $file is $(( file_size / 2**20 )) MB, which is larger than our configured limit of $limitInMB MB"
            shouldFail=true
        fi
    fi
done

if $shouldFail
then
    echo "If you really need to commit this file, you can commit with the --no-verify switch, but the file should definitely, definitely be under $limitInMB MB!!!"
    echo "Commit aborted"
    exit 1
fi
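
One thing to be aware of: the hook measures the file in the working tree, which can differ from what is actually staged (for example, after `git add` followed by further edits). A possible variation, offered only as a sketch and not as the gist's method, is to ask git for the size of the staged blob with `git cat-file -s`. The `staged_size_bytes` name is hypothetical.

```shell
# Hypothetical helper: print the size in bytes of the staged version of a path.
# ":<path>" names the index entry for that path, relative to the repo root,
# so this is meant to be called after the hook has cd'ed to the repo root.
staged_size_bytes() {
    git cat-file -s ":$1"
}
```

A caller would still need to skip submodule entries, as the hook does with its `-f` test, since those are not blobs in the index.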
# This workflow will check for large files being added in PRs
# and label the PR if one is found that exceeds the configured limit.
#
# For more information, see: https://github.com/marketplace/actions/lfs-warning
name: Large file size warning

on:
  pull_request:
    # Ignore some files to avoid consuming Actions minutes unnecessarily
    # for file types we're fairly confident we'll never need to worry about
    paths-ignore:
      - '**.config'
      - '**.cs'
      - '**.cshtml'
      - '**.csproj'
      - '**.cmd'
      - '**.dockerignore'
      - '**.gitignore'
      - '**.graphql'
      - '**.jsx?'
      - '**.md'
      - '**.props'
      - '**.ps1'
      - '**.scss'
      - '**.sh'
      - '**.sln'
      - '**.tsx?'
      - '**.yml'
      - '**/appsettings.*.json'
      - '**/Dockerfile'

jobs:
  run-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actionsdesk/lfs-warning@v2.0
        with:
          #token: ${{ secrets.GITHUB_TOKEN }} # Optional
          filesizelimit: '5242880' # 5MB

@guysmoilov

Hi, small fix, it should be:

against="$empty_tree"

@guysmoilov

Also, I wrote a small installer script for this, since I anticipate it could have wide usage: https://gist.github.com/guysmoilov/ddb3329e31b001c1e990e08394a08dc4
Thanks for your work!

@benmccallum
Author

Thanks @guysmoilov, awesome work and thanks for sharing!

@Cyberbeni

file_size=$( ls -la $file | awk '{ print $5 }')
This doesn't work for groups with a space in them.
stat -f "%z" "$file" works on macOS but on Linux, it's stat -c "%s" "$file"
so the best is probably ls -lan $file | awk '{ print $5 }' which uses numerical value instead of user/group name
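
The portability concern above could also be handled by trying each `stat` dialect in turn. The following is only a sketch under that assumption: the `file_size_bytes` name is hypothetical, and the `wc -c` fallback is an addition not mentioned in the thread.

```shell
# Hypothetical helper: print a regular file's size in bytes on Linux or macOS.
file_size_bytes() {
    local size
    [ -f "$1" ] || return 1
    if size=$(stat -c '%s' "$1" 2>/dev/null); then
        :   # GNU coreutils stat (Linux)
    elif size=$(stat -f '%z' "$1" 2>/dev/null); then
        :   # BSD stat (macOS)
    else
        # Fallback: count the bytes; strip the padding some wc versions add
        size=$(wc -c < "$1" | tr -d ' ')
    fi
    printf '%s\n' "$size"
}
```

In the hook, this would stand in for the `ls -lan ... | awk` pipeline, for example `file_size=$( file_size_bytes "$file" )`.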

@Cyberbeni

The echo on line 39 doesn't use $limitInMB
Line 7 should be limit=$(( 5*2**20 ))
Line 8: limitInMB=$(( $limit / 2**20 ))
Line 32 should also use 2**20

@Cyberbeni

Also I think it would be better to use /bin/bash shebang: https://askubuntu.com/a/141932

@Cyberbeni

Line 3 ends with are and line 4 starts with are

@Cyberbeni

Cyberbeni commented Jan 25, 2021

The current version will also print errors for deletions and will get messed up for submodules, this fixes that:

# `--diff-filter=d` -> skip deletions
for file in $( git diff-index --cached --diff-filter=d --name-only "$against" ); do
	# Skip for directories (git submodules)
	if [[ -f "$file" ]]; then

@benmccallum
Author

Hey @Cyberbeni, thanks for the feedback. We don't actually use this at work anymore, I put a GitHub action in place instead, but your feedback seems valuable and in the interest of helping others I think we should apply your feedback here for sure.

@benmccallum
Author

benmccallum commented Jan 26, 2021

I've updated the gist with all your feedback except one item I'm not sure about (see below). Do you mind checking it looks OK to you now?

Also I think it would be better to use /bin/bash shebang: https://askubuntu.com/a/141932

Do you mean in the example usage? e.g. it should be this?
"pre-commit": "!/bin/sh ./some-path/pre-commit-prevent-large-files.sh"

@benmccallum
Author

cc: @guysmoilov, you might want to make these updates to your gist too

@Cyberbeni

I've updated the gist with all your feedback except one item I'm not sure about (see below). Do you mind checking it looks OK to you now?

Also I think it would be better to use /bin/bash shebang: https://askubuntu.com/a/141932

Do you mean in the example usage? e.g. it should be this?
"pre-commit": "!/bin/sh ./some-path/pre-commit-prevent-large-files.sh"

No, I mean the first line in the file should be #!/bin/bash instead of #!/bin/sh because the latter can be different based on OS

@benmccallum
Author

Thanks @Cyberbeni, updated

@george-angel

I put a GitHub action in place instead

@benmccallum is that available at all? :D

@benmccallum
Author

Hey @george-angel, I just added a yml file that details how we still do it in a GH Action.
