Created December 16, 2016 22:34
Check the size of a GitHub repo before downloading
# tested on macOS
echo | perl -ne 'print $1 if m!([^/]+/[^/]+?)(?:\.git)?$!' | xargs -I{} curl -s -k'{}' | grep size
# output:
# "size": 1746294,
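
The Perl expression above captures the trailing `owner/repo` pair from a repository URL and drops an optional `.git` suffix. A minimal sketch of the same idea using only shell parameter expansion follows. The helper names `repo_slug` and `repo_api_url` and the `OWNER/REPO` placeholder are hypothetical; the `api.github.com/repos/...` endpoint is the one the later script in this thread builds.

```shell
# Sketch: derive "owner/repo" from a repository URL without Perl.
# repo_slug and repo_api_url are hypothetical helper names.

repo_slug() {
  local url="${1%/}"            # drop one trailing slash, if any
  url="${url%.git}"             # drop an optional .git suffix
  local repo="${url##*/}"       # text after the last slash
  local rest="${url%/*}"        # everything before the last slash
  local owner="${rest##*[/:]}"  # text after the last slash or colon (also covers git@host:owner/repo)
  printf '%s/%s\n' "${owner}" "${repo}"
}

repo_api_url() {
  printf 'https://api.github.com/repos/%s\n' "$(repo_slug "$1")"
}

# Example (performs a network request, so it is left commented out):
# curl -s "$(repo_api_url "https://github.com/OWNER/REPO.git")" | grep '"size"'
```

Keeping the URL parsing in a small function makes it easy to check on its own before any request is sent.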

sainak commented Oct 12, 2020

curl -s | jq '.size' | numfmt --to=iec --from-unit=1024
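
`numfmt` is part of GNU coreutils and is usually missing on a stock macOS install, which is where the original command was tested. As a hedged alternative, the sketch below converts the API's `size` value (reported in KiB, which is why the pipeline above uses `--from-unit=1024`) into a human-readable figure with `awk`. The function name `human_kib` is hypothetical.

```shell
# Sketch: format a size given in KiB as K/M/G/T/P with one decimal place.
# human_kib is a hypothetical helper name; it only needs awk.

human_kib() {
  awk -v kib="$1" 'BEGIN {
    split("K M G T P", unit, " ")
    size = kib + 0
    i = 1
    # Divide by 1024 until the value fits the current unit
    while (size >= 1024 && i < 5) { size /= 1024; i++ }
    printf "%.1f%s\n", size, unit[i]
  }'
}

# human_kib 1746294   -> 1.7G
# Example with a live request (commented out, OWNER/REPO is a placeholder):
# human_kib "$(curl -s "https://api.github.com/repos/OWNER/REPO" | jq '.size')"
```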

What about private repos?

sainak commented Oct 14, 2020

curl -s -H "Authorization: token GITHUB_TOKEN" | jq '.size' | numfmt --to=iec --from-unit=1024

curl -s -H "Authorization: Bearer <GITHUB_TOKEN>" | jq '.size' | numfmt --to=iec --from-unit=1024

I had to prefix Bearer last time I used this! 🎆
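
To avoid repeating the header in every command, the token handling can be wrapped in a small function. This is only a sketch: `github_curl` is a hypothetical name, and it assumes the token is exported as `GITHUB_TOKEN`. Passing the header as a single quoted argument matters, because storing `-H '...'` in a plain string variable and expanding it unquoted splits the header at its spaces.

```shell
# Sketch: call curl with an Authorization header only when GITHUB_TOKEN is set.
# github_curl is a hypothetical wrapper name.

github_curl() {
  if [ -n "${GITHUB_TOKEN:-}" ]; then
    # The header is one quoted argument, so it survives word splitting
    curl -sS -H "Authorization: Bearer ${GITHUB_TOKEN}" "$@"
  else
    curl -sS "$@"
  fi
}

# Example (network request, commented out; OWNER/REPO is a placeholder):
# github_curl "https://api.github.com/repos/OWNER/REPO" | jq '.size'
```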

CodeIter commented Sep 3, 2023


#!/usr/bin/env -S bash -euo pipefail

get_github_repo_size() {

  # "${1:-}" keeps "set -u" from aborting when no argument is given
  if [[ "${1:-}" == "--help" || "${1:-}" == "-h" ]]; then
    echo "Usage: get_github_repo_size [OPTIONS] {GITHUB_REPO_URLS...}"
    echo "Retrieve and display the sizes of GitHub repositories."
    echo "The function will automatically use the GITHUB_TOKEN if it's set as an environment variable."
    echo "Warning: GitHub API repository size information may not be reliable."
    echo "Explanation:"
    echo "- GitHub support has indicated that due to Git Alternates and the way files are stored in GitHub repositories,"
    echo "  the numbers returned from the GitHub API cannot be relied upon for the actual size"
    echo "- Therefore, the size values returned by this script should not be considered"
    echo "  a reliable or accurate representation of the true repository size"
    echo "- The API size values may differ significantly from the actual repository size"
    echo "Options:"
    echo "  -h, --help     Display this help message."
    echo "Environment variables:"
    echo "  GITHUB_TOKEN   Use the provided GitHub token for authentication."
    echo "Exit codes:"
    echo "  0              Success"
    echo "  1              Fail"
    return 0
  fi

  local _i
  local _err=0
  for _i in "${@}" ; do
    local _url="${_i}"
    local _owner="$(basename "$(dirname "${_url}")")"
    local _repo="$(basename "${_url}")"
    local _domain="$(basename "$(dirname "$(dirname "${_url}")")")"
    local _api_url="https://api.${_domain}/repos/${_owner}/${_repo}"

    # Collect curl options in an array so the header stays a single argument
    local _curl_args=(-sS)
    if [ -n "${GITHUB_TOKEN:-}" ]; then
      # Use the provided GitHub token if available
      _curl_args+=(-H "Authorization: Bearer ${GITHUB_TOKEN}")
    fi

    # Fetch repository info, including size, in a single request
    local _response
    _response="$(curl "${_curl_args[@]}" "${_api_url}" || true)"

    # An empty response (connection failure) or a message field means the request failed
    if [ -z "${_response}" ] || jq -e '.message' <<< "${_response}" >/dev/null 2>&1; then
      local _message
      _message="$(jq -r '.message' <<< "${_response}")"
      >&2 printf 'error %s Failed to retrieve information about the GitHub repository: %s\n' "${_domain}/${_owner}/${_repo}" "${_message:-dns or http connection error}"
      _err=1
    else
      # The GitHub repo exists; fetch and format its size
      local _size
      _size="$(jq '.size' <<< "${_response}" 2>/dev/null || echo null)"
      if [ "${_size}" != "null" ]; then
        # Valid size found; format and print it
        printf '%s %s\n' "$(numfmt --to=iec --from-unit=1024 <<< "${_size}")" "${_domain}/${_owner}/${_repo}"
      else
        >&2 printf 'error %s Invalid size data retrieved.\n'  "${_domain}/${_owner}/${_repo}"
        _err=1
      fi
    fi
  done
  >&2 echo
  >&2 echo "Warning: GitHub API repository size information may not be reliable (see --help for details)."
  return "${_err}"
}

Example working usage:

$ get_github_repo_size "" ""

Expected output:

Sizes go to standard output (stdout); only the warning message goes to standard error (stderr):


Warning: GitHub API repository size information may not be reliable (see --help for details)

Example failing usage:

$ get_github_repo_size "" "" ""

Expected output:

Everything goes to standard error (stderr):

error Failed to retrieve information about the GitHub repository: Not Found
error Failed to retrieve information about the GitHub repository: Not Found
curl: (6) Could not resolve host:
error Failed to retrieve information about the GitHub repository: dns or http connection error

Warning: GitHub API repository size information may not be reliable (see --help for details)

prajwal89 commented Oct 19, 2023

I've developed a tool for checking GitHub repository sizes by leveraging the GitHub API.
You can find the tool here:
GitHub Repository Size Checker

pa-0 commented May 22, 2024

Based on what I've read, GitHub support has indicated that due to Git Alternates and the way files are stored in GitHub repositories, the numbers returned from the GitHub API cannot be relied upon for the actual size.
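
For a repository that is already cloned, one way to see how far the API figure is off is to measure the local object store. The sketch below sums the `size` (loose objects) and `size-pack` (packfiles) fields that `git count-objects -v` reports in KiB. The function name `local_repo_kib` is hypothetical, and a local clone can still differ from what GitHub stores server-side.

```shell
# Sketch: report the on-disk size of a local clone's object store in KiB.
# local_repo_kib is a hypothetical helper name.

local_repo_kib() {
  git -C "$1" count-objects -v |
    awk '/^size: / || /^size-pack: / { total += $2 } END { print total + 0 }'
}

# Example (the path is a placeholder):
# local_repo_kib /path/to/clone
```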

> I've developed a tool for checking GitHub repository sizes by leveraging the GitHub API. You can find the tool here: GitHub Repository Size Checker

Great tool!
Could you add checking based on the last commit?

> I've developed a tool for checking GitHub repository sizes by leveraging the GitHub API. You can find the tool here: GitHub Repository Size Checker

I've been using this tool for quite some time, but lately it throws an error, so I developed another web-based tool.
Site link: Repo Size Checker

> > I've developed a tool for checking GitHub repository sizes by leveraging the GitHub API. You can find the tool here: GitHub Repository Size Checker
>
> I've been using this tool for quite some time, but lately it throws an error, so I developed another web-based tool. Site link: Repo Size Checker

Thanks a lot for this!
