@davelacy
Last active October 20, 2022 19:07
Unique active committer count for non-GitHub customers
#! /usr/bin/env bash
# ***DISCLAIMER: Only tested on MacOS & Linux***
# Author: Dave Lacy (@davelacy)
# Created: 2022-05-17
# Updated: 2022-05-19
# Purpose: Provide a best-guess estimate of the number of unique committers for GHAS pricing estimates, for organizations currently using third-party Git hosting products such as ADO, GitLab, Bitbucket, etc.
# Before use, make sure you:
# - Have stashed or committed any work in progress in each repo
# - Have cloned the repos you want to include into one directory
# - Have placed this script in the root of the directory containing your cloned repos
# Options:
# --since YYYY-MM-DD (defaults to 90 days before today when omitted)
# --branch master (when omitted, the script uses each repo's default branch)
# How to use:
# 1. cd into a directory containing the cloned repos you want to count active committers for
# 2. Add this script and run it with `bash ./unique-commiters.sh`
# 3. An estimated count is printed in the terminal output
# 4. Review the list of names written to the active_committers.txt file
# Examples:
# bash ./unique-commiters.sh
# bash ./unique-commiters.sh --since 2020-01-01
# bash ./unique-commiters.sh --since 2020-01-01 --branch main
# Caveats:
# If two or more committers share the same name (typically first and last), they are counted as one person because names are de-duplicated, so the count will be too low
# create and overwrite active_committers.txt
echo "" > active_committers.txt
# get since and branch flags and set to variables
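# default to 90 days ago: try BSD/macOS date syntax first, then fall back to GNU date (Linux)
since=$(date -v -90d +%Y-%m-%d 2>/dev/null || date -d '90 days ago' +%Y-%m-%d)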
while [[ $# -gt 0 ]]
do
  key="$1"
  case "$key" in
    --since)
      since="$2"
      shift # past argument
      ;;
    --branch)
      branch="$2"
      shift # past argument
      ;;
    *)
      # unknown option
      ;;
  esac
  shift # past argument or value
done
# print the date variable
echo "Finding active committers since: $since"
# if there are no directories in this directory, print error message
if [ "$(ls -d */ 2>/dev/null)" = "" ]; then
  echo "Please ensure this script is run in a directory containing your repositories"
  exit 1
fi
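git_command() {
  # if no --branch was given, use the default branch recorded for origin,
  # falling back to whatever branch is currently checked out
  if [ -z "$branch" ]; then
    git_branch=$(git symbolic-ref --quiet --short refs/remotes/origin/HEAD 2>/dev/null | sed 's|^origin/||')
    [ -n "$git_branch" ] || git_branch=$(git rev-parse --abbrev-ref HEAD)
  else
    git_branch=$branch
  fi
  # alternative that queries the remote: git remote show origin | sed -n '/HEAD branch/s/.*: //p'
  # append this repo's unique author names (%an) to the shared output file
  git checkout "$git_branch" && git pull && git log --pretty="%an" --since="$1" | sort -u >> ../active_committers.txt
  echo "" >> ../active_committers.txt
}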
# loop over each directory and run the git_command function from inside each directory
for directory in ./*/ ; do
  # skip anything that is not a git repository
  [ -e "$directory/.git" ] || continue
  cd "$directory" || continue
  git_command "$since"
  cd ..
done
# Remove the lines that include the word "dependabot"
awk '!/dependabot/' active_committers.txt > tmpfile && mv tmpfile active_committers.txt
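# Remove blank lines and duplicates, keeping one copy of each name
# (uniq -u would instead drop every name that appears in more than one repo)
awk 'NF' active_committers.txt | sort -u > tmpfile && mv tmpfile active_committers.txt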
# print the number of lines in active_committers.txt as the approximate committer count
echo "Approximate number of active committers: $(wc -l active_committers.txt | awk '{print $1}')"
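
The de-duplication step decides the final count, because the same person often commits to several repos. The following is a minimal sketch, not part of the original script, of how `sort -u` and `sort | uniq -u` behave on that kind of input. The helper name `count_unique_names` and the sample names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the gist): count distinct non-blank,
# non-dependabot names read from stdin.
count_unique_names() {
  awk 'NF && !/dependabot/' | sort -u | wc -l | awk '{print $1}'
}

# Made-up sample: Alice appears in two repos, separated by a blank line.
sample=$(printf '%s\n' "Alice Smith" "Bob Jones" "" "Alice Smith" "dependabot[bot]")

# sort -u keeps one copy of every name, so this prints 2.
printf '%s\n' "$sample" | count_unique_names

# uniq -u keeps only names that occur exactly once, so Alice is dropped
# entirely and this prints 1.
printf '%s\n' "$sample" | awk 'NF && !/dependabot/' | sort | uniq -u | wc -l | awk '{print $1}'
```

With `uniq -u`, anyone active in more than one repo disappears from the list, so the estimate comes out too low.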