@cmbuckley
Last active May 16, 2018 11:13
Group a list of numbers into discrete buckets
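# bucket.awk: count input values per bucket; "size" is set with -v by the caller

# Round to the nearest multiple of "size"
function bucket(a) {
    return int((a + (size/2)) / size) * size;
}

# Tally the first field of every input line under its bucket
{ buckets[bucket($1)]++ }

# Print each bucket and its count (iteration order is unspecified)
END {
    for (b in buckets) {
        print b, buckets[b]
    }
}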
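#!/bin/bash
# Usage: bucket SIZE [awk options] [file ...]
size="$1"
shift

# Reject anything that is not a positive integer (0 would divide by zero in awk)
if ! [ "$size" -gt 0 ] 2>/dev/null; then
    echo "Bucket size must be a positive integer" >&2
    exit 1
fi

# Remaining arguments go straight to awk; bucket.awk is looked up in the current directory
awk -v size="$size" -f bucket.awk "$@"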
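
The rounding expression in bucket.awk adds half a bucket and then truncates, so exact halves round up. Because awk's `int()` truncates toward zero, negative inputs are not rounded to the nearest multiple. A minimal sketch that evaluates the same expression for single values (`round_demo` is a hypothetical name used only for this illustration):

```shell
# Evaluate the same rounding expression as bucket.awk for one value.
# round_demo is a hypothetical helper, not part of the gist.
round_demo() {
    awk -v size="$1" -v a="$2" '
        BEGIN { print int((a + (size / 2)) / size) * size }
    '
}

round_demo 10 123   # 12.8 truncates to 12, giving 120
round_demo 10 125   # 13.0 stays 13, giving 130
round_demo 10 -27   # -2.2 truncates to -2, giving -20 (the nearest multiple is -30)
```

If negative values matter, the expression would need a floor rather than a truncation; the script as written is best suited to non-negative input.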
@cmbuckley
Author

Example usage:

cat ids
123
342
546
567
352
234
234
567

bucket 10 ids
550 1
570 2
340 1
350 1
120 1
230 2
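
The buckets above come out unsorted because awk's `for (b in buckets)` visits keys in an unspecified order. A minimal sketch, assuming `sort` is available, that orders the output by bucket value. The function name `bucket_sorted` is hypothetical, and the awk program is inlined only to keep the snippet self-contained:

```shell
# Hypothetical helper: same bucketing as bucket.awk, inlined here,
# with the output sorted numerically by bucket value.
bucket_sorted() {
    size="$1"
    shift
    # Remaining arguments are treated as input files; with none, awk reads stdin.
    awk -v size="$size" '
        function bucket(a) { return int((a + (size / 2)) / size) * size }
        { buckets[bucket($1)]++ }
        END { for (b in buckets) print b, buckets[b] }
    ' "$@" | sort -n
}

# Same values as the ids file in the example.
printf '%s\n' 123 342 546 567 352 234 234 567 | bucket_sorted 10
# 120 1
# 230 2
# 340 1
# 350 1
# 550 1
# 570 2
```

With the original script the same effect is available as `bucket 10 ids | sort -n`.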

The shell script and the awk script are kept separate so that extra awk options can be passed through the wrapper, e.g.:

bucket 10 -v 'OFS=,' ids
550,1
570,2
340,1
350,1
120,1
230,2
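
This works because the awk program is loaded with `-f`, so awk still parses any options that follow it, and the wrapper forwards everything after the size via `"$@"`. A self-contained sketch of that mechanism, assuming `mktemp` is available; `bucket_demo` and the temporary program file are hypothetical stand-ins for the `bucket` script and bucket.awk:

```shell
# Write the awk program to a temporary file so it can be loaded with -f,
# as the bucket script does with bucket.awk.
prog="$(mktemp)"
trap 'rm -f "$prog"' EXIT
cat > "$prog" <<'EOF'
function bucket(a) { return int((a + (size / 2)) / size) * size }
{ buckets[bucket($1)]++ }
END { for (b in buckets) print b, buckets[b] }
EOF

# Hypothetical stand-in for the bucket script.
bucket_demo() {
    size="$1"
    shift
    # Everything left in "$@" is handed to awk untouched: options such as
    # -v 'OFS=,' are parsed by awk, and any remaining words are input files.
    awk -v size="$size" -f "$prog" "$@"
}

# With no file operand, awk reads standard input.
printf '%s\n' 123 234 234 | bucket_demo 10 -v 'OFS=,' | sort
# 120,1
# 230,2
```

Had the program been given inline as a quoted string instead of with `-f`, words after it would be treated as file names rather than options, which is one reason to keep the two files separate.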
