@cevaris
Created March 11, 2015 14:58
Naive sampling of file for bash/zsh
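# Sample file: print each line with probability <SAMPLE_SIZE>
samplef() {
  # set -x
  if [ -z "${1}" ]; then
    echo 'Error: Missing file path'
    echo
    echo 'Usage:'
    echo 'samplef <FILEPATH> <SAMPLE_SIZE>'
    echo
    echo 'Optional parameter: <SAMPLE_SIZE> (fraction of lines to keep), default is 0.1'
    echo 'e.g. samplef ./myfile.txt 0.25'
    return 1
  fi
  local sample_ratio="${2:-0.1}"
  # Pass the ratio as an argument instead of interpolating it into the Perl code,
  # and quote the path so file names with spaces work.
  perl -ne 'BEGIN { $r = shift @ARGV } print if rand() < $r' "$sample_ratio" "$1"
}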

cevaris commented Mar 11, 2015

Just drop it into your ~/.bashrc or ~/.zshrc.

To dump a random sample of roughly 25% of your file's lines:
samplef ./myfile.txt 0.25

By default it dumps roughly 10 percent of the lines:
samplef ./myfile.txt
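
Because every line is kept or dropped independently, the output size is only approximately the requested fraction. A minimal sketch of that behaviour, assuming `seq`, `wc`, and `perl` are available; `sample_lines` is a hypothetical stdin-reading helper written for this illustration, not part of the gist:

```shell
# Hypothetical helper: the same per-line test as samplef, reading from stdin.
# A line is printed when rand() falls below the ratio given as the first argument.
sample_lines() {
  perl -ne 'BEGIN { $r = shift @ARGV } print if rand() < $r' "$1"
}

# Sample 1000 numbered lines at 0.25 and count what survives.
# The count lands near 250 but changes from run to run.
seq 1 1000 | sample_lines 0.25 | wc -l
```

If an exact number of lines is needed, a different tool is required; this approach trades precision for a single streaming pass over the file.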

Credit goes to http://stackoverflow.com/a/692317/3538289