@ianmilligan1
Created February 11, 2016 21:35
CSV Filtering
pip install csvfilter
## grabs the year and language fields
csvfilter -f 2,5 derivative-data.csv > year_language.csv
## sorts the year,language pairs and counts how often each one occurs
## (uniq -c prefixes every line with its count, so the output is "count year,language")
sort year_language.csv | uniq -c > sorted_year_language.csv
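## The uniq -c output above is space-prefixed rather than comma-separated.
## A minimal sketch of turning it into real CSV rows, assuming no field contains
## spaces; the sample_* file names and the sample rows are hypothetical.

```shell
## hypothetical sample standing in for year_language.csv
printf '2005,en\n2005,en\n2005,fr\n2006,en\n' > sample_year_language.csv

## count each year,language pair, then move the count to the last column
sort sample_year_language.csv | uniq -c | awk '{ print $2 "," $1 }' > sample_counts.csv

cat sample_counts.csv
```

## Each row then reads year,language,count and can be split on commas alone.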
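## pulls each language out, arranged by year
## (-w matches "en" / "fr" as whole words only, so other lines that merely contain those letters are skipped)
grep -w "en" sorted_year_language.csv
grep -w "fr" sorted_year_language.csv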
## counts total number of items per year so you can normalize if you want
## (reads years.csv as year,count rows and sums the counts for each year)
awk -F, '{a[$1]+=$2;}END{for(i in a)print i", "a[i];}' years.csv | sort -d
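## One way the normalization itself might look: a sketch that reads the
## "count year,language" lines twice, first to total each year, then to print
## every pair with its share of that year. The sample file, its rows, and
## normalized.csv are hypothetical, and the input is assumed to have no
## carriage returns.

```shell
## hypothetical sample in the shape that uniq -c emits
cat > sample_sorted.txt <<'EOF'
      3 2005,en
      1 2005,fr
      2 2006,en
      2 2006,fr
EOF

## pass 1 (NR == FNR) sums counts per year; pass 2 prints year,language,count,share
awk '{ split($2, f, ","); n = $1 }
     NR == FNR { total[f[1]] += n; next }
     { printf "%s,%s,%d,%.2f\n", f[1], f[2], n, n / total[f[1]] }' \
    sample_sorted.txt sample_sorted.txt > normalized.csv

cat normalized.csv
```

## Passing the same file twice keeps everything in a single awk call and preserves the input order.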