@psu
Last active March 1, 2024 15:18
MILLER commands
#########################################
# MILLER commands -- Pontus Sundén 2024 #
#########################################
# Transpose a two-row CSV (header row plus one data row) into a tab-separated key/value list, one pair per line
mlr --c2d --ops tab --ofs newline cat
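# Example sketch with made-up sample data, assuming mlr is on the PATH:
printf 'a,b,c\n1,2,3\n' | mlr --c2d --ops tab --ofs newline cat
# prints three lines: a<TAB>1, b<TAB>2, c<TAB>3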
# Change each occurrence of value X to Y (apply takes the map, here $*, as its first argument)
mlr --csv put '$* = apply($*, func(k,v){if(v=="X"){v="Y"} return {k:v}})'
# Cleaning verbs: trim whitespace, blank out zero values, then drop columns that are entirely empty
mlr --csv clean-whitespace then put '$*=apply($*,func(k,v){if(v==0){v=""}return {k:v}})' then remove-empty-columns
# Distinct on col1,col2: keep only the first row of each group
mlr --csv head -n 1 -g col1,col2
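# Example sketch with hypothetical sample data: the second a,b row is dropped
printf 'col1,col2,x\na,b,1\na,b,2\nc,d,3\n' | mlr --csv head -n 1 -g col1,col2
# prints: col1,col2,x / a,b,1 / c,d,3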
# Exclude columns with empty header
mlr --csv cut -x -r -f '^_\d+$|^$'
# Filter by field value
mlr --csv filter '$col=~"^7"'
# Merge several CSV files
mlr --csv unsparsify *.csv
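# Left join on column JOIN_COLUMN: keeps every record in RIGHT.csv (Miller's main input) and adds matching values from LEFT.csv (the -f file)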
mlr --csv join --ur -j JOIN_COLUMN -f "LEFT.csv" then unsparsify then uniq -a "RIGHT.csv" > all-records-in-RIGHT-complemented-with-values-from-LEFT.csv
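# Example sketch with hypothetical files in a temp directory (id stands in for JOIN_COLUMN):
tmp=$(mktemp -d)
printf 'id,name\n1,alice\n2,bob\n' > "$tmp/LEFT.csv"
printf 'id,score\n1,90\n3,70\n' > "$tmp/RIGHT.csv"
mlr --csv join --ur -j id -f "$tmp/LEFT.csv" then unsparsify then uniq -a "$tmp/RIGHT.csv"
# id 1 should pick up name=alice from LEFT; id 3 has no match, so its name stays empty; bob (only in LEFT) is dropped
rm -r "$tmp"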
# Inverted left join (anti-join) on column JOIN_COLUMN: keeps only the records in LEFT.csv that have no match in RIGHT.csv
mlr --csv join --np --ul -j JOIN_COLUMN -f "LEFT.csv" "RIGHT.csv" > records-in-LEFT-not-found-in-RIGHT.csv
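# Example sketch with hypothetical files in a temp directory (id stands in for JOIN_COLUMN):
tmp=$(mktemp -d)
printf 'id,name\n1,alice\n2,bob\n' > "$tmp/LEFT.csv"
printf 'id,score\n1,90\n' > "$tmp/RIGHT.csv"
mlr --csv join --np --ul -j id -f "$tmp/LEFT.csv" "$tmp/RIGHT.csv"
# prints: id,name / 2,bob (bob is the only LEFT record without a match in RIGHT)
rm -r "$tmp"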
# Convert data copied from a spreadsheet (tab-separated) to semicolon-separated CSV
mlr --csv --ifs tab --ofs semicolon remove-empty-columns then skip-trivial-records
# The clipboard data in macOS seems to be an RFC 4180-compliant format (i.e. CSV) with tabs instead of commas.
# Miller handles TSV files differently from CSV files: https://github.com/johnkerl/miller/issues/238
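# Example sketch: printf stands in for the clipboard here; on macOS the input would presumably be piped from pbpaste instead
printf 'name\tcity\nAnn\tOslo\n' | mlr --csv --ifs tab --ofs semicolon remove-empty-columns then skip-trivial-records
# prints: name;city / Ann;Oslo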
# Miller documentation
# https://miller.readthedocs.io/en/latest/reference-main-flag-list/#csvtsv-only-flags
# https://miller.readthedocs.io/en/latest/file-formats/#csvtsvasvusvetc
# https://miller.readthedocs.io/en/latest/record-heterogeneity/