I wrangle a lot of spreadsheet data but don't always want to use Google Sheets, so I habitually convert it to CSV and process it locally.
This is a list of some of my favourite tools. It's non-exhaustive and biased towards my preference for terminal-based user interfaces.
If you install only one tool, make it VisiData.
- handles many formats: JSON, YAML, XLSX, Parquet, SQLite, PostgreSQL, zip
- pythonic
- extensible
- fast
- reliable
- handles multi-tab Excel files
- can open a zip file containing JSON, CSV, YAML ...
Assumes you have Python 3. It has a bit of a learning curve because of all the key mappings, but there is a menu system.
pip3 install visidata
vd foo.bar
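VisiData is mostly used interactively, but it can also convert between formats without opening the interface. A minimal sketch, assuming the `-b` (batch) and `-o` (output) options of current VisiData releases; `people.csv` is a made-up sample file, not something from my own data:

```shell
# Made-up sample data so the command has something to read.
cat > people.csv <<'EOF'
id,name,city
1,Ada,London
2,Linus,Helsinki
3,Grace,New York
EOF

if command -v vd >/dev/null 2>&1; then
  # -b skips the interactive UI; -o picks the output format from the file extension.
  vd -b people.csv -o people.json
else
  echo "vd not found; install it with: pip3 install visidata"
fi
```

The same pattern should work for the other formats VisiData can save, since the output type follows the extension given to `-o`.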
https://csvkit.readthedocs.io/en/latest/
Command-line tools for Unix-style wrangling: csvgrep, csvcut, csvsort, csvlook, csvjoin
- convert json, xlsx to csv
- filter, join, format, stack files
Assumes Python 3.
pip3 install csvkit
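A rough sketch of how the csvkit tools chain together in a pipeline. The file names and their contents are invented for illustration; the commands (`in2csv`, `csvcut`, `csvgrep`, `csvsort`, `csvjoin`, `csvlook`) are the ones csvkit ships:

```shell
# Made-up sample data.
cat > people.csv <<'EOF'
id,name,city
1,Ada,London
2,Linus,Helsinki
3,Grace,New York
EOF

cat > orders.csv <<'EOF'
id,total
1,30
3,12
EOF

cat > pets.json <<'EOF'
[{"id": 1, "pet": "cat"}, {"id": 2, "pet": "dog"}]
EOF

if command -v csvcut >/dev/null 2>&1; then
  # Convert a JSON list of objects to CSV.
  in2csv -f json pets.json > pets.csv

  # Keep two columns, keep rows whose city matches, pretty-print as a table.
  csvcut -c name,city people.csv | csvgrep -c city -m London | csvlook

  # Sort by the name column.
  csvsort -c name people.csv

  # Join the two files on their shared id column.
  csvjoin -c id people.csv orders.csv
else
  echo "csvkit not found; install it with: pip3 install csvkit"
fi
```

Every tool reads CSV on stdin and writes CSV on stdout, so `csvlook` usually goes last, only for display.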
https://github.com/BurntSushi/xsv
xsv is a command-line program for indexing, slicing, analyzing, splitting and joining CSV files.
If your data is too big for csvkit, try this.
brew install xsv
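A short sketch of the xsv subcommands I would reach for first. The sample files and their numbers are invented; on a file this small the index makes no visible difference, it only pays off on large inputs:

```shell
# Made-up sample data (the population figures are placeholders, not real statistics).
cat > cities.csv <<'EOF'
city,country,population
Helsinki,Finland,650000
London,UK,8900000
Oslo,Norway,700000
Tampere,Finland,240000
EOF

cat > countries.csv <<'EOF'
country,continent
Finland,Europe
Norway,Europe
UK,Europe
EOF

if command -v xsv >/dev/null 2>&1; then
  # Build cities.csv.idx so later commands can seek instead of scanning.
  xsv index cities.csv

  # Row count, excluding the header.
  xsv count cities.csv

  # Rows 1 and 2 (zero-based, end exclusive), aligned for reading.
  xsv slice -s 1 -e 3 cities.csv | xsv table

  # Regex search restricted to one column, then keep two columns.
  xsv search -s country Finland cities.csv | xsv select city,population

  # Per-column summary statistics.
  xsv stats cities.csv | xsv table

  # Join on the country column of both files.
  xsv join country cities.csv country countries.csv
else
  echo "xsv not found; install it with: brew install xsv"
fi
```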
- exposes a wealth of cloud APIs as Postgres SQL tables
- dozens of plugins
- makes CSV files available to Postgres
- supports third-party tools like pgAdmin and psql
Steampipe uses a pluggable Postgres foreign data wrapper to query CSV, JSON and YAML files, as well as AWS, Azure, Prometheus, etc.
brew install steampipe
steampipe plugin install csv
steampipe query
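Before the CSV plugin returns anything it needs to be told where the files live. A hedged sketch, based on two assumptions worth checking against the plugin docs: that the connection config lives at `~/.steampipe/config/csv.spc`, and that each CSV file becomes a table named after the file. The `data/people.csv` file is invented for illustration:

```shell
# Made-up sample data.
mkdir -p data
cat > data/people.csv <<'EOF'
id,name,city
1,Ada,London
2,Linus,Helsinki
EOF

# Connection config for the CSV plugin, written to a local example file
# so that an existing ~/.steampipe/config/csv.spc is not overwritten.
cat > csv.spc.example <<EOF
connection "csv" {
  plugin = "csv"
  paths  = ["$PWD/data/*.csv"]
}
EOF

if command -v steampipe >/dev/null 2>&1 \
   && cmp -s csv.spc.example "$HOME/.steampipe/config/csv.spc" 2>/dev/null; then
  # Assumption: data/people.csv is exposed as the table csv.people.
  steampipe query "select name, city from csv.people where city = 'London'"
else
  echo "Copy csv.spc.example to ~/.steampipe/config/csv.spc, then rerun this script."
fi
```

Running `steampipe query` with no arguments opens the interactive shell instead, which is handier for exploring what tables exist.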