First, fetch the published file and prepare it for comparison
# Fetch file
curl 'https://data.phila.gov/resource/6quj-54t7.csv?$limit=5000' | \
# Fix currency formatting
sed -E 's/,\$([0-9]+)\.00/,\1/g' | \
# Sort
body csvsort > data/out/published.csv
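The `body` command used in the pipeline above is not a standard utility. A minimal sketch, assuming it is the common shell helper that prints the header line unchanged and runs the given command on the remaining lines (this definition is an assumption, not taken from the repo):

```shell
# Assumed definition of `body`: pass the first line (the CSV header)
# through untouched, then run the supplied command on the rest of stdin.
body() {
    IFS= read -r header
    printf '%s\n' "$header"
    "$@"
}
```

With this definition, `printf 'name,total\nb,2\na,1\n' | body sort` keeps `name,total` as the first line and sorts only the rows below it.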
Then run clean.js from the fy16-adopted repo, which processes
data/in/FY2016-adopted-unique.csv into data/out/FY2016-adopted-unique.csv
node clean.js
Then sort the result, writing it to its own file so that published.csv is not overwritten
cat data/out/FY2016-adopted-unique.csv | body csvsort > data/out/FY2016-adopted-unique-sorted.csv
Comparing the two sorted files shows a few differences. Nearly all are
added lines with 0 totals, but two lines are removed for no apparent reason.
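The comparison step itself is not shown. One way it could be done is sketched below, assuming both files are already sorted and that the cleaned output was sorted into a file separate from published.csv. The helper name `compare_sorted` is hypothetical.

```shell
# Hypothetical helper: list only the rows that differ between two sorted files.
# Rows prefixed with "<" exist only in the first file, ">" only in the second.
compare_sorted() {
    # diff exits non-zero when the files differ and grep exits non-zero when
    # nothing matches, so `|| true` keeps the helper's exit status at 0.
    diff "$1" "$2" | grep -E '^[<>]' || true
}
```

For example, `compare_sorted data/out/published.csv data/out/FY2016-adopted-unique-sorted.csv`, where the second path stands in for wherever the sorted cleaned output was written.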