You got your hands on some data that was leaked from a social network, and you want to help the poor people affected.
Luckily you know a government service to automatically block a list of credit cards.
The service is a little old school though and you have to upload a CSV file in the exact format. The upload fails if the CSV file contains invalid data.
The CSV files should have two columns, Name and Credit Card. Also, it must be named after the following pattern:
YYYYMMDD.csv.
The leaked data doesn't have credit card details for every user and you need to pick only the affected users.
The data was published here:
You don't have much time to act.
What tools would you use to get the data, format it correctly and save it in the CSV file?
Do you have a crazy vim configuration that allows you to do all of this inside your editor? Are you a shell power user and write this as a one-liner? How would you solve this in your favorite programming language?
Show me your solution in the comments below!
Thank you all for participating!
I never thought so many people would be willing to submit a solution. This is exactly the overview of different technologies and ways of thinking I was hoping to get.
We have solutions without any coding, solutions in one line of code and solutions with over a hundred lines.
I hope everyone else also learned something new by looking at these different styles!
Make sure to also check out the solutions on Hacker News, Reddit (and /r/haskell) and dev.to!
Cheers, Jorin
Probably just use awk. Download separately. Do a few test/refinement runs on the first 10/100 lines, output a CSV and call it good. At least if I only had to do it once.
For something longer-term, I'd grab a JSON (parser) and CSV (formatter) library in... whatever language (Haskell is my favorite right now, but its advantages are not leveraged here) and do it that way. Once I was happy with the results of a slurp/process/dump with everything in memory, I'd try to refactor to something streaming, in case future dumps were too large to process without swapping.
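For illustration, the in-memory slurp/process/dump version might look roughly like this in Python (picked here only because the standard library covers both JSON and CSV). This is a sketch under assumptions, not a reference solution: the dump format isn't spelled out above, so the field names `name` and `creditcard`, the top-level JSON array, the header row, and the helper names `affected_rows`, `csv_filename` and `convert` are all made up for the example.

```python
import csv
import json
from datetime import date


def affected_rows(records):
    """Yield (name, credit card) pairs for records that carry a card number."""
    for record in records:
        name = record.get("name")        # assumed field name
        card = record.get("creditcard")  # assumed field name
        # Skip users without card details; also skip nameless records,
        # on the assumption that the upload would reject an empty Name.
        if name and card:
            yield name, card


def csv_filename(day):
    """Build the YYYYMMDD.csv file name for the given date."""
    return day.strftime("%Y%m%d") + ".csv"


def convert(json_path, day=None):
    """Slurp the JSON dump, keep the affected users, dump them as CSV."""
    with open(json_path, encoding="utf-8") as source:
        records = json.load(source)  # assumes a top-level array of objects

    out_path = csv_filename(day or date.today())
    with open(out_path, "w", newline="", encoding="utf-8") as target:
        writer = csv.writer(target)  # handles quoting of commas in names
        writer.writerow(["Name", "Credit Card"])  # assumed header row
        writer.writerows(affected_rows(records))
    return out_path
```

Calling `convert("dump.json")` (with `dump.json` standing in for the downloaded data) would write a file named after today's date into the current directory. Since `json.load` reads the whole dump into memory, this is the "slurp" stage; a streaming refactor would replace only that part and could keep `affected_rows` as is.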