Instantly share code, notes, and snippets.

Embed
What would you like to do?
This is a little challenge to find out which tools programmers use to get their everyday tasks done quickly.

You got your hands on some data that was leaked from a social network and you want to help the poor people.

Luckily you know a government service to automatically block a list of credit cards.

The service is a little old school though and you have to upload a CSV file in the exact format. The upload fails if the CSV file contains invalid data.

The CSV files should have two columns, Name and Credit Card. Also, it must be named after the following pattern:

YYYYMMDD.csv.

The leaked data doesn't have credit card details for every user and you need to pick only the affected users.

The data was published here:

https://gist.githubusercontent.com/jorinvo/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json

You don't have much time to act.

What tools would you use to get the data, format it correctly and save it in the CSV file?


Do you have a crazy vim configuration that allows you to do all of this inside your editor? Are you a shell power user and write this as a one-liner? How would you solve this in your favorite programming language?

Show me your solution in the comments below!

Update

Thank you all for participating!

I never thought so many people might be willing to submit a solution. This is exactly the overview about different technologies and ways of thinking I anticipated to get.

We have solutions without any coding, solutions in one line of code and solutions with over a hundred lines.

I hope everyone else also learned something new by looking at this different styles!

Make sure to also checkout the solutions on Hackernews, Reddit (and /r/haskell) and dev.to!

Cheers, Jorin

@jorinvo

This comment has been minimized.

Owner

jorinvo commented Apr 25, 2015

If you want to check your result you could download my solution and compare them (cmp --silent $one $two || echo "files are different").

I guess I would use Ruby in this case:

require 'open-uri'
require 'json'
require 'csv'
require 'date'

uri = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'
file = "#{Date.today.strftime '%Y%m%d'}.csv"

CSV.open file, 'wb', headers: %w(name creditcard), write_headers: true do |csv|
  JSON.parse(open(uri).read)
    .each { |x| csv << x if x['creditcard'] }
end
@xat

This comment has been minimized.

xat commented Apr 25, 2015

Here is a iojs solution:

#!/usr/bin/env iojs --harmony_arrow_functions

var got = require('got')
var _ = require('highland')
var csv = require('csv-write-stream')
var JSONStream = require('JSONStream')
var dateformat = require('dateformat')
var fs = require('fs')

_(got(process.argv[2]))
  .through(JSONStream.parse('*'))
  .filter(doc => !!doc.creditcard)
  .through(csv({ headers: ['name', 'creditcard'] }))
  .pipe(fs.createWriteStream(dateformat('yyyymmdd') + '.csv'))
@fbcom

This comment has been minimized.

fbcom commented Apr 25, 2015

Here is a somewhat more lengthly solution in python:

#!/usr/bin/env python
import urllib, json, csv, requests, time

src_url = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
dst_url = "https://not.given.yet"
tempfile = time.strftime("%Y%m%d.csv")

with open(tempfile, 'wb') as csvfile:
  fh = csv.writer(csvfile, delimiter=',')
  for line in json.loads(src_urllib.urlopen(url).read()):
    if 'name' in line and 'creditcard' in line:
      if line['name'] != None and line['creditcard'] != None:
        fh.writerow([line['name'], line['creditcard']])

files = {'file': open(tempfile, 'rb')}
res = requests.post(dst_url, files=files)
print res.text
@chengsun

This comment has been minimized.

chengsun commented Apr 25, 2015

Why complicate things?

echo "name,creditcard" > 20150425.csv
sed -n 's/^{"name":"\([^"]*\)".*"creditcard":"\([0-9\-]*\)".*$/\1,\2/p' data.json >> 20150425.csv
@jeremyis

This comment has been minimized.

jeremyis commented Apr 25, 2015

vim. Probably shorter ways to do it, but this is all interactive so you can watch what you're doing when you record the macro.:

delete open and close [, ]

Gdd
1Gdd

Record macro:

qa
9x # delete before name
f" # go to after name
d/creditcard:" # delete until creditcard
df" # delete text before 'creditcard:"'
c2f" # delete 'creditcard:":"
, # add comma
f"d$ # delete rest of line
j^ # go to start of next line
q # end recording
9999@a # repeat 999 times

write to file

:w 20150425.csv

@th0br0

This comment has been minimized.

th0br0 commented Apr 25, 2015

Why not just use jq for a simple one-liner?
The CSV column values include quotes, but that's usually valid CSV and leads to the same parse result. (If not, well,sed -i 's/"//g' $name)

name=`date +'%Y%m%d'`; echo "name,creditcard" > $name; jq -r '.[] | select(.creditcard) | [.name, .creditcard] | @csv' data.json >> $name
@nlohmann

This comment has been minimized.

nlohmann commented Apr 25, 2015

Here is a shameless plug for my JSON for Modern C++ project:

#include <fstream>
#include <algorithm>
#include <chrono>
#include <iomanip>
#include <sstream>
#include <json.hpp> // from https://github.com/nlohmann/json

using nlohmann::json;

int main()
{
    // get filename
    auto now =  std::chrono::system_clock::to_time_t(std::chrono::system_clock::now());
    std::stringstream ss;
    ss << std::put_time(std::localtime(&now), "%Y%m%d");
    auto filename = ss.str() + ".CSV";

    // open outfile
    std::ofstream output(filename);

    // parse JSON
    auto data = std::ifstream("data.json");
    json j_data;
    j_data << data;

    // iterate
    std::for_each(j_data.begin(), j_data.end(), [&output](json& value)
    {
        // check if "creditcard" key is present
        if (not value["creditcard"].is_null())
        {
            output << value["name"] << "," << value["creditcard"] << std::endl;;
        }
    });
}

C++ may not be the prettiest language for that (creating the filename is really ugly...), but its possible to build something quickly.

@vitorbaptista

This comment has been minimized.

vitorbaptista commented Apr 25, 2015

Here using jq, pretty similar to what @chengsun did. This only works if all timestamps are from 2015/04/25.

echo 'Name,Credit Card' > 20150425.csv
jq -r '.[] | [.name, .creditcard] | @csv' data.json >> 20150425.csv

Edit: oh, @th0br0 was faster than me 😄

@jamestomasino

This comment has been minimized.

jamestomasino commented Apr 25, 2015

I tried to go with a bash only solution using only stuff installed by default: curl, sed, awk, (cmp for validation). No json parsing in this one, just string manipulation.

#!/bin/bash

# Prepare output
file="$(date +"%Y%m%d").csv"
echo "name,creditcard" > "$file"

# Parse data
curl -s "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json" | sed -n '/creditcard/p' | sed '/card\":null/d' | awk -F "\"*,\"*" '{print $1 ":" $6}' | awk -F "\"*:\"*" '{print $2 "," $4}' | sed 's/\"\}//' >> $file

# Test Result
cmp <(curl -s "https://gist.githubusercontent.com/jorin-vogel/09c6e7dcd7a70a475fa7/raw/44bb65f63681211ce17bde16519d5ce3d343af8e/20150425.csv") $file || echo "files are different"

@chengsun has some great sed skills above. Greatly simplifies what I was getting at. Nice!

@useerup

This comment has been minimized.

useerup commented Apr 25, 2015

PowerShell

&{
    "name,creditcard"
    (iwr https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | ConvertFrom-Json) | ? creditcard | %{ "$($_.name),$($_.creditcard)" }
} | Out-File ('{0:yyyyMMdd}.csv' -f (Get-Date))

Edit:

PowerShell has built-in conversion from json to objects. Since a json array is returned it can be handled directly as a collection/stream. The ? creditcard is a filter that only lets objects through if there is a property called creditcard that is non-empty.

The above script produces output that is equal to the "solution".

Edit 2:

A real one-liner (i.e. no fake compund statements) could look like this, if quotes would be ok around values:

(iwr https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | ConvertFrom-Json) | ? creditcard | select name,creditcard | epcsv ('{0:yyyyMMdd}.csv' -f (Get-Date)) -not

This uses the built-in cmdlets

  • iwr (alias for Invoke-WebRequest)
  • ConvertFrom-Json
  • ? (alias for Where-Object)
  • select (alias for Select-Object)
  • epcsv (alias for Export-Csv)
@filmor

This comment has been minimized.

filmor commented Apr 25, 2015

Pandas:

import pandas as pd
df = pd.read_json("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")
df2 = df[['name', 'creditcard']]
df2.columns = ["Name", "Credit Card"]
df2.dropna().to_csv(pd.Timestamp("now").strftime("%Y%m%d") + ".csv", index=False)

And here the same as a "one-liner":

pd.read_json("...")[["name", "creditcard"]].rename(columns={"name": "Name", "creditcard": "Credit Card"}) \
   .dropna().to_csv(pd.Timestamp("now").strftime("%Y%m%d") + ".csv", index=False)
@sawidis

This comment has been minimized.

sawidis commented Apr 25, 2015

import time
import pandas as pd

data_source = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'
data_destination = time.strftime("%Y%m%d.csv")

records = pd.read_json(data_source)
records_with_card = records[records.creditcard.notnull()]
records_with_card[['name', 'creditcard']].to_csv(data_destination, index=False)

Edit: Too late =p

@def-

This comment has been minimized.

def- commented Apr 25, 2015

Nim:

import httpclient, json, times

let file = open(getTime().getLocalTime.format("yyyyMMdd") & ".csv", fmWrite)
for x in parseJson getContent "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json":
  if x["creditcard"].kind != JNull:
    file.writeln x["name"].str, ",", x["creditcard"].str
@dbohdan

This comment has been minimized.

dbohdan commented Apr 25, 2015

@th0br0 Nice!

Tcl:

#!/usr/bin/env tclsh
package require json
package require csv
package require fileutil
set dataByDate {}
foreach entry [::json::json2dict [::fileutil::cat data.json]] {
    set card [dict get $entry creditcard]
    if {$card eq {null}} { continue }

    set seconds [clock scan [dict get $entry timestamp] \
            -format {%Y-%m-%d %H:%M:%S %z}]
    # We don't know what time zone the bank is in, so we will just use the same
    # one all the input seems to.
    set date [clock format $seconds -format %Y%m%d -timezone :Asia/Krasnoyarsk]
    if {![dict exists $dataByDate $date]} {
        dict set dataByDate $date name,creditcard
    }
    dict lappend dataByDate $date [::csv::join \
            [list [dict get $entry name] $card]]
}
foreach {date lines} $dataByDate {
    puts "writing $date.csv"
    ::fileutil::writeFile $date.csv [join $lines \n]\n
}

Edit: simplified the code.

Edit 2: handle dates correctly.

@nautical

This comment has been minimized.

nautical commented Apr 25, 2015

echo '"Name","Card"'; jq -r '.[] | select(.creditcard) | [.name, .creditcard] | @csv' data.json

@vitorbaptista .. u didnt select ..

EDIT
ohh .. @th0br0 he has a similar approach ..

@AntonioMeireles

This comment has been minimized.

AntonioMeireles commented Apr 25, 2015

@th0br0 indicated afaict the sanest/fastest answer. for this kind of datasets/tasks there's hardly anything that beats jq.

fwiw my oneliner wouldn't be too different from his'...

(echo "Name,Credit Card" ; curl -Ls https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | jq -r ".[]| select(.creditcard) | [.name,.creditcard] |@csv" | sed -e 's,",,g' ) > $(date +'%Y%m%d').csv
@lizheming

This comment has been minimized.

lizheming commented Apr 25, 2015

PHP:

<?php 
file_put_contents( 
    date("Ymd").".csv", 
    "name,creditcard\r\n".implode('\r\n', array_map(function($d) { 
        return "{$d['name']}, {$d['creditcard']}";
    }, json_decode(file_gete_contents("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"), 1))) 
);
@LuRsT

This comment has been minimized.

LuRsT commented Apr 25, 2015

Using jq:

echo "Name, Credit Card" >  `date +%Y%m%d`.csv && curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json  | jq '.[] | [.name, .creditcard?]' | jq '.[0] + "," + .[1]' | sed 's/"//g' >> `date +%Y%m%d`.csv

For readability, here is the de-constructed one-liner:

echo "Name, Credit Card" >  `date +%Y%m%d`.csv && \
curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json  | \
jq '.[] | [.name, .creditcard?]' | \
jq '.[0] + "," + .[1]' | \
sed 's/"//g' >> `date +%Y%m%d`.csv
@sergioramos

This comment has been minimized.

sergioramos commented Apr 25, 2015

An alternative node approach:

const HREF = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json';

var timethat = require('timethat');
var pump = require('pump');
var dateformat = require('dateformat');
var JSON = require('jsonstream2');
var csv = require('csv-write-stream');
var request = require('request');
var filter = require('through2-filter');
var through = require('through2');
var format = require('util').format;
var isString = require('is-string');
var fs = require('fs');


var read = 0;
var wrote = 0;

var _data = request(HREF, {
  json: true
});

var _json = JSON.parse('*');

var _read = through.obj(function(user, enc, fn) {
  read += 1;
  fn(null, user);
});

var _filter = filter.obj(function(user) {
  return isString(user.creditcard);
});

var _wrote = through.obj(function(user, enc, fn) {
  wrote += 1;
  fn(null, user);
});

var _csv = csv({
  headers: ['name', 'creditcard']
});

var _out = fs.createWriteStream(format('%s.csv', dateformat('yyyymmdd')));


var start = new Date();
pump(_data, _json, _read, _filter, _wrote, _csv, _out, function(err) {
  if (err) throw err;

  var took = timethat.calc(start, new Date());
  console.log('Read %d records', read);
  console.log('Wrote %d records', wrote);
  console.log('Took %s', took);
});
@navarr

This comment has been minimized.

navarr commented Apr 25, 2015

PHP! Bam, easy.

#!/usr/bin/php
<?php

$data = json_decode(file_get_contents('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'), true);
$file = fopen(date('Ymd').'.csv', 'w');
$headers = ['Name', 'Credit Card'];
fputcsv($file, $headers);
foreach($data as $person) {
        if (empty($person['creditcard'])) continue;
        fputcsv($file, [$person['name'], $person['creditcard']]);
}
fclose($file);

Assuming that it expects headers in the CSV, if not I'd just exclude the two headers lines.

PHP - Json_decode, file_get_contents, and fputcsv. B-E-A-utiful.

@cgp

This comment has been minimized.

cgp commented Apr 25, 2015

Runs right here in github

dataUrl= "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json";
t="";
$.ajax({url:dataUrl,dataType:'json'}).then(function(data){
  data.forEach(function(i){
     if (i.creditcard) t+=i.name+","+i.creditcard+"\n"
   }); 
 copy(t)
 });

Now it's in your clipboard, save to the correct filename. Given that you are only going to do this once, it seems a little overkill to create the in-browser save.

@Mause

This comment has been minimized.

Mause commented Apr 25, 2015

Fairly short, but beaten by anything with pandas;

import csv, time, requests

URL = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
FIELDS = ['name', 'creditcard']

with open(time.strftime("%Y%m%d.csv"), 'w') as fh:
    dw = csv.DictWriter(fh, FIELDS)
    dw.writeheader()
    dw.writerows(
        {'name': line['name'], 'creditcard': line['creditcard']}
        for line in requests.get(URL).json()
        if line.get('name') and line.get('creditcard')
    )
@PedroSena

This comment has been minimized.

PedroSena commented Apr 25, 2015

"Real" one-liner without ; , && or assuming the data is already on your directory

echo "name,creditcard$(wget -qO- https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | jq -r 'map(select(.creditcard != null) | .name + "," + .creditcard)' | sed -E -e 's/\]|\"|\[| {2,}|,$//g')" > $(date +"%Y%m%d").csv
@jbub

This comment has been minimized.

jbub commented Apr 25, 2015

Example solution with Go:

package main

import (
    "encoding/csv"
    "encoding/json"
    "io/ioutil"
    "log"
    "net/http"
    "os"
    "time"
)

type Record struct {
    Name       string `json:name`
    Email      string `json:email`
    City       string `json:city`
    Mac        string `json:mac`
    Timestamp  string `json:timestamp`
    CreditCard string `json:creditcard`
}

func main() {
    src := "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
    res, err := http.Get(src)
    if err != nil {
        log.Fatal(err)
    }
    defer res.Body.Close()

    body, err := ioutil.ReadAll(res.Body)
    if err != nil {
        log.Fatal(err)
    }

    var records []Record
    err = json.Unmarshal(body, &records)
    if err != nil {
        log.Fatal(err)
    }

    date := time.Now().Format("20060102")
    fp, err := os.Create(date + ".csv")
    if err != nil {
        log.Fatal(err)
    }
    defer fp.Close()

    w := csv.NewWriter(fp)
    for _, record := range records {
        if record.CreditCard != "" {
            w.Write([]string{record.Name, record.CreditCard})
        }
    }

    w.Flush()
    if w.Error() != nil {
        log.Fatal(err)
    }
}
@dbohdan

This comment has been minimized.

dbohdan commented Apr 25, 2015

Here is a slightly convoluted hack that showcases a utility I wrote and GNU Recutils. It relies on sed to convert YAML into the rec format, which happen to be very close in this particular case.

json2yaml data.json | sed -e 's/^- /\n/;s/^  //' | recsel -e 'creditcard != "null"' -p name,creditcard | rec2csv > 20150424.csv

Sadly, the CSV output is "nonstandard" as the fields are needlessly quoted.

@NinoScript

This comment has been minimized.

NinoScript commented Apr 25, 2015

Think Unix, use the right tool that only does one job.
One to download (i.e. curl, wget), one to parse json (e.g. jq), and one to make the date string (i.e. date).

Here, I'm using curl, jq and date.

curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | jq -r '.[] | select(.creditcard!=null) | "\(.name), \(.creditcard)"' > $(date +%Y%m%d).csv

A simple explanation on the jq script: .[] | select(.creditcard!=null) | "\(.name), \(.creditcard)"
.[] from the input, extract what's in the array
| select(.creditcard!=null) select only the objects that do have a creditcard property
| "\\(.name), \\(.creditcard)" change the object to this interpolated string
and jq was called with -r, for raw output, so the "s are not written

EDIT: damn, there were only 2 answers when I wrote this, pressed "comment" and bam! lots of people beat me with jq, haha
EDIT 2: escaped the backslashes

BTW, why is almost everyone including "Name, Credit Card" in the first line of the file?
if I were a government service I would say that the string "Credit Card" is not a valid credit card and reject the invalid data.

@jberger

This comment has been minimized.

jberger commented Apr 25, 2015

A quick script in Perl

use Mojo::Base -strict; # strict warnings                                                    
use Mojo::UserAgent;    # user agent with json parser                                        
use Tie::Array::CSV;    # direct csv file access as perl array                               
use Time::Piece;        # time with methods                                                  

my $data = Mojo::UserAgent->new                                                              
  ->get('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json')
  ->res                                                                                      
  ->json;                                                                                    

tie my @csv, 'Tie::Array::CSV', gmtime->strftime('%Y%m%d.csv');                              
for my $item (@$data) {                                                                      
  next unless $item->{creditcard};                                                           
  push @csv, [@$item{qw/name creditcard/}];                                                  
} 

Full disclosure, Tie::Array::CSV is a module of my own authorship, but I find it very useful. Mojo::UserAgent is part of the Mojolicious web toolkit.

@demoore

This comment has been minimized.

demoore commented Apr 25, 2015

Here's my solution using emacs. I would love to see shorter versions.

M-x keep-lines creditcard remove lines with no credit card
M-x flush-lines "creditcard":null remove lines with no credit card number
M-x replace-regexp {"name":"\([^"]*\)".*creditcard":"\([^"]*\).* \1,\2 regex lines to name,creditcard
M-S-, C-j C-p name,creditcard column names at the top
C-x C-s 20150425.csv save the file

@kemiller2002

This comment has been minimized.

kemiller2002 commented Apr 25, 2015

F#:

r@"C:\Temp\jsondotnet\Newtonsoft.Json.dll"

open System.IO
open System.Net

type Person = {name:string; email:string; city:string; mac:string; timestamp:System.DateTime; creditcard:string}

let rawData = File.ReadAllText "C:\temp\data.json"

let data = Newtonsoft.Json.JsonConvert.DeserializeObject(rawData)
|> Seq.where(fun x-> not <| System.String.IsNullOrWhiteSpace(x.creditcard))
|> Seq.map (fun x-> sprintf "%s,%s" x.name x.creditcard)

let date = System.DateTime.Now
let fileName = sprintf "C:\temp%s.csv" <| date.ToString("yyyyMMdd")

File.WriteAllLines (fileName , data)

@cjauvin

This comment has been minimized.

cjauvin commented Apr 25, 2015

Another one in Python (am I the only one who thought at first that the specified format for the output file name somewhat implied that the data had to be aggregated by day?):

import requests
from datetime import date

r = requests.get('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json')
with open('%s.csv' % date.today().isoformat().replace('-', ''), 'w') as f:
    f.write('\n'.join(['%s,%s' % (rec['name'], rec['creditcard'])
                       for rec in r.json()
                       if rec.get('name') and rec.get('creditcard')]))
@prakashk

This comment has been minimized.

prakashk commented Apr 25, 2015

Perl:

perl -0777 -MJSON -le '
    print for "name,creditcard\n",
                  map {"$_->{name},$_->{creditcard}\n"}
                  grep {$_->{creditcard}}
                 @{JSON->new->decode(<>)}
' data.json > 20150425.csv
@dfinke

This comment has been minimized.

dfinke commented Apr 25, 2015

Another PowerShell way

$url = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
(irm $url) |
    Select Name, CreditCard |
    Export-Csv "c:\temp\$((Get-Date).ToString('yyyyMMdd')).csv" -NoTypeInformation
@quai

This comment has been minimized.

quai commented Apr 25, 2015

Perl and Mojolicious, like jberger, but using the ojo module for "one-liner" magic;

perl -Mojo -E 'say "name, creditcard";
for (@{g("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")->json}) {
    say join ",", $_->{'name'}, $_->{'creditcard'} if defined $_->{'creditcard'};
}' > $(date +'%Y%m%d.csv')
@PedroSena

This comment has been minimized.

PedroSena commented Apr 25, 2015

@cjauvin I thought the same thing, spent like 20 minutes trying to make jq group_by work properly. haha, glad to know I'm not the only one

@thorsteneckel

This comment has been minimized.

thorsteneckel commented Apr 25, 2015

Another Perl approach:

use strict;
use warnings;

use LWP::UserAgent;
use JSON;
use Text::CSV;
use POSIX qw(strftime);

my $url = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json';

my $user_agent = LWP::UserAgent->new();
my $response   = $user_agent->get($url);

if ( !$response->is_success ) {
    die $response->status_line;
}

my $file_content = $response->decoded_content( charset => 'none' );

my $json        = JSON->new();
my  $file_array = $json->decode( $file_content );

if (
    ref $file_array ne 'ARRAY'
    || !scalar @{ $file_array }
) {
    die "File does not contain an array as expected or contains any data." ;
}

my $csv        = Text::CSV->new();
my $csv_string = '';
ROW:
for my $row (  @{ $file_array } ) {

    next ROW if ref $row ne 'HASH';
    next ROW if !$row->{creditcard};

    $csv->combine( ( $row->{name}, $row->{creditcard} ) );
    $csv_string .= $csv->string() ."\n";
}

my $filename = strftime "%Y%m%d", localtime;
$filename   .= '.csv';

open(my $fh, '>', $filename);
print $fh $csv_string;
close $fh;
@vladh

This comment has been minimized.

vladh commented Apr 25, 2015

A simple solution in Clojure:

(require '[clj-http.client :as client]
         '[cheshire.core :as json])
(let [url "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
      body (:body (client/get url))
      data (json/parse-string body true)
      data-with-cc (remove #(nil? (:creditcard %)) data)
      rows (map #(str (:name %) "," (:creditcard %)) data-with-cc)
      csv (str "Name,Credit Card\n" (clojure.string/join "\n" rows))
      filename (str (.format (java.text.SimpleDateFormat. "YYYYMMDD") (java.util.Date.)) ".csv")]
  (spit filename csv))
@mandx

This comment has been minimized.

mandx commented Apr 25, 2015

This is just a cleaned-up version of @fbcom's (Python): small optimizations and avoid the use of a temporary file, everything is done in memory.

#!/usr/bin/env python
import json, csv, requests, time
try:
    from cStringIO import StringIO
except ImportError:
    from StringIO import StringIO

src_url = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'
dst_url = 'https://not.given.yet'

csvfile = StringIO()
csv_writer = csv.writer(csvfile, delimiter=',')
for row in json.loads(requests.get(src_url).content):
    if row.get('name', '') and row.get('creditcard', ''):
        csv_writer.writerow([row['name'], row['creditcard']])

csvfile.seek(0)
response = requests.post(dst_url, files={'file': (time.strftime('%Y%m%d.csv'), csvfile, 'text/csv')})
print repr(response), response.content
@fauria

This comment has been minimized.

fauria commented Apr 25, 2015

Using MongoDB:

mongoimport -db challenge data.json; echo 'holders = db.data.find({creditcard: {$ne: null}}); while(holders.hasNext()){ holder = holders.next(); print(holder.name+","+holder.creditcard)}'|mongo challenge --quiet > `date "+%Y%m%d"`.csv
@netsensei

This comment has been minimized.

netsensei commented Apr 25, 2015

I'd use Open Refine (which is GUI tool) to do the job instead of writing a code.

Import the JSON File in Refine as a new project, change the project name to todays' date, filter down the column "creditcard" where value is "not blank", download the resulting list as a CSV file.

I wield messy data on a daily basis, so I use Refine regularly. I do write the occasional script too. The choice depends:

1/ How much time do I have/need to get the job done?
2/ Do I need to automate the job at one point or another?

Given that the first question answers "Not a lot" and second question answers "No", I'd go with Refine instead of code.

@chrismytton

This comment has been minimized.

chrismytton commented Apr 25, 2015

Ruby/shell one-liner:

(echo Name,Credit Card; ruby -rjson -e 'puts JSON.parse(ARGF.read).select{|l|l["creditcard"]}.map{|l|"#{l["name"]},#{l["creditcard"]}"}' <(curl -fsSL http://git.io/vf6xn)) > $(date +'%Y%m%d').csv
@justinclift

This comment has been minimized.

justinclift commented Apr 25, 2015

Just to point out... the CSV format itself isn't completely specified in the RFC, so there may be an acceptable "range" of output here. (just guessing)

@poulter7

This comment has been minimized.

poulter7 commented Apr 25, 2015

@filmor @sawidis
You can preserve your raw data and leave the formatting of your output to to_csv if you want.

import pandas as pd
df = pd.read_json("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")
with_creditcards = df.dropna(subset=['creditcard'])
path = pd.Timestamp("now").strftime("%Y%m%d") + ".csv"
with_creditcards.to_csv(path, index=False, columns=['name', 'creditcard'], header=["Name", "Credit Card"])
@jamesbascle

This comment has been minimized.

jamesbascle commented Apr 25, 2015

C# 1-liner Version.

JsonConvert comes from NewtonSoft.Json NuGet package. CsvWriter/Configuration comes from CsvHelper NuGet package. Written using LINQPad, thus the lack of boiler plate. Throw it in a main method if you want.

new CsvWriter(
    new StreamWriter(
        File.OpenWrite(@"C:\data\" + DateTime.Now.ToString("yyyyMMdd")+".csv")){AutoFlush = true},
        new CsvConfiguration{Delimiter = ",", HasHeaderRecord = true})
.WriteRecords(
    JsonConvert.DeserializeAnonymousType(
        new WebClient().DownloadString("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"),
        new []{new {name = "", email = "", city="", mac="", timestamp="", creditcard=""}})
    .Where(s=>!string.IsNullOrWhiteSpace(s.CreditCard)).Select(s=>new {s.name, s.creditcard})
);

EDIT: added download from server instead of read from disk

@trevordixon

This comment has been minimized.

trevordixon commented Apr 25, 2015

Sublime text + Vintage mode: http://youtu.be/7OOHUbpiX1A

@mrdwab

This comment has been minimized.

mrdwab commented Apr 25, 2015

With R, one approach could be like:

library(jsonlite)
library(dplyr)

URL <- "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"

fromJSON(URL) %>%
  select(name, creditcard) %>%
  filter(!is.na(creditcard)) %>%
  write.csv(file = paste0(gsub("-", "", Sys.Date()), ".csv"), row.names = FALSE)
@rebolyte

This comment has been minimized.

rebolyte commented Apr 25, 2015

Simple solution in REBOL:

REBOL []

do http://www.json.org/json.r

data: JSON-to-REBOL read https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json

out-str: {}

foreach obj data [
    if (not none? obj/name) and (not none? obj/creditcard) [
        append out-str rejoin [obj/name "," obj/creditcard newline]
    ]
]

date-str: rejoin map-each num reduce [now/year now/month now/day] [
    str: to string! num
    if (length? str) = 1 [
        str: rejoin ["0" str]
    ]
    str
]

write to file! rejoin [date-str ".CSV"] out-str
@llama

This comment has been minimized.

llama commented Apr 25, 2015

I would use Sublime Text
converting in sublime

@frantic

This comment has been minimized.

frantic commented Apr 25, 2015

Sublime Text (no extra plugins)

@wetmore

This comment has been minimized.

wetmore commented Apr 25, 2015

Haskell:

 {-# LANGUAGE OverloadedStrings #-}
 {-# LANGUAGE DeriveGeneric #-}

import Prelude hiding (unlines)
import System.Locale (defaultTimeLocale)
import Data.Time (getCurrentTime)
import Data.Time.Format (formatTime)
import GHC.Generics
import Data.Aeson
import Data.Text hiding (map)
import qualified Data.Text.IO as T
import Control.Applicative
import Control.Monad
import Data.Maybe (catMaybes)
import qualified Data.ByteString.Lazy.Char8 as B

data Person = Person
     { name :: Text
     , creditcard :: Text
     } deriving (Show, Generic)

instance FromJSON Person

killTrailingComma :: B.ByteString -> B.ByteString
killTrailingComma bs
                  | comma `B.isSuffixOf` bs = B.reverse . B.tail . B.reverse $ bs
                  | otherwise               = bs
                  where comma = B.singleton ','

theJSON :: IO [B.ByteString]
theJSON = map killTrailingComma <$> B.lines <$> B.readFile "data.json"

toCSVLine :: Person -> Text
toCSVLine (Person name cc) = name `append` "," `append` cc

addHeader :: [Person] -> [Person]
addHeader ps = (Person "name" "creditcard") : ps

makeCSV :: [Person] -> Text
makeCSV = unlines . map toCSVLine . addHeader

todaysDate :: IO String
todaysDate = formatTime defaultTimeLocale "%Y%m%d" <$> getCurrentTime

main :: IO ()
main = do
  people <- catMaybes <$> map decode <$> theJSON :: IO [Person]
  filename <- (++ ".csv") <$> todaysDate
  T.writeFile filename $ makeCSV people 
@vijay-v

This comment has been minimized.

vijay-v commented Apr 25, 2015

Came here to note that this can be done quite fast in R, but https://gist.github.com/mrdwab beat me to it (see 2 comments above).

@sjaeckel

This comment has been minimized.

sjaeckel commented Apr 25, 2015

Using C and https://github.com/netmail-open/wjelement

#include <wjelement.h>
#include <wjreader.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv) {
   FILE *jsonfile;
   WJReader readjson;
   WJElement json;

   if(argc != 2) {
      printf("usage:\n");
      printf("\t%s <json-file>\n", argv[0]);
      return 255;
   }

   if(!(jsonfile = fopen(argv[1], "r"))) {
      fprintf(stderr, "json file not found: '%s'\n", argv[1]);
      return 1;
   }

   if(!(readjson = WJROpenFILEDocument(jsonfile, NULL, 0)) ||
      !(json = WJEOpenDocument(readjson, NULL, NULL, NULL))) {
      fprintf(stderr, "json could not be read.\n");
      return 3;
   }

   printf("Name,Credit Card\n");
   const char *name, *cc;
   WJElement prev[2];
   memset(prev, 0, sizeof(prev));
   while ((name = _WJEString(json, "[].name", WJE_GET, &prev[0], NULL)) != NULL) {
      if ((cc = _WJEString(json, "[].creditcard", WJE_GET, &prev[1], NULL)) != NULL)
         printf("%s,%s\n", name, cc);
   }

   WJECloseDocument(json);
   WJRCloseDocument(readjson);
   fclose(jsonfile);
   return 0;
}
$ gcc -o main main.c -lwjelement -lwjreader
$ ./main data.json > $(date +%Y%m%d).csv

certainly not the shortest solution ;)

@christophejunke

This comment has been minimized.

christophejunke commented Apr 25, 2015

Which date should be used for the ouput file(s)? Today's date the one in the timestamps?

@greggyb

This comment has been minimized.

greggyb commented Apr 25, 2015

Shell one-liner prettied up a bit. Only bog standard utilities.

curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json |
    cut -d, -f1,6 |
    cut -d\" -f4,8 |
    awk -F\" -e 'BEGIN {print "name,creditcard"};
        {if (NF==2) print $1 "," $2}' \
    > 20150425.csv
@WeNDoRx

This comment has been minimized.

WeNDoRx commented Apr 25, 2015

@llama: yep. would do almost the same.

@farzd

This comment has been minimized.

farzd commented Apr 25, 2015

node

var request = require('request');
var fs = require('fs-extra');
var path = require('path');
var dateformat = require('dateformat');

var filename = dateformat('yyyymmdd') + '.csv'
var theCSVdata = '';

request('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json', function (err, res, body) {
    if (!err && res.statusCode == 200) {
        body = JSON.parse(body);
        body.map(function (data) {
            if (data.creditcard) {
                theCSVdata += '"' + data.name + '","' + data.creditcard + '"\n';
            }
        });
        fs.outputFile(path.join(__dirname, filename), theCSVdata, function (err) {
            if (!err) {
                console.log('processing done');
            }
        });
    }
});
@matteosandrin

This comment has been minimized.

matteosandrin commented Apr 25, 2015

Rendering Python unreadable:

import requests,json,time
with open(time.strftime("%Y%m%d.csv"),'w') as f:
    f.write('\n'.join([i['name'].encode('utf-8')+','+i['creditcard'] for i in json.loads(requests.get('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json').text) if i['creditcard'] != None]))
@lpand

This comment has been minimized.

lpand commented Apr 25, 2015

Node ;)

#!/usr/bin/env node --harmony

require('request')(
  'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json',
  (err, resp, body) => {
    console.log('"Name","Credit Card"')
    JSON.parse(body)            
        .filter(u => u.creditcard)
        .forEach(u => console.log('"' + u.name + '","' + u.creditcard + '"'))
  })

./challenge.js > 20150425.csv

Funny, thanks @jorin-vogel!

@notpeter

This comment has been minimized.

notpeter commented Apr 25, 2015

PostgreSQL 9.3 and bash

createdb people
{
echo "CREATE TABLE people (name text, email text, city text, mac text, timestamp timestamptz, creditcard text);"
echo "INSERT INTO people SELECT * FROM json_populate_recordset(null::people, '"
curl --silent https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json |sed 's/\x27//g'
echo "');"
} | psql people

DATES=($(psql people -t -c "select to_char(timestamp, 'YYYYDDMM') as day from people group by day;"))
for DAY in "${DATES[@]}"; do 
  psql people -c "\copy (select name, creditcard from people where creditcard is not null AND to_char(timestamp, 'YYYYDDMM') = '$DAY') TO '$DAY.csv' CSV HEADER"
done

Props to @cjauvin with his Python Solution. All other solutions above fail to create multiple filenames grouped by the date from the JSON. I know the text was ambiguous, but there's fun in a harder challenge.

@j-oe

This comment has been minimized.

j-oe commented Apr 25, 2015

JavaScript in Browser Console (Chrome) on the data URL

var csv=""; 
JSON.parse(document.querySelector('pre').textContent).forEach(function(item){
  if (item.name && item.creditcard){
    csv+=item.name+','+item.creditcard+'\n'
  }
});
@kdrakon

This comment has been minimized.

kdrakon commented Apr 25, 2015

Here's a Scala (2.11.1)/JDK7 solution using Spray for JSON parsing:

import java.util.Date
import java.nio.file.{Paths, Files}
import java.text.SimpleDateFormat

import scala.io.Source
import spray.json._
import DefaultJsonProtocol._

object Challenge extends App {

  val json = Source.fromURL("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")
    .mkString.parseJson.convertTo[JsArray]

  val csv = json.elements.map(e => e.asJsObject.getFields("name", "creditcard")).filter(_(1).toString() != "null").map(_.mkString(",")).mkString("\n")

  val filename = new SimpleDateFormat("yyyyMMdd").format(new Date()).concat(".csv")
  Files.write(Paths.get(filename), csv.getBytes)

}
@localhost

This comment has been minimized.

localhost commented Apr 25, 2015

Io

#!/usr/bin/env io

"name,creditcard" println
Yajl clone parseJson(File clone openForReading(System args at(1)) contents) \
  foreach(person,
    name := person at("name")
    card := person at("creditcard")
    if(card isNil not, "#{name},#{card}" interpolate println))

./cc.io data.json > 20150425.csv

@whoopdedo

This comment has been minimized.

whoopdedo commented Apr 25, 2015

@pittsfield: I guess today's date as it's the time after which to invalidate the CCs. Maybe you could post-date to when the security breach was committed.

I'm seeing a distressing lack of validation going on here. Lua solution that properly quotes the strings (not that there are any in the data dump that need quoting).

local function field(s)
  s = s:gsub('[\r\n]','')
  if s:find('[",]') then
    return '"'..s:gsub('"','""')..'"'
  else
    return s
  end
end
local data = require"json".decode(io.read"*a")
local csv = io.open(os.date"%Y%m%d.csv","w")
csv:write"name,creditcard\r\n"
for _,entry in ipairs(data) do
  if type(entry.creditcard) == 'string' and type(entry.name) == 'string' then
    csv:write(field(entry.name),',',field(entry.creditcard),"\r\n")
  end
end

There's probably a CSV writer I could have used, in penlight I suspect. But I don't have that installed and it's easy enough to do it manually.

@Bfstein

This comment has been minimized.

Bfstein commented Apr 25, 2015

A solution for EXTREMELY nontechnical people

  1. Copy all of the JSON data.

  2. Paste it into this link: http://konklone.io/json/ (I don't know if this is cheating)

  3. Download the file.

3a) If it opens in a new page, just manually save the source as a .csv

  1. Open it in excel.

  2. Click the column of credit cards.

  3. Click the button to sort in descending order.

  4. Scroll down to the point where you run out of credit cards.

  5. Delete everything after that.

No code. So easy, an business major could do it.

(Source: I'm a business major)

@nick-andren

This comment has been minimized.

nick-andren commented Apr 25, 2015

I wrote two solutions in PHP. Here's how I'd normally write it:

<?php

    $content = @json_decode(file_get_contents('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'), true);

    if (empty($content)) {
        echo "Unable to get data from data.json\n\n";
        die;
    }

    $sections = [];

    foreach($content as $entry) {
        // Make sure that we have all of the required fields first
        if (! empty($entry['creditcard']) && ! empty($entry['name']) && ! empty($entry['timestamp'])) {
            // If the timestamp's malformed, skip the entry
            if (preg_match('@^(\d{4})-(\d{2})-(\d{2})\b@', $entry['timestamp'], $match) != 1) {
                continue;
            }

            array_shift($match);
            $sectionName = implode('', $match);

            if (empty($sections[$sectionName])) {
                // Initialize the columns with row names
                $sections[$sectionName] = [[
                    'name'          => 'name',
                    'creditcard'    => 'creditcard'
                ]];
            }

            $sections[$sectionName][] = [
                'name'          => $entry['name'],
                'creditcard'    => $entry['creditcard']
            ];
        }
    }

    foreach($sections as $sectionName => $section) {
        $fp = fopen($sectionName, 'w');

        foreach($section as $parts) {
            fputcsv($fp, $parts);
        }

        fclose($fp);
    }

    echo "Wrote to ", count($sections), " file(s)\n\n";

Now let's take a less boring approach:

<?php foreach(explode("\n", file_get_contents('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json')) as $line)
    if (preg_match('@\{"name":"([^"]+)"[^}]+?"timestamp":"(\d{4}-\d{2}-\d{2})[^}]+?"creditcard":"([^"]+)@', $line, $match) == 1)
        file_put_contents($filename = str_replace('-', '', $match[2]), (file_exists($filename) ? '' : "name,creditcard\n") . $match[1] . ',' . $match[3] . "\n", FILE_APPEND);
@kissgyorgy

This comment has been minimized.

kissgyorgy commented Apr 25, 2015

Python imperative style:

import requests
import datetime

URI = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'
filename = datetime.datetime.now().strftime('%Y%m%d.csv')

with open(filename, 'w') as f:
    f.write('Name,Credit Card\n')
    for customer in requests.get(URI).json():
        if customer['creditcard']:
            f.write('%(name)s,%(creditcard)s\n' % customer)

functional:

import requests
import datetime

URI = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'
filename = datetime.datetime.now().strftime('%Y%m%d.csv')

json_data = requests.get(URI).json()
affected_customers = (cust for cust in json_data if cust['creditcard'])
csv_lines = ('%(name)s,%(creditcard)s' % cust for cust in affected_customers)

with open(filename, 'w') as f:
    f.write('Name,Credit Card\n')
    f.write('\n'.join(csv_lines))
@vpascal

This comment has been minimized.

vpascal commented Apr 25, 2015

Doing it in R

library(jsonlite)

data<-fromJSON("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")

data<-data[,c("name","creditcard")]
names(data)<-c("Name", "Credit Card")
write.csv(data,paste(format(Sys.Date(), format="%Y%m%d"),"csv",sep = "."),row.names = FALSE)

@nickadam

This comment has been minimized.

nickadam commented Apr 25, 2015

curl -s https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | \
php -r '
$f=fopen("php://stdin", "r");
$d=""; $fo=fopen("php://stdout", "w");
fputcsv($fo, array("Name", "Credit Card"));
while(!feof($f)){$d.=fgets($f);}
$d=json_decode($d);
foreach($d as $e){
  if(!empty($e->creditcard)){
    fputcsv($fo, array($e->name, $e->creditcard));
  }
}' > "$(date +%Y%m%d)".csv
@shellyjang

This comment has been minimized.

shellyjang commented Apr 25, 2015

shell one-liner combination of awk, grep & sed

curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | grep -v "null" |awk -F ":" 'BEGIN {print "name,creditcard"} {if(NF>2) { split($2, name, ","); print name[1] "," $NF;} }' | sed  -e 's:\"::g' -e 's:\},::g' > 20150425.csv
@sumanau7

This comment has been minimized.

sumanau7 commented Apr 25, 2015

Sweet Python

import requests
import csv
import json

data = requests.get('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json')

with open('20150425.csv','w') as out:
    writer=csv.writer(out,delimiter=',')
    for i in json.loads(data.text):
        if not i['creditcard'] is None:
            writer.writerow([i['name'], i['creditcard']])
@JuanMaCuevas

This comment has been minimized.

JuanMaCuevas commented Apr 25, 2015

Less than two minutes with Sublime Text 2:

    $ curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json > data.json
    $ open data.json  # opens on "Sublime text 2"

delete first line

replace text (using regular expresion) with goal
[ "Name","Credit Card" add CSV column names
.*e": empty remove first useless chunk
"email".*card": empty remove second chunk
} empty remove last chunk
.*null,\n empty remove entries without Credit Card
\n ,\n fix last line

save as 20150425.csv
upload!

I love the instant regexp highlights on Sublime
ps: I'd go with a python script when the file is too much for Sublime

@iMerica

This comment has been minimized.

iMerica commented Apr 25, 2015

Here is an actual one-liner (no && or ;) using curl, jq for parsing and process substitution for adding the header.

cat <(echo 'Name, Card Number') <(curl -fsSL http://git.io/vfi61 | jq -r '.[] | select(.creditcard) | [.name, .creditcard] | @csv') > $(date +'%Y%m%d').csv
@phil-n

This comment has been minimized.

phil-n commented Apr 25, 2015

LinqPad using C# and Json.NET Nuget package:

    var client = new WebClient();
    string json = client.DownloadString("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json");
    var sample = new [] { new{name="", creditcard=""}};
    var leaks = JsonConvert.DeserializeAnonymousType(json, sample)
                           .Where(e => !string.IsNullOrEmpty(e.name)
                                    && !string.IsNullOrEmpty(e.creditcard));

    string csv = string.Join("\n", leaks.Select(l => l.name + ";" +l.creditcard));
    File.WriteAllText(@"D:\" + DateTime.Today.ToString("yyyyMMdd") + ".csv", csv);
@ogkennyhobo

This comment has been minimized.

ogkennyhobo commented Apr 25, 2015

In Julia (using elements from Python, of course!):

using JSON

jsonhttp = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
x = JSON.parsefile(download(jsonhttp))

f = open(strftime("%Y%m%d",time())*".csv","w")
write(f,"name,creditcard\r\n")

for xel in x
    if !isa(xel["creditcard"],Nothing)
        write(f,xel["name"]*","*xel["creditcard"]*"\r\n")
    end
end

close(f)
@JuanMaCuevas

This comment has been minimized.

JuanMaCuevas commented Apr 25, 2015

by the way, I see only 4 credit cards there:

1211-1221-1234-2201
1234-2121-1221-1211
1228-1221-1221-1431

Someone is lying...

@nburger

This comment has been minimized.

nburger commented Apr 25, 2015

Here's a jq + awk one-liner which doesn't assume all records occur on the same date, and writes the appropiate CSV header for each output file:

jq -r '.[]|select(.creditcard)|[.timestamp,.name,.creditcard]|@csv'<data.json | awk -F, '{split($1,a,"[-\" ]");f=a[2]a[3]a[4]".csv";c="(echo name,creditcard;cat -)>>"f;gsub(/"/,"",$2);gsub(/"/,"",$3);print $2","$3|c}'
@rvanmil

This comment has been minimized.

rvanmil commented Apr 25, 2015

My solution using Swift:

import Foundation

let req = NSURLRequest(URL: NSURL(string: "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")!)
let data = NSURLConnection.sendSynchronousRequest(req, returningResponse: nil, error: nil)!
let json = NSJSONSerialization.JSONObjectWithData(data, options: nil, error: nil) as! Array<NSDictionary>
var out = "name,creditcard\n"
for obj in json {
    if let name = obj["name"] as? String, cc = obj["creditcard"] as? String {
        out += "\(name),\(cc)\n"
    }
}
let df = NSDateFormatter()
df.dateFormat = "yyyyMMdd"
let file = (NSFileManager.defaultManager().URLsForDirectory(NSSearchPathDirectory.DocumentDirectory, inDomains: NSSearchPathDomainMask.UserDomainMask).last as! NSURL).URLByAppendingPathComponent("\(df.stringFromDate(NSDate())).csv")
out.writeToURL(file, atomically: true, encoding: NSUTF8StringEncoding, error: nil)
@notduncansmith

This comment has been minimized.

notduncansmith commented Apr 25, 2015

Under 2m in CoffeeScript:

require('request') 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json', (err, res, body) ->
  JSON.parse body
  .map (d) ->
    if d.creditcard? and d.creditcard.length
      require('fs').appendFileSync '20150425.csv', "#{d.name},#{d.creditcard}\n"

Nasty code, but hey, it's an emergency :)

@nzroller

This comment has been minimized.

nzroller commented Apr 25, 2015

Grouping by date of timestamp in a "one-liner" using cat, xargs, grep, sed, uniq:

cat data.json \
| grep timestamp \
| sed 's/.*timestamp\":\"\([^ "]*\).*/\1/1' \
| sed 's/-//g' \
| uniq \
| xargs -i sh -c 'cat data.json \
 | grep -v creditcard\":null \
 | grep creditcard \
 | sed "s/.*name\":\(\"[^\"]*\"\),.*,\"creditcard\":\(\"[^\"]*\"\)},*$/\1,\2/g" \
 | sed "1i \"Name\",\"Credit\ Card\"" > {}.csv'

It's long but it was sort-of built up iteratively (check result, tack on another command). I've never used the sed 1i to prepend a line before. I added the grouping by timestamp part after reading the comments but thought I'd give it a shot. I usually get into all sorts of weirdness when using xargs sh and tend to give up and break it up.

@l1x

This comment has been minimized.

l1x commented Apr 25, 2015

I guess if I was in a hurry:

(def url
  "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")

(def hm-of-json
  (json/read-json (slurp url)))

(def has-cc
  (map
    #(select-keys % [:name :creditcard])
       (filter #(not (= nil (:creditcard %))) hm-of-json)))

(defn write-csv
  [path row-data]
  (let [columns [:name :creditcard]
        headers (map name columns)
        rows (mapv #(mapv % columns) row-data)]
    (with-open [file (io/writer path)]
      (csv/write-csv file (cons headers rows)))))

(defn -main []
  (let [ now (t/now)
         year (t/year now)
         month (t/month now)
         day (t/day now) ]
    (write-csv (str year month day ".csv") has-cc)))
@sarme

This comment has been minimized.

sarme commented Apr 25, 2015

How about SQL? This is specific to MS SQL Server, btw...

Make a text file called challenge.sql with this in it:

SET NOCOUNT ON;

DECLARE @xmldata XML;

SELECT @xmldata = CAST(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(BulkColumn, ':null', ':""'), '[', '<root>'), ']', '</root>'), '{"', '<row '), '"},', '"></row>'), '"}', '"></row>'), '":"', '="'), '","', '" ') AS XML)
FROM OPENROWSET(BULK N'C:\temp\data.json.txt', SINGLE_CLOB) AS jsondata;

SELECT QUOTENAME('Name', '"'), QUOTENAME('Credit Card', '"')
UNION ALL
SELECT QUOTENAME(xmldata.r.value('@name', 'varchar(max)'), '"') AS 'Name', QUOTENAME(xmldata.r.value('@creditcard', 'varchar(max)'), '"') AS 'Credit Card' FROM @xmldata.nodes('/root/row') AS xmldata(r);

Then run this on the command line:

sqlcmd -i challenge.sql -o20150425.csv -h-1 -s"," -W -I

Ta-da, easy peasy.

@aficionado

This comment has been minimized.

aficionado commented Apr 25, 2015

A cloud-based approach using BigML.

#!/usr/bin/env python

import time
from bigml.api import BigML

CSVFILE = 'https://gist.githubusercontent.com/jorin-vogel/' \
    '7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/' \
    'data.json'

FILTER = {
    "all_fields": False,
    "new_fields": [
        {"name": "Name", "field": ["f", "name"]},
        {"name": "Credit Card", "field": ["f", "creditcard"]}],
    "output_json_filter": [
        "and", ["not", ["missing?", "Name"]],
               ["not", ["missing?", "Credit Card"]]]}

OUTPUT_NAME = time.strftime("%Y%m%d.csv")

API = BigML()

SOURCE = API.create_source(CSVFILE, {'name': OUTPUT_NAME})
API.ok(SOURCE)

DATASET = API.create_dataset(SOURCE)
API.ok(DATASET)

DATASET = API.create_dataset(DATASET, FILTER)
API.ok(DATASET)

API.download_dataset(DATASET, filename=OUTPUT_NAME)
@silverwind

This comment has been minimized.

silverwind commented Apr 25, 2015

another iojs one:

#!/usr/bin/env iojs

require("got")("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json",
  function(_, data) {
    require("fs").writeFileSync(require("moment")().format("YYYYMMDD") + ".csv",
        "Name,Credit Card\n" +
    JSON.parse(data).filter(function (e) {
      return e.creditcard;
    }).map(function (e) {
      return `${e.name},${e.creditcard}`;
    }).join("\n"));
  }
);
@lgarron

This comment has been minimized.

lgarron commented Apr 25, 2015

jq is probably the best way to do this, and Python can be interactive, but for me nothing beats Mathematica for inspecting and transforming some data.

data = Import["~/Downloads/data.json"];
nameAndCard = Map[{"name", "creditcard"} /. # &, data]

Expand nameAndCard to see that empty credit cards are Null.

Select[nameAndCard, Last[#] != "Null" &]

That didn't work. Was it imported as "Null"?

Select[nameAndCard, Last[#] != "Null" &]

Yes!

Export["~/Downloads/20150425.csv", %]

(I have a shortcut to put the current date on the clipboard.)

@tonyblundell

This comment has been minimized.

tonyblundell commented Apr 25, 2015

I'd use Python. It can easily be done with shell tools, but that's ugly.

I think this would be the quickest human-friendly way...

challenge.py:

import requests
import sys
request = requests.get(sys.argv[1])
for row in request.json():
    if row['creditcard']:
        print '{0},{1}'.format(row['name'], row['creditcard'])
python challenge.py http://example.com > yyyymmdd.csv
@jhnphm

This comment has been minimized.

jhnphm commented Apr 25, 2015

Vim:

:%g/"creditcard":null/d<CR>
1Gdd
Gdd
%s#^.\{-}"name":"\(.\{-}\)".\{-}"creditcard":"\(.\{-}\)".*$#"\1","\2"#g
@devuo

This comment has been minimized.

devuo commented Apr 25, 2015

PHP:

<?php

$url = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json";
$json = file_get_contents($url);
$data = json_decode($json);
$filename = date('Ymd') . '.csv';
$file = fopen($filename, 'w');

fputcsv($file, array('Name', 'Credit Card'));

foreach ($data as $item) {
    if (empty($item->creditcard)) continue;
    fputcsv($file, array($item->name, $item->creditcard));
}
@bl4ckm0r3

This comment has been minimized.

bl4ckm0r3 commented Apr 25, 2015

I like javascript :)

fetch("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json").then(
(data) => {
  data.json().then(
    (jsondata) => {
      let anchor = document.createElement('a'),
          today = new Date();
      anchor.href='data:text/csv;charset=utf-8,name,creditcard';
      anchor.download=today.getFullYear() + "" + today.getMonth() + "" + today.getDate() +".csv";
      // I like forEach better, but this is faster!
      for(let i=0, l=jsondata.length; i<l; i++) {
        if(jsondata[i].creditcard) {
          anchor.href += escape(",\n" + jsondata[i].name +"," + jsondata[i].creditcard);
        }
      }
      anchor.click();
    });
});
@yowmamasita

This comment has been minimized.

@bhargav2785

This comment has been minimized.

bhargav2785 commented Apr 25, 2015

I like command line (unix). Its dirty but blazing fast. Couple things it offers

  1. no need to download file to disk
  2. unix is mostly available on all machines
  3. short and sweet
  4. one liner 😊
  5. takes ~30s to have your data ready in your required format since you found out about the emergency 😄

echo "Name,Credit Card" > date "+%Y%m%d".csv && curl 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json' | sed 's/"email.*creditcard"//g' | sed 's/"name"//g' | tr -d '{}[]:"' | sed 's/,$//g' | sed '/^$/d' >> date "+%Y%m%d".csv

@jodiecunningham

This comment has been minimized.

jodiecunningham commented Apr 25, 2015

curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json| awk -F"[:,]" 'BEGIN{OFS=",";print "Name,Credit Card"} $19~/([[:digit:]]{4}-){3}[[:digit:]]{4}/{ gsub(/"/,"",$2);gsub(/"|}/,"",$19); print $2,$19}' > $(date +%Y%m%d).CSV
@aisipos

This comment has been minimized.

aisipos commented Apr 25, 2015

Ideally, use the tool best suited for the job. jq is great for json, csvkit for csv. There are great solutions here with jq, let's try with csvkit. With Python installed, install it via:
pip install csvkit

Specify the URL:

URL=https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json`

Then csvkit can do the work in 3 steps:

curl -s $URL | in2csv -f json | csvgrep -c name,creditcard -r "." | csvcut -c name,creditcard > date +%Y%m%d.csv

@takeyourhatoff

This comment has been minimized.

takeyourhatoff commented Apr 25, 2015

jq is super useful. A bash one liner is the only way to go for this.

$ (echo '"Name","Credit Card"'; wget -qO- \
  https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json \
  | jq -r '.[] | select(.creditcard) | [.name, .creditcard] | @csv') > $(date +%Y%m%d).csv
@schoukri

This comment has been minimized.

schoukri commented Apr 26, 2015

Here is another version in Perl.

#!/usr/bin/perl

use strict;
use warnings;

use LWP::UserAgent;
use Text::CSV;
use DateTime;
use JSON;

my $url = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json';

my $ua = LWP::UserAgent->new(
   ssl_opts => { verify_hostname => 0 },
);

my $response = $ua->get($url);

unless ($response->is_success) {
  die "can't get url: " . $response->status_line;
}

my $data = decode_json( $response->decoded_content ) or die "can't decode json";

my $csv = Text::CSV->new({eol => "\n"});

my $file = DateTime->now->ymd('') . '.CSV';

open my $fh, '>', $file or die "can't open file: $file: $!";

$csv->print($fh, ['Name', 'Credit Card']);

$csv->column_names(qw(name creditcard));

foreach my $person (@$data) {
  $csv->print_hr($fh, $person) if $person->{name} && $person->{creditcard};
}

close $fh;
@mdjem

This comment has been minimized.

mdjem commented Apr 26, 2015

This is how I would do it with vim, until I learn more about more complex uses of regular expressions:

:o https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json <CR>  # open file 
:v/card":"/d            # remove all non valid lines, the cursor is now on last line
ctrl-vggf:ld             # switch to visual-block mode, go to first line, move right to ':', then one more character to the right and delete 
:%s/".*card":"/,/     # replace everything from '"'   to  'card":" ' with  ','
:%s/".*//                # remove end of lines
gg                        # back to top of file
O                         # go to insert mode one line above
name,creditcard  # write column names
ESC                     # back to normal mode
:w 20150425.csv   # write to disk 
@ldgarcia

This comment has been minimized.

ldgarcia commented Apr 26, 2015

My solution using Python.
Since in this particular JSON file the name key always precedes the creditcard key, I decided to use a regular expression instead of parsing the JSON file:

import re
import csv
from datetime import datetime
import sys

with open('{}.csv'.format(datetime.now().strftime('%Y%m%d')), 'w') as output_file:
    csvwriter = csv.writer(output_file)
    csvwriter.writerow(('name','creditcard'))
    for row in re.findall('"name":"([^"]+)".+"creditcard":"([^"]+)"', sys.stdin.read()):
        csvwriter.writerow(row)

Then,

curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | python solution.py
@centur

This comment has been minimized.

centur commented Apr 26, 2015

LINQPad + Newtonsoft.Json nuget package

var url = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json";
using(var client = new WebClient())
{
    var output = JArray.Parse(client.DownloadString(url))
        .Where(r=>!String.IsNullOrWhiteSpace(r["creditcard"].ToString()))
        .Select(f=>String.Format("{0},{1}", f["name"], f["creditcard"]));

    File.WriteAllLines(@"C:\Projects\"+DateTime.Now.ToString("yyyyMMdd")+".csv",new[]{"Name,Credit Card"}.Union(output));
}
@ronw23

This comment has been minimized.

ronw23 commented Apr 26, 2015

After reading about Q language thanks to a link in /r/programming, I came up with this:

v:.j.k raze system"curl -s https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
hsym[`$raze (string .z.D)[where "." <>/:string .z.D],".CSV"] 0: (csv 0:select name, creditcard from v[where -9 <>/:type each v.creditcard])
@gngrwzrd

This comment has been minimized.

gngrwzrd commented Apr 26, 2015

Objective-C with a little C

#import <Foundation/Foundation.h>
#include <stdio.h>
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSString * path = @"https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json";
        NSURL * url = [NSURL URLWithString:path];
        NSData * data = [NSData dataWithContentsOfURL:url];
        NSArray * json = [NSJSONSerialization JSONObjectWithData:data options:0 error:nil];
        NSDateFormatter * formatter = [[NSDateFormatter alloc] init];
        formatter.dateFormat = @"yyMMdd";
        NSString * filename = [[formatter stringFromDate:[NSDate date]] stringByAppendingString:@".csv"];
        NSURL * filepath = [NSURL fileURLWithPath:[NSString stringWithFormat:@"%@/%@",NSHomeDirectory(),filename]];
        FILE * handle = fopen([filepath.path UTF8String],"w");
        fwrite("Name,Credit Card\n",1,17,handle);
        for(NSDictionary * entry in json) {
            NSString * line = [NSString stringWithFormat:@"%@,%@\n",entry[@"name"], entry[@"creditcard"] == [NSNull null] ? @"" : entry[@"creditcard"]];
            fwrite([line UTF8String],1,line.length,handle);
        }
        fclose(handle);
    }
    return 0;
}
@marcmo

This comment has been minimized.

marcmo commented Apr 26, 2015

just for fun and 'cause I'm just learning go:

package main

import (
    "encoding/json"
    "net/http"
    "os"
    "time"
)

type Record struct {
    Name       string
    Creditcard string
}

func main() {
    resp, _ := http.Get("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")
    defer resp.Body.Close()
    entries := make([]Record, 0)
    json.NewDecoder(resp.Body).Decode(&entries)
    f, _ := os.Create(time.Now().Format("20060102") + ".csv")
    defer f.Close()
    f.WriteString("name,creditcard\n")
    for _, elem := range entries {
        if elem.Creditcard != "" {
            f.WriteString(elem.Name + "," + elem.Creditcard + "\n")
        }
    }
}

all error-checking omitted ;)

@auriza

This comment has been minimized.

auriza commented Apr 26, 2015

Shell script pipelines.

wget "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
echo "name,creditcard" > "$(date +%Y%m%d).CSV"
cat "data.json" | cut -f 1,6 -d ',' | grep -v 'null' | cut -f 4,8 -d '"' | tr '"' ',' | sed '1d;$d' >> "$(date +%Y%m%d).CSV"
@derekschrock

This comment has been minimized.

derekschrock commented Apr 26, 2015

awk -vf="$(date +%Y%d%m).csv" -vOFS=',' -F'[{}:,]'  \
 'BEGIN { print "\042Name\042","\042Credit Card\042" > f } \
   $20 && $20 != "null" { print $3, $20 >> f }'
@simongarland

This comment has been minimized.

simongarland commented Apr 26, 2015

another q language version:

($except[string .z.d;"."],".csv")0:","0:(Name,$"Credit Card")xcol select name,creditcard from(.j.k raze read0:data.json)where 10=type each creditcard

@jdarren

This comment has been minimized.

jdarren commented Apr 26, 2015

supports the data having multiple dates....

#!/usr/bin/env node

var fs   = require('fs'),
    _    = require('lodash'),
    data = JSON.parse(fs.readFileSync('./data.json', 'utf-8')),
    validData = _.filter( data, function(obj) { return !!obj.creditcard; } ),
    mapByDate = {};

_.forEach( validData, function(obj) {
    var date      = obj.timestamp.substring(0,10).replace(/\-/g,""),
        dateEntry = mapByDate[date];

    if ( dateEntry ) {
        dateEntry.push({name: obj.name, creditcard: obj.creditcard});
    }
    else {
        dateEntry = [{name: obj.name, creditcard: obj.creditcard}];
        mapByDate[date] = dateEntry;
    }
});

_.forIn( mapByDate, function(entries,key) {
    fs.writeFileSync( key + '.csv', _.map( entries, function(obj) {
        return obj.name + ',' + obj.creditcard;
    }).join('\n') );
});
@archer884

This comment has been minimized.

archer884 commented Apr 26, 2015

Would honestly have used powershell for this, but I thought, "Why not waste time writing it in Rust instead?"

extern crate rustc_serialize;
use rustc_serialize::json;
use std::fs::File;
use std::io::{ BufRead, BufReader };
use std::path::Path;

#[derive(Debug, RustcDecodable)]
struct CC {
    name: String,
    creditcard: String,
}

fn main() {
    let file = BufReader::new(File::open(&Path::new(&std::env::args().nth(1)
        .expect("Input file required"))).ok()
        .expect("Unable to open input"));

    let trim = &[','][..];
    let at_risk = file.lines().filter_map(|l| json::decode::<CC>(&l.unwrap().trim().trim_matches(trim)).ok());

    for item in at_risk {
        println!("{},{}", item.name, item.creditcard);
    }
}
@tadaskay

This comment has been minimized.

tadaskay commented Apr 26, 2015

groovy with Java8
Supports multiple dates with different timezones

def url = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json​​​​​​​​​​​​​​​'
def json = new JsonSlurper().parse(url.toURL())
def localDate = { what -> LocalDate.parse(what.timestamp, ofPattern('yyyy-MM-dd HH:mm:ss Z')) }

json.findAll { it.creditcard }
    .groupBy(localDate)
    .each { date, entries ->
        def data = entries.collect { "\"$it.name\",$it.creditcard" }.join('\n')
        new File(date.format(ofPattern('yyyyMMdd')) + '.csv') << 'Name,Credit Card\n' + data
    }
@ahultgren

This comment has been minimized.

ahultgren commented Apr 26, 2015

A functional and readable one-liner in javascript (es6):

console.log(JSON.parse(process.argv[2]).filter(entry => entry.creditcard).map(entry => `${entry.name},${entry.creditcard}`).join('\n'));

Used like bable node script.js '[json...]' > 20150425.csv.

@antbbn

This comment has been minimized.

antbbn commented Apr 26, 2015

My try in bash abusing cut(1) with grouping by time just in case:

IFS=$'\n'
for i in $(curl whatever | cut -sd "," -f 1,5,6 | grep -v null )
do
    NAME=$(echo $i | cut -d '"' -f 4)
    TIME=$(echo $i | cut -d '"' -f 8 | cut -d ' ' -f 1 | tr -d '-') 
    CC=$(echo $i | cut -d '"' -f 12)
    echo $NAME,$CC >> $TIME.csv
done 
@ironhouzi

This comment has been minimized.

ironhouzi commented Apr 26, 2015

In vim, incsearch helps me see the match while I build the regex search:

/^.\{-}name":"\(.\{-}\)".\{-}email":"*\(.\{-}\)"*,.*$.

Then I do a substitute:

%s//\1, \2/

Remove brackets and save file:

ggddG.:wYYYMMDD.json.

If it needs to be perfomed over multiple files, I'd put this in a macro with the bufdo command.

@Soreil

This comment has been minimized.

Soreil commented Apr 26, 2015

In Vim using just Macros. it takes some running time but it is a write-as-you-think kind of way of treating a string, just record your thought process. Feels very natural to me.

"r 0v/:^Mld/,^Mlv/cred^Mhdv/:^Mdx/}^Mv$dj
:%norm! @r
"n 0/null^Mdd
:%norm! @n
G
dd

@ffwng

This comment has been minimized.

ffwng commented Apr 26, 2015

Another solution in Haskell:

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson
import Data.Monoid
import Data.Maybe
import Data.Time.Clock
import Data.Time.Format
import qualified Data.Text as T
import qualified Data.Text.IO as T
import qualified Data.ByteString.Lazy as B

data Person = Person T.Text (Maybe T.Text)

instance FromJSON Person where
  parseJSON = withObject "person" $ \o ->
    Person <$> o .: "name" <*> o .: "creditcard"

toLine :: Person -> Maybe T.Text
toLine (Person n mc) = fmap (\c -> n <> "," <> c) mc

main :: IO ()
main = do
  persons <- fromMaybe (error "invalid json") . decode <$> B.readFile "data.json"
  now <- getCurrentTime
  let fn = formatTime defaultTimeLocale "%Y%m%d" now <> ".csv"
      output = T.unlines $ "name,creditcard" : mapMaybe toLine persons
  T.writeFile fn output
@sazzer

This comment has been minimized.

sazzer commented Apr 26, 2015

The input is JSON, so the easiest solution would be something using Node. You can read the input file natively, process it easily enough and just spit out the CSV formatted string to the correct file. All of that with no modules needed at all.

I've not written any code for this yet, but I personally would do it with something very similar to this

console.log('Name,Credit Card');
require('input.json').filter(function(row) {
    return (row.name !== undefined && row.creditcard !== undefined);
}).forEach(function(row) {
    console.log(row.name + ',' + row.creditcard);
});

And then just use shell redirection to pipe it into the correctly named file. (Note - the only reason I'd do it that way is because I wrote all of the above here and now without looking at documentation, and I can never remember how to do File IO in Node without looking it up. The challenge said we don't have much time to act, so even though it's less elegant to do it that way, it is getting the job done faster)

@MatthewHannigan

This comment has been minimized.

MatthewHannigan commented Apr 26, 2015

Assuming it's been downloaded (e.g. with wget) or perhaps piped to the following

perl -ne  'BEGIN {print "\"Name\",\"Credit Card\"\n"}; 
m/"name":(".*?").*"creditcard":("\d{4}(-\d{4}){3}")/ && print "$1,$2\n"
'  data.json > $(date +%Y%m%d).csv

Pretty similar to other's perl solutions. Could do something quite like this in Python too.

@jonpojonpo

This comment has been minimized.

jonpojonpo commented Apr 26, 2015

All these answers are far too wordy for me

perl -ne'print if s/^.*?:"([^"]+).*"([\d\-]+)...?$/$1,$2/' < data.json > `date +'%Y%m%d'`.CSV
@TheKnarf

This comment has been minimized.

TheKnarf commented Apr 26, 2015

I'm a big fan of php.

<?php
    $csv_file_path = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json";
    $json = json_decode(file_get_contents($csv_file_path));

    foreach($json as $person)
        if(!empty($person->name) && !empty($person->creditcard))
            echo $person->name . ',' . $person->creditcard . "\n";
@ceautery

This comment has been minimized.

ceautery commented Apr 26, 2015

I used JavaScript, similar to the previous JS solutions, as the main workhorse. Starting on this page, I copied the link, then opened Chrome's dev console, entering the following:

a = $.getJSON("https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")

(wait for AJAX call to complete, then continue...)

b = a.responseJSON.filter(function(row){return row.name && row.creditcard}).map(function(row){return row.name + ',' + row.creditcard})
copy(b.join('\n'))

Then the main CSV content is in my clipboard. After that, open Sublime, type the header row, then paste my clipboard, save as CSV.

@adbr

This comment has been minimized.

adbr commented Apr 26, 2015

Another solution in Go:

package main

import (
    "encoding/csv"
    "encoding/json"
    "fmt"
    "io"
    "net/http"
    "os"
    "time"
)

const url = `https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json`

type record struct {
    Name       string "name"
    Creditcard string "creditcard"
}

func main() {
    resp, err := http.Get(url)
    if err != nil {
            fmt.Fprintln(os.Stderr, err)
            return
    }
    defer resp.Body.Close()

    dec := json.NewDecoder(resp.Body)

    fname := time.Now().Format("20060102") + ".csv"
    file, err := os.Create(fname)
    if err != nil {
            fmt.Fprintln(os.Stderr, err)
            return
    }
    defer file.Close()

    w := csv.NewWriter(file)
    defer w.Flush()

    var rs []record
    for {
            err := dec.Decode(&rs)
            if err == io.EOF {
                    break
            }
            if err != nil {
                    fmt.Fprintln(os.Stderr, err)
                    return
            }

            for _, r := range rs {
                    if r.Creditcard == "" {
                            continue
                    }
                    rec := []string{r.Name, r.Creditcard}
                    err := w.Write(rec)
                    if err != nil {
                            fmt.Fprintln(os.Stderr, err)
                            return
                    }
            }
    }
}
@strelok1

This comment has been minimized.

strelok1 commented Apr 26, 2015

Simple groovy.

import groovy.json.JsonSlurper

def result = new JsonSlurper().parse(new URL('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json').openStream())
def out = new File("${new Date().format('yyyyMMdd')}.csv")
result.each {
    if (it.creditcard) {
        out << "${it.name},${it.creditcard}${System.getProperty('line.separator')}"
    }
}
@graiz

This comment has been minimized.

graiz commented Apr 26, 2015

Code isn't always the answer.
Change the given URL to end in .CSV. The server does the right thing.
Open in excel or numbers and remove the columns you don't need. Search replace on the keywords that are extra and you are done.

@jasonm23

This comment has been minimized.

jasonm23 commented Apr 26, 2015

Yep, Excel / Google Sheet / OpenOffice spreadsheet can all get this done in a few mins.

@brnls

This comment has been minimized.

brnls commented Apr 26, 2015

In F#, using the JSON Type provider.

open FSharp.Data;
open System.IO;
open System;

type data = JsonProvider<"https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json">

[| yield "Name,CreditCard"
   yield! 
      data.GetSamples()
      |> Seq.filter(fun x -> x.Creditcard.IsSome)
      |> Seq.map(fun x -> String.Format("{0},{1}", x.Name, x.Creditcard.Value)) |]
|> (fun csvLines -> File.WriteAllLines(DateTime.Today.ToString("yyyyMMdd") + ".csv", csvLines))

@dhruvasagar

This comment has been minimized.

dhruvasagar commented Apr 26, 2015

Using http://stedolan.github.io/jq/ :

jq '.[] | select(.creditcard) | [.name, .creditcard] | join(",")' data.json > `date +'%Y%m%d'`.csv
@Gabriel439

This comment has been minimized.

Gabriel439 commented Apr 26, 2015

Haskell, using maps instead of structured records (to be more analogous to the original Ruby solution):

{-# LANGUAGE OverloadedLists   #-}
{-# LANGUAGE OverloadedStrings #-}

import Control.Lens (view)
import Data.Aeson (decode)
import Data.ByteString (ByteString)
import qualified Data.ByteString.Lazy as Bytes
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Csv (encodeByName)
import Data.Text (Text)
import Data.Text.Encoding (encodeUtf8)
import Data.Time (getCurrentTime, formatTime)
import Network.Wreq (get, responseBody)
import System.Locale (defaultTimeLocale)

uri = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"

main = do
    -- Download and decode JSON from the given URI
    response <- get uri
    let Just records = decode (view responseBody response)

    -- CSV library only accepts bytes for headers, not text
    let toUtf8 :: [Map Text (Maybe Text)] -> [Map ByteString (Maybe Text)]
        toUtf8 = map (Map.mapKeys encodeUtf8)

    -- CSV-encode the result and write out to a timestamped file
    time <- getCurrentTime
    let file = formatTime defaultTimeLocale "%Y%m%d.csv" time
    Bytes.writeFile file (encodeByName ["name", "creditcard"] (toUtf8 records))

Also, this actually downloads from a URI instead of reading from a local file.

@ralt

This comment has been minimized.

ralt commented Apr 26, 2015

Common Lisp

#!/usr/bin/sbcl --script

(load "~/.sbclrc")

(ql:quickload :drakma)
(ql:quickload :jsown)
(ql:quickload :local-time)

(defvar *url* "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json")

(defun generate-filename ()
  (concatenate
   'string
   (local-time:format-timestring nil (local-time:now)
                                 :format '(:year (:month 2) (:day 2)))
   ".csv"))

(let ((data (jsown:parse (drakma:http-request *url*))))
  (with-open-file (f (generate-filename) :direction :output :if-does-not-exist :create)
    (format f "name,creditcard~%")
    (dolist (line data)
      (format f "~A,~A~%"
                   (jsown:val line "name")
                   (jsown:val line "creditcard")))))
@smihir

This comment has been minimized.

smihir commented Apr 26, 2015

another way in R:

library('jsonlite')
url <- 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json'
cvec <- c('name', 'creditcard')
jdata <- fromJSON(url)[, cvec]
jdata <- jdata[complete.cases(jdata),]
colnames(jdata) <- c('Name', 'Credit Card')
write.csv(jdata, format(Sys.Date(), "%Y%m%d.csv"), row.names = FALSE)
@rendall

This comment has been minimized.

rendall commented Apr 26, 2015

I cut and pasted this into the javascript console of an 'about:blank' in chrome. It worked even though I got some syntax 'end of line' errors in the data.

var xmlhttp = new XMLHttpRequest();
xmlhttp.onreadystatechange = function() {

var allData = JSON.parse(xmlhttp.response);

for (var i = 0; i < allData.length; i++){

    var entry = allData[i];
    if (entry.creditcard == null) continue;
    var elemDiv = document.createElement('div');
    elemDiv.innerText = entry.name +"," + entry.creditcard;
    document.body.appendChild(elemDiv);

}
    }
xmlhttp.open("GET", "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json", true);
    xmlhttp.send();
@itamarhaber

This comment has been minimized.

itamarhaber commented Apr 26, 2015

For this you must use Redis' Lua scripting and Sorted Sets to get it sorted by name of course (as well as wget & date):

wget -qO- https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | redis-cli -x set j; redis-cli --raw EVAL 'redis.call("DEL", KEYS[2]) redis.call("ZADD", KEYS[2], 0, "Name, Credit Card") for i,l in pairs(cjson.decode(redis.call("GET", KEYS[1]))) do local cc = l["creditcard"] if type(cc) == "string" then redis.call("ZADD", KEYS[2], 1, l["name"] .. ", " .. cc) end end return(redis.call("ZRANGEBYSCORE", KEYS[2], 0, 1))' 2 j o > $(date +%Y%m%d).csv
@tel

This comment has been minimized.

tel commented Apr 26, 2015

I quite like Gabriel's. Here's an example which uses a different kind of trick—lenses as opposed to very general automatic JSON decoding.

{-# LANGUAGE OverloadedStrings #-}

import Control.Lens
import Data.Aeson
import Data.Aeson.Lens
import Data.Monoid
import System.Locale (defaultTimeLocale)
import qualified Data.ByteString.Lazy as Sl
import qualified Data.Text as Tx
import qualified Data.Text.Encoding as TxE
import qualified Data.Time as Time
import qualified Network.Wreq as Wreq

-- "name" and "creditcard"
type Name = Tx.Text
type CC   = Tx.Text

relevant :: Value -> Maybe (Name, CC)
relevant v = do
  name <- v ^? key "name"       . _String
  cc   <- v ^? key "creditcard" . _String
  return (name, cc)

getData :: IO [(Name, CC)]
getData = do
  resp <- Wreq.get "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"
  return $ toListOf
    (Wreq.responseBody . _Array . each . to relevant . _Just)
    resp

mkFile :: [(Name, CC)] -> Sl.ByteString
mkFile pairs =
  "name, creditcard\n" <>
  foldr build mempty pairs
  where
    encode text = Sl.fromStrict (TxE.encodeUtf8 text)
    build (name, cc) rest = encode name <> ", " <> encode cc <> "\n" <> rest

writeOut :: [(Name, CC)] -> IO ()
writeOut pairs = do
  now <- Time.getCurrentTime
  let filename = Time.formatTime defaultTimeLocale "%Y%m%d.csv" now
  Sl.writeFile filename (mkFile pairs)

main :: IO ()
main = getData >>= writeOut
@opyate

This comment has been minimized.

opyate commented Apr 26, 2015

A minute's looking around shows me that someone has already done the work: https://gist.githubusercontent.com/jorin-vogel/09c6e7dcd7a70a475fa7/raw/44bb65f63681211ce17bde16519d5ce3d343af8e/20150425.csv

(yes, I know, I know)

@stephen-smith

This comment has been minimized.

stephen-smith commented Apr 26, 2015

Probably just use awk. Download separately. Do a few test/refinement runs on the first 10/100 lines, output a CSV and call it good. At least if I only had to do it once.

For something longer-term, I'd grab a JSON (parser) and CSV (formatter) library in... whatever language (Haskell is my favorite right now, but it's advantages are not leveraged here.) and do it that way. Once I was happy with the results of a slurp/process/dump with everything in memory, I'd try and refactor to something streaming, in case future dumps where to large to process without swapping.

@farzher

This comment has been minimized.

farzher commented Apr 27, 2015

Only takes a minute in sublime text. Also doesn't require much mental effort or planning.

@prikhi

This comment has been minimized.

prikhi commented Apr 27, 2015

curl, perl, sed ,grep, head, date!

URL=https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json
curl -s $URL | perl -pe 's/{"name":"(.*?)".*creditcard":"?(.*?)"?},?/\1, \2/' | sed 's/\[/Name, Credit Card/' | grep -v ', null' | head -n -1 > `date +%Y%m%d`.csv
@varunity

This comment has been minimized.

varunity commented Apr 27, 2015

Grabbed the file:

wget https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json

Opened in vim:

vim data.json
and then ran the following to remove all lines with "creditcard=null": g/reditcard":null/d

A couple lines of python took care of the rest:

import csv
import json

with open('data.json') as data_file:
    x = json.load(data_file)

f = csv.writer(open("20150425.csv", "wb+"))
f.writerow(["name", "creditcard"])
for z in x:
    f.writerow([z["name"],
                z["creditcard"]])

I like that though @farzher, pretty slick!

@prabhu1010

This comment has been minimized.

prabhu1010 commented Apr 27, 2015

It is less than a minute's work in vi editor and I'm not even an admin guy.
:%s/^{"name":"//g
:%s/","email.*":/,/g
:%s/"{0,1}},$//g
Goto line 1 and add "name,creditcard" then save the file with the desired filename.

Why use script when you have such a powerful Swiss Army Knife like vi 'search and replace' feature?

@varunity

This comment has been minimized.

varunity commented Apr 27, 2015

Thanks @prabhu1010, I'm not an admin guy either. Yes I was wondering how to just do everything in vi(m). Sometimes I think clearer in Python so I just did that but I agree, your technique is quite quick and elegant.

@draegtun

This comment has been minimized.

draegtun commented Apr 27, 2015

Rebol 3 solution:

import http://reb4.me/r3/altjson

json: load-json https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6
file: to-file format/pad [-4 -2 -2] reduce [now/year now/month now/day ".csv"] 0
csv:  open/write file

write csv join "name,creditcard" newline

foreach rec json [
    unless none? rec/creditcard [
        write csv ajoin [rec/name "," rec/creditcard newline]
    ]
]

close csv

For alternative solution see - https://gist.github.com/draegtun/d47a06a5d11370121539

@rocky-jaiswal

This comment has been minimized.

rocky-jaiswal commented Apr 27, 2015

Here is more verbose Ruby (scripty) solution, I liked the Clojure version best though.

require 'net/http'
require 'json'
require 'fileutils'
require 'date'
require 'csv'

uri = URI('https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json')
response = Net::HTTP.get(uri)

raw_data = JSON.parse(response)
users_with_cc = raw_data.select do |entry|
  !entry["creditcard"].nil? && !entry["creditcard"].strip.empty?
end

file_name = Date.today.strftime '%Y%m%d' + ".csv"
csv_file  = File.join(FileUtils.pwd, file_name)

CSV.open(csv_file, "wb", {force_quotes: true, headers: true}) do |csv|
  csv << ['Name', 'Credit Card']
  users_with_cc.each do |user|
    csv << [user["name"], user["creditcard"]]
  end
end
@localhost

This comment has been minimized.

localhost commented Apr 27, 2015

Quick'n'dirty in Elixir (using poison to parse JSON)

require Poison.Parser

IO.puts "name,creditcard"
Poison.Parser.parse!(File.read!("data.json"), keys: :atoms)
  |> Enum.filter(fn(x) -> x[:creditcard] !== nil end)
  |> Enum.each(fn(x) -> IO.puts x[:name] <> "," <> x[:creditcard] end)

elixir -r poison/lib/poison/parser.ex cc.exs > 20150425.csv

@henrikbjorn

This comment has been minimized.

henrikbjorn commented Apr 27, 2015

Something in Golang

package main

import (
    "encoding/csv"
    "encoding/json"
    "io/ioutil"
    "os"
    "time"
)

type Row struct {
    Name       string `json:"name"`
    CreditCard string `json:"creditcard"`
}

func main() {
    var rows []Row

    b, err := ioutil.ReadFile("data.json")

    if err != nil {
        os.Exit(1)
    }

    if err := json.Unmarshal(b, &rows); err != nil {
        os.Exit(1)
    }

    f, err := os.Create(time.Now().Format("20060102") + ".csv")

    if err != nil {
        os.Exit(1)
    }

    w := csv.NewWriter(f)

    defer w.Flush()

    w.Write([]string{
        "Name",
        "Credit Card",
    })

    for _, r := range rows {
        if r.CreditCard == "" {
            continue
        }

        w.Write([]string{
            r.Name,
            r.CreditCard,
        })
    }
}
@swuecho

This comment has been minimized.

swuecho commented Apr 27, 2015

json_xs <data.json -e '$_ = join "\n", "Name, Credit Card", map { $_->{creditcard} ?  "$_->{name}, $_->{creditcard}" : ()  } @$_'  -t string

this is also perl, using the json_xs provided by JSON::XS

@Magnap

This comment has been minimized.

Magnap commented Apr 27, 2015

Here's my Haskell solution. By default it uses the supplied URL as the data source, but it can combine the data from any number of urls and/or files given as command line arguments.

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE DeriveGeneric #-}

module Main (main) where

import qualified System.Environment as E
import qualified Network.HTTP.Conduit as N
import qualified Data.Aeson as A
import qualified Data.Text as T
import qualified Data.ByteString.Lazy as B
import qualified Data.Maybe as DM
import qualified Data.Csv as C
import Data.Csv ((.=))
import qualified Data.Vector as V
import GHC.Generics
import qualified Data.Time as DT
import qualified Data.List as DL
import qualified Network.URI as U
import qualified Control.Monad as M

data Person = Person
  { name :: T.Text
  , creditcard :: T.Text
  } deriving (Show, Generic)

instance A.FromJSON Person

instance C.ToNamedRecord Person where
  toNamedRecord (Person name creditcard) = C.namedRecord [
    "Name" .= name, "Credit Card" .= creditcard]

defaultURL = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json"

main = do
  args <- E.getArgs
  let (urls,files) = DL.partition U.isURI (if null args then [defaultURL] else args)
  M.when (null args) (putStrLn "No arguments provided. Downloading from default URL. Any number of files and/or URLs can be provided as arguments.")
  M.unless (null files) (putStrLn "Reading the following files:" >> mapM_ putStrLn files)
  M.unless (null urls) (putStrLn "Downloading from the following URLs:" >> mapM_ putStrLn urls)
  contents <- M.liftM2 (++) (mapM N.simpleHttp urls) (mapM B.readFile files)
  let lines = concatMap (map B.init . B.split 10) contents
      parse = DM.mapMaybe A.decode lines :: [Person]
      csv = C.encodeByName (V.fromList ["Name", "Credit Card"]) parse
  now <- DT.getCurrentTime
  let fileName = DT.formatTime DT.defaultTimeLocale "%Y%m%d.csv" now
  putStrLn $ "Writing to " ++ fileName
  B.writeFile fileName csv
@cgp

This comment has been minimized.

cgp commented Apr 27, 2015

No love for Java? /s

import java.io.IOException;
import java.io.PrintWriter;
import java.net.MalformedURLException;
import java.net.URL;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Scanner;


public class JavaChallenge {
    public static void main(String[] args) throws MalformedURLException, IOException {
        String url = "https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json";
        String in = new Scanner(new URL(url).openStream(), "UTF-8").useDelimiter("\\A").next();
        String[] entries = in.split("\n");
        StringBuilder sb = new StringBuilder("");
        for(String entry:entries) {
            String[] fields = entry.split("\"");
            if (fields.length < 10) continue;           
            if (!"creditcard".equals(fields[fields.length-2])) {
                sb.append("\""+fields[3]+","+fields[fields.length-2]+"\"\n");
            }           
        }
        LocalDate ld = LocalDate.now();     
        PrintWriter out = new PrintWriter(ld.format(DateTimeFormatter.ofPattern("YYYYMMdd"))+".csv");
        out.println(sb.toString());
        out.close();         
    }
}
@ebastos

This comment has been minimized.

ebastos commented Apr 27, 2015

Only with standard Unix tools:

file=$(date +"%Y%m%d").csv; echo name,creditcard > $file; curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json|sed -r 's/(.*name":)(.*)(,"email.*creditcard":")(.*)}/\2,\4/g' |egrep -v "null|\[|\]"|cut -d "," -f1,2|tr -d '"' >> $file
@rupa

This comment has been minimized.

rupa commented Apr 29, 2015

nice thing. quick and dirty as I did not have much time to act!

curl https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | grep -v '"creditcard":null' | sed -e 's/.*name":"//' -e 's/".*creditcard":"/,/' -e 's/".*//' -e 's/^\[$/name,creditcard/' | grep -v '^]$' >  $(date +%Y%m%d).csv
@singareddyb

This comment has been minimized.

singareddyb commented Apr 29, 2015

Using a combination of bash\perl --

wget https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json; filename=`date +%Y%m%d`.csv; echo "Name , CreditCard" > $filename; perl -w -n -e 'print "$1 , $2\n" if(m/^{"name":"(.*)","email.*"creditcard":"(.*)"}[,]?$/)' data.json >> $filename

Then, I have a pure Perl approach --

#!/usr/bin/perl

use strict;
use warnings;

use IO::File;
use LWP::Simple;

my $file = 'data.json';
my $uri = 'https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json';

getstore($uri,$file);

my $rf = IO::File->new('data.json','r');
my $filename = calculateDateBasedFile();
my $wf = IO::File->new("$filename",'w');

$wf->print("Name , Credit\n");

while(defined($_ = $rf->getline)){

if(m/^{"name":"(.*)","email.*"creditcard":"(.*)"}[,]?$/)
  {
      $wf->print("$1 , $2\n");
  }

}

sub calculateDateBasedFile{

 my @time=localtime(time);
 my $year = 1900 + $time[5];
 my $month = $time[4] + 1;

  $month=date_quirk($month);

 my $day = $time[3];
  $day = date_quirk($day);


return $year.$month.$day.".csv";

}
sub date_quirk{

  my $m_or_d = shift;

  if($m_or_d < 10){

      $m_or_d = "0".$m_or_d;

  }
  return $m_or_d;
}

$wf->close;
$rf->close;
@xtradev

This comment has been minimized.

xtradev commented May 15, 2015

\B=name,creditcard\n
\L{"name"\:"<U>"*"creditcard"\:"<U>"=$1,$3\n

gema -match -f json2csv.gema data.json > 20150515.csv

@raine

This comment has been minimized.

raine commented May 16, 2015

Solution using ramda-cli:

#!/usr/bin/env bash

data_url=https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json
file=`date "+%Y%m%d"`.csv
curl $data_url | R 'filter where creditcard: (!= null)' 'project [\name \creditcard]' -o csv > $file
@snahor

This comment has been minimized.

snahor commented May 24, 2015

The lovely awk:

curl -s https://gist.githubusercontent.com/jorin-vogel/7f19ce95a9a842956358/raw/e319340c2f6691f9cc8d8cc57ed532b5093e3619/data.json | awk -F '("[,:]"|"})' '{if ($12!="") print $2","$12}' > (date +%Y%m%d).csv
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment