tab2json.awk: converts a stream of tabular records to a stream of JSON records.
{
    if (NR == 1) {
        split($0, tags);
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{ ";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
        }
        jrec = jrec " },";
        print jrec;
    }
}
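As a quick sanity check, the script can be exercised end to end (a minimal sketch; the temp-file path and the two-line sample input are illustrative, not part of the gist):

```shell
#!/bin/sh
# Write the converter to a temporary file.
cat > /tmp/tab2json.awk <<'AWK'
{
    if (NR == 1) {
        split($0, tags);
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{ ";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
        }
        jrec = jrec " },";
        print jrec;
    }
}
AWK

# Header row supplies the keys; "alice" is quoted, 10 is left bare.
out=$(printf 'name,score\nalice,10\n' | awk -F "," -f /tmp/tab2json.awk)
echo "$out"
# → { "name":"alice", "score":10 },
```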

@drjerry drjerry commented Aug 26, 2012

The column headers must be the first record of the stream (or file) and are used as the field keys. By default, in the output, all keys are enclosed in quotes, as are non-numeric values. The field separator of the input stream should be specified via the -F argument.

Example: Convert plain CSV into JSON

 $ cat myfile.csv
 text,number,timestamp
 some text,123.4,2012-08-26 12:00:00
 $ awk -F "," -f tab2json.awk myfile.csv
 { "text":"some text", "number":123.4, "timestamp":"2012-08-26 12:00:00" },

Example 2:

MySQL uses the tab character as its default field separator:

 $ mysql -e "select * from db.mytable;" |awk -F "\\t" -f tab2json.awk
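Without a MySQL server at hand, the tab-separated path can still be checked with printf (an inline copy of the script; the city/population sample data is illustrative):

```shell
#!/bin/sh
# Feed a tab-separated header + row through an inline copy of the converter.
out=$(printf 'city\tpop\nOslo\t700000\n' | awk -F "\t" '
{
    if (NR == 1) { split($0, tags); if (EC == "") EC = "\"" }
    else {
        split($0, vals); jrec = "{ ";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/) jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF) jrec = jrec ", ";
        }
        print jrec " },";
    }
}')
echo "$out"
# → { "city":"Oslo", "pop":700000 },
```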

The Enclosing Character

By default, double quotes are used to enclose string values, but this can be changed to any other delimiter via the EC variable. For example, to transform the CSV file using single quotes instead of double quotes:

 $ awk -F "," --assign EC=\' -f tab2json.awk myfile.csv 
 { 'text':'some text', 'number':123.4, 'timestamp':'2012-08-26 12:00:00' },
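The same override can be written with the standard -v flag instead of gawk's --assign long option (an inline copy of the script; the name/score sample data is illustrative):

```shell
#!/bin/sh
# EC is set to a single quote before the first record is read,
# so the default double-quote assignment is skipped.
out=$(printf 'name,score\nalice,10\n' | awk -F "," -v EC="'" '
{
    if (NR == 1) { split($0, tags); if (EC == "") EC = "\"" }
    else {
        split($0, vals); jrec = "{ ";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/) jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF) jrec = jrec ", ";
        }
        print jrec " },";
    }
}')
echo "$out"
# → { 'name':'alice', 'score':10 },
```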

@yasirbam yasirbam commented Mar 20, 2014

Great script!
One small bug: is it possible to remove the last comma on the last record?


@minkymorgan minkymorgan commented Sep 30, 2014

The last-comma problem is corrected by adding two lines to the script, as seen in this fork:
minkymorgan / tab2json.awk
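The fork's exact two lines are not reproduced here, but one common way to drop the trailing comma is to buffer each record and defer printing, so the final record can be emitted comma-free in an END block (a sketch of that approach, not necessarily the referenced fix; sample data is illustrative):

```shell
#!/bin/sh
# Variant that prints each record one step late: earlier records get a
# comma, and the held-back final record is printed bare in END.
cat > /tmp/tab2json_last.awk <<'AWK'
{
    if (NR == 1) {
        split($0, tags);
        if (EC == "") EC = "\"";
    }
    else {
        split($0, vals);
        jrec = "{ ";
        for (i = 1; i <= NF; ++i) {
            if (vals[i] ~ /[^0-9.]/)
                jrec = jrec EC tags[i] EC ":" EC vals[i] EC;
            else
                jrec = jrec EC tags[i] EC ":" vals[i];
            if (i < NF)
                jrec = jrec ", ";
        }
        jrec = jrec " }";
        if (prev != "") print prev ",";   # earlier record: with comma
        prev = jrec;                      # hold the current record
    }
}
END { if (prev != "") print prev; }       # last record: no comma
AWK

out=$(printf 'name,score\nalice,10\nbob,7\n' | awk -F "," -f /tmp/tab2json_last.awk)
echo "$out"
# → { "name":"alice", "score":10 },
#   { "name":"bob", "score":7 }
```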
