Last active — forked from bycoffe/csvcut
csvcut
Python
#!/usr/bin/env python
"""
Like cut, but for CSVs. To be used from a shell command line.
 
Note that fields are zero-based, as opposed to 'cut' where they are 1-based.
 
Should use something better than getopt, but this works...
 
Usage:
csvcut foobar.csv
(prints the first column of each row of foobar.csv)
head -10 foobar.csv | csvcut -f 0,2
(prints the first and third columns of the first ten lines of foobar.csv)
 
csvcut -f 0,2 -d "|" foobar.csv
(prints the first and third columns of the pipe-delimited foobar.csv)
 
csvcut -f 0,2 -t foobar.csv
(prints the first and third columns of the tab-delimited foobar.csv;
if present, the -d option will be ignored.)
 
csvcut -h foobar.csv
(prints the values of the first line of foobar.csv, preceded by the field index which would
be used to display that column. If present, the -f option will be ignored.)
 
csvcut -f 0,1,2 -d "|" -o , foobar.csv
(prints the first three columns of the pipe-delimited foobar.csv; output
will be comma-delimited.)
 
csvcut -f 0,1,2 -o "|" foobar.csv
(prints the first three columns of the comma-delimited foobar.csv; output
will be pipe-delimited.)
 
csvcut -f : -o "|" foobar.csv
(prints all the columns of the comma-delimited foobar.csv; output will be
pipe-delimited.)
 
csvcut -f 0,1 -d "," -q "|" foobar.csv
(prints the first two columns of the comma-delimited, pipe-quoted foobar.csv.)
"""
import sys, csv, getopt

opts, args = getopt.getopt(sys.argv[1:], "f:d:o:q:ht", [])
if args:
    infile = open(args[0], "U")
else:
    infile = sys.stdin

# Defaults, used when the corresponding option is not given.
delimiter = ','
output_delimiter = ' '
cols = [0, ]
show_headers = False
quotechar = None  # set here so it is defined even when no options are passed

if opts:
    opts = dict(opts)
    show_headers = '-h' in opts
    if '-f' in opts:
        cols = opts['-f'].split(",")
    if '-t' in opts:
        # -t takes precedence over -d
        delimiter = "\t"
    elif '-d' in opts:
        delimiter = opts['-d']
    if '-o' in opts:
        output_delimiter = opts['-o']
    if '-q' in opts:
        quotechar = opts['-q']

writer = csv.writer(sys.stdout, delimiter=output_delimiter)
for row in csv.reader(infile, delimiter=delimiter, quotechar=quotechar):
    if show_headers:
        # Print each field of the first row with its zero-based index, then stop.
        for index, value in enumerate(row):
            print("%3i: %s" % (index, value))
        break
    if cols == [':']:
        # ":" selects every column of the row
        cols = range(len(row))
    writer.writerow([row[int(c)] for c in cols])
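
The column selection the script performs can also be sketched as a standalone function, which makes the zero-based indexing and the ":" shortcut easier to try out. This is only an illustration written for Python 3: the name `select_columns` and its parameters are hypothetical, and unlike the script it leaves the csv module's default quote character in place.

```python
import csv
import io


def select_columns(text, cols, delimiter=",", output_delimiter=" "):
    """Return the chosen zero-based columns of CSV `text` as a string.

    `cols` is a list of strings, as produced by "0,2".split(","),
    or [":"] to keep every column. (Hypothetical helper, not part of csvcut.)
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter=output_delimiter, lineterminator="\n")
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    for row in reader:
        if cols == [":"]:
            indexes = range(len(row))
        else:
            indexes = [int(c) for c in cols]
        writer.writerow([row[i] for i in indexes])
    return out.getvalue()


first_and_third = select_columns("a,b,c\n1,2,3\n", ["0", "2"])
all_as_commas = select_columns("a|b\n", [":"], delimiter="|", output_delimiter=",")
```

Here `first_and_third` holds the first and third columns separated by spaces, and `all_as_commas` holds every column of the pipe-delimited input rewritten with commas.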

Added "U" to the open call to handle the common case of Mac OS Excel CSV exports, which use classic Mac line endings (carriage returns) instead of UNIX line feeds.
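
For anyone running this under Python 3: the "U" mode was removed from `open` in Python 3.11, so a port would need a different way to accept those files. A minimal sketch, assuming the file is opened with `newline=""` as the csv module documentation recommends; the helper name `read_rows` and the sample data are made up for illustration.

```python
import csv
import os
import tempfile


def read_rows(path):
    # With newline="" the csv reader sees the raw line endings and
    # accepts LF, CRLF, or bare CR as the end of a record.
    with open(path, newline="") as handle:
        return list(csv.reader(handle))


# Write a small file that uses carriage returns only, as in the Excel exports
# mentioned above, then read it back.
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"name,qty\rapple,3\rpear,5\r")
mac_rows = read_rows(tmp.name)
os.remove(tmp.name)
```

Each carriage-return-terminated line comes back as its own row, which is the behavior the "U" flag provided in the original script.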
