Created — forked from bycoffe/csvcut

View csvcut
#!/usr/bin/env python
"""
Like cut, but for CSVs. To be used from a shell command line.
Note that fields are zero-based, as opposed to 'cut', where they are 1-based.
Should use something better than getopt, but this works...

Usage:
    csvcut foobar.csv
        (prints the first column of each row of foobar.csv)
    head -10 foobar.csv | csvcut -f 0,2
        (prints the first and third columns of the first ten lines of foobar.csv)
    csvcut -f 0,2 -d "|" foobar.csv
        (prints the first and third columns of the pipe-delimited foobar.csv)
    csvcut -f 0,2 -t foobar.csv
        (prints the first and third columns of the tab-delimited foobar.csv;
        if present, the -d option will be ignored.)
    csvcut -h foobar.csv
        (prints the values of the first line of foobar.csv, each preceded by the
        field index that would be used to display that column. If present, the
        -f option will be ignored.)
    csvcut -f 0,1,2 -d "|" -o , foobar.csv
        (prints the first three columns of the pipe-delimited foobar.csv; output
        will be comma-delimited.)
    csvcut -f 0,1,2 -o "|" foobar.csv
        (prints the first three columns of the comma-delimited foobar.csv; output
        will be pipe-delimited.)
    csvcut -f : -o "|" foobar.csv
        (prints all the columns of the comma-delimited foobar.csv; output will be
        pipe-delimited.)
    csvcut -f 0,1 -d "," -q "|" foobar.csv
        (prints the first two columns of the comma-delimited, pipe-quoted foobar.csv.)
"""
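import sys, csv, getopt

# -f, -d, -o and -q take a value; -h and -t are plain flags.
opts, args = getopt.getopt(sys.argv[1:], "f:d:o:q:ht", [])
if args:
    # "U" requests universal-newline mode, so CR-only (classic Mac) line
    # endings are split into rows. Python 2 idiom; Python 3.11 removed the flag.
    infile = open(args[0], "U")
else:
    infile = sys.stdin
 
delimiter = ','
output_delimiter = ' '
cols = [0, ]
show_headers = False
# Defined up front so it exists even when no options are given;
# stays None unless -q is passed.
quotechar = None
 
if opts:
    opts = dict(opts)
    show_headers = '-h' in opts
    if '-f' in opts:
        cols = opts['-f'].split(",")
    # -t takes precedence over -d.
    if '-t' in opts:
        delimiter = "\t"
    elif '-d' in opts:
        delimiter = opts['-d']
    if '-o' in opts:
        output_delimiter = opts['-o']
    if '-q' in opts:
        quotechar = opts['-q']
 
writer = csv.writer(sys.stdout, delimiter=output_delimiter)
for row in csv.reader(infile, delimiter=delimiter, quotechar=quotechar):
    if show_headers:
        # Print each value of the first row with its zero-based index, then stop.
        for index, value in enumerate(row):
            print("%3i: %s" % (index, value))
        break
    # "-f :" selects every column; the width is taken from the first row read.
    if cols == [':']:
        cols = range(len(row))
    writer.writerow([row[int(c)] for c in cols])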
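
The docstring notes that something better than getopt should be used. A minimal sketch of the same options with argparse might look like the following. The helper name `parse_args` and the `dest` names are hypothetical and not part of the script. Because csvcut uses -h for "show headers", the sketch turns off argparse's automatic -h/--help option.

```python
import argparse


def parse_args(argv):
    """Parse csvcut-style options with argparse (illustrative sketch only)."""
    # add_help=False frees -h, which csvcut uses to list header fields.
    parser = argparse.ArgumentParser(prog="csvcut", add_help=False)
    parser.add_argument("-f", dest="fields", default="0")        # e.g. "0,2" or ":"
    parser.add_argument("-d", dest="delimiter", default=",")     # input delimiter
    parser.add_argument("-t", dest="tabs", action="store_true")  # tab-delimited input
    parser.add_argument("-o", dest="output_delimiter", default=" ")
    parser.add_argument("-q", dest="quotechar", default=None)
    parser.add_argument("-h", dest="show_headers", action="store_true")
    parser.add_argument("path", nargs="?", default=None)         # stdin when omitted
    ns = parser.parse_args(argv)
    if ns.tabs:
        ns.delimiter = "\t"  # mirror the script: -t overrides -d
    return ns
```

Calling `parse_args(sys.argv[1:])` would then replace the getopt call and the option dictionary, with the defaults kept in one place.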

Added "U" (universal newlines) to the open call to handle the common case of CSV exports from Excel on Mac OS, which use classic Mac line endings (carriage returns) instead of UNIX line feeds.
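
For readers on Python 3, where the "U" flag is no longer available in current releases, a rough equivalent is to open the file with `newline=""` and let the csv module deal with the line endings. The sketch below assumes Python 3; `read_csv_rows` and `demo` are hypothetical helper names, and the sample data is made up for illustration.

```python
import csv
import os
import tempfile


def read_csv_rows(path, delimiter=","):
    """Read all rows from a CSV file, whatever its line endings are."""
    # newline="" keeps universal-newline splitting on but leaves the endings
    # untranslated, as the csv module documentation recommends.
    with open(path, newline="") as handle:
        return list(csv.reader(handle, delimiter=delimiter))


def demo():
    """Write a CR-only file (classic Mac style) and read it back."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"name,qty\rapple,3\rpear,5\r")
        return read_csv_rows(path)
    finally:
        os.remove(path)
```

Here `demo()` should return three rows of two fields each, even though the file contains no line feed characters.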
