@cjdd3b
Created April 2, 2015 21:54
CSV-flattening code for Harsh's research
import csv, os
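
# This chunk iterates through all of the CSV files in a directory, turns them
# into 2-dimensional arrays (lists of lists), and puts all those arrays into
# a list called "tables"
tables = []

# Loop over all files in the current directory (which is what "." means).
# Sorting the names keeps the column order the same from run to run.
for f in sorted(os.listdir('.')):
    # Skip non-CSV files by checking the file extension. Also skip test.csv,
    # the output file this script writes, so a rerun doesn't read it back in.
    if os.path.splitext(f)[1].lower() != '.csv' or f == 'test.csv':
        continue

    # Open the CSV, grab rows 8 through 2010 (the slice is zero-indexed, so
    # [7:2010] skips the first 7 rows) and add them to the tables list
    with open(f, 'r', newline='') as csvfile:
        rows = list(csv.reader(csvfile))[7:2010]
        tables.append(rows)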

# Now that the "tables" list contains the relevant parts of all the CSVs,
# we can stitch them together into another list, which we'll call "output"
output = []

# This is weird syntax for a weird tool, but you can read about zip here:
# https://docs.python.org/2/library/functions.html#zip. It's basically a
# tool for smashing lists together. Also, we're feeding in a special type
# of argument, which is what the "*" is for. That's described here:
# https://docs.python.org/2/tutorial/controlflow.html#arbitrary-argument-lists
# Note that zip stops at the shortest table, so if one CSV has fewer rows
# than the others, the extra rows in the longer ones are dropped.
for row in zip(*tables):
    # This joins the matching row from every table into one long row,
    # skipping the first 5 items from each. It then adds the result to the
    # output. Stolen from StackOverflow here:
    # http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
    output.append([item for sublist in row for item in sublist[5:]])
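
# Write the CSV, using the syntax from class. In Python 3, newline='' keeps
# the csv module from writing a blank line between rows on Windows.
with open('test.csv', 'w', newline='') as csvfile:
    my_writer = csv.writer(csvfile, delimiter=',')
    my_writer.writerows(output)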
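
To see what the zip(*tables) step does on a small scale, here is a minimal sketch that uses two made-up tables in place of the real CSVs. The data and the names demo_tables, demo_output and skip are assumptions for illustration only; the script above skips 5 columns, and this uses 1 to keep the example short.

```python
# Two made-up "tables", each a list of rows, standing in for the parsed CSVs.
table_a = [['id1', 'a', 'b'],
           ['id2', 'c', 'd']]
table_b = [['id1', 'e', 'f'],
           ['id2', 'g', 'h']]
demo_tables = [table_a, table_b]

skip = 1  # hypothetical value; the real script skips the first 5 items

demo_output = []
# zip(*demo_tables) is the same as zip(table_a, table_b): it pairs up the
# first row of every table, then the second row of every table, and so on.
for row in zip(*demo_tables):
    # row is a tuple holding one row from each table; flatten it into a
    # single list, dropping the first `skip` items of each piece.
    demo_output.append([item for sublist in row for item in sublist[skip:]])

print(demo_output)
# [['a', 'b', 'e', 'f'], ['c', 'd', 'g', 'h']]
```

Each output row is the matching rows of every table laid side by side, which is why the CSVs need to line up row for row before flattening.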