@hughlilly
Created February 27, 2020 20:27
Use Pandas to split a csv
import pandas as pd

count = 0
# Open the csv in chunks of 9000 rows each. With chunksize set, read_csv
# returns an iterator of DataFrames rather than a single DataFrame.
df = pd.read_csv('bigfile.csv', iterator=True, chunksize=9000)
# Iterate over each chunk
for chunk in df:
    # Set output file name, appending count variable
    outname = 'out'
    outname += str(count)
    outname += '.csv'
    # Write the chunk to its own file (without the index column), increase count
    chunk.to_csv(outname, index=False)
    count += 1
# There should probably be some error-handling here...
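
One way the error handling mentioned above might look is sketched here. It is only a sketch under stated assumptions: the function name `split_csv` and its parameters `rows_per_file`, `prefix`, and `out_dir` are hypothetical and not part of the original script. The choice of which failures to catch (a missing input file, an empty file, a malformed row) is also an assumption about what is likely to go wrong.

```python
from pathlib import Path

import pandas as pd


def split_csv(src, rows_per_file=9000, prefix="out", out_dir="."):
    """Split the csv at src into numbered files; return the paths written.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    src = Path(src)
    # Fail early with a clear message if the input is missing
    if not src.is_file():
        raise FileNotFoundError(f"input file not found: {src}")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    try:
        # chunksize makes read_csv return an iterator of DataFrames
        reader = pd.read_csv(src, chunksize=rows_per_file)
        for count, chunk in enumerate(reader):
            outname = out_dir / f"{prefix}{count}.csv"
            chunk.to_csv(outname, index=False)
            written.append(outname)
    except pd.errors.EmptyDataError:
        # Assumption: an input with no header and no rows is not an error,
        # there is simply nothing to split
        return written
    except pd.errors.ParserError as exc:
        # A malformed row can surface partway through, after some files
        # have already been written, so report how far we got
        raise ValueError(
            f"could not parse {src} after writing {len(written)} file(s)"
        ) from exc
    return written
```

Calling `split_csv('bigfile.csv')` would then write `out0.csv`, `out1.csv`, and so on into the current directory, as the original loop does, and the returned list makes it easy to check how many files were produced.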