@ayushkumarshah
Forked from cccntu/csv.py
Created February 9, 2021 12:31
Python mmap to concatenate CSV files
❯ rm out.csv
❯ cat 1.py
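from glob import glob
import mmap

files = glob("data/*")
# Sort numerically by file name, e.g. data/2.csv before data/10.csv.
files.sort(key=lambda x: int(x.split("/")[-1].split(".")[0]))

write_f = open("out.csv", "w+b")
for i, fname in enumerate(files):
    with open(fname, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            if i == 0:
                # Keep the header line from the first file only.
                write_f.write(mm.readline())
            else:
                # Skip the header line of every later file.
                mm.readline()
            # Copy the remaining rows of this file.
            write_f.write(mm.read())
write_f.close()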
❯ time python 1.py
python 1.py 0.90s user 1.12s system 99% cpu 2.022 total
❯ wc -l out.csv
10000001 out.csv
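
The script above maps each input with write access and assumes every file is non-empty and ends with a newline. A slightly more defensive variant might look like the sketch below. The function name `concat_csv` and its parameters are hypothetical, not part of the original gist; it assumes every input shares the same single-line header.

```python
import mmap
import os


def concat_csv(paths, out_path):
    """Concatenate CSV files, keeping only the first header line seen."""
    header_written = False
    with open(out_path, "wb") as out:
        for path in paths:
            # mmap cannot map an empty file, so skip those up front.
            if os.path.getsize(path) == 0:
                continue
            with open(path, "rb") as f:
                # ACCESS_READ allows the input to be opened read-only.
                with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                    header = mm.readline()
                    if not header_written:
                        out.write(header)
                        header_written = True
                    # Read everything after the header line.
                    body = mm.read()
                    out.write(body)
                    # Keep rows from adjacent files on separate lines.
                    if body and not body.endswith(b"\n"):
                        out.write(b"\n")
```

With the sorted `files` list from 1.py, this would be called as `concat_csv(files, "out.csv")`. The read-only mapping means the inputs do not need to be writable, and the output file is closed even if an input fails to open.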