@julienanquetil
Created August 30, 2018 07:18
Merge csv
Input file 1 (semicolon-delimited):
sku;col1;col2;test
123;456;99;A
234;786;99;
345;678;99;A
Input file 2 (semicolon-delimited):
sku;col3;col4;test
123;18-123;9999;A
234;18-786;9999;
345;12-678;9999;A
Input file 3 (semicolon-delimited):
sku;col5;col6;test
123;18-123;9999;
234;18-786;9999;A
345;12-678;9999;
#!/usr/bin/env python3
import csv

import pandas as pd

inputs = ["test/csv1.csv", "test/csv2.csv", "test/csv3.csv"]

# First determine the field names from the top line of each input file
fieldnames = []
for filename in inputs:
    # On Python 3 the csv module expects text mode with newline=""
    with open(filename, newline="") as f_in:
        reader = csv.reader(f_in, delimiter=";")
        headers = next(reader)
        for h in headers:
            if h not in fieldnames:
                fieldnames.append(h)

# Then copy the data; columns missing from a given file are left empty
with open("final/out.csv", "w", newline="") as f_out:
    writer = csv.DictWriter(f_out, fieldnames=fieldnames, delimiter=";")
    # Write the header with the same delimiter as the data rows
    # (joining with "; " would leave a leading space in every column name)
    writer.writeheader()
    for filename in inputs:
        with open(filename, newline="") as f_in:
            reader = csv.DictReader(f_in, delimiter=";")  # Uses the field names in this file
            for line in reader:
                writer.writerow(line)

# Make the result unique: one row per sku, first non-null value in each column
a = pd.read_csv("final/out.csv", sep=";", low_memory=False)
a = a.astype("object")
print(a.groupby(["sku"], as_index=False).first())
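The last step of the script relies on `groupby(['sku']).first()` returning the first non-null value per column for each sku. To make that rule explicit, here is a minimal stdlib-only sketch of the same idea applied to the three sample files above. The function name `merge_by_key` and the in-memory strings `csv1`, `csv2`, `csv3` are hypothetical and used only for illustration. It is not a drop-in replacement for the pandas version: values stay as strings and rows keep first-seen order instead of being sorted by key.

```python
import csv
import io


def merge_by_key(texts, key="sku", delimiter=";"):
    """Merge delimited texts into one row per key.

    For every column, the first non-empty value seen for a key wins.
    """
    fieldnames = []  # union of all headers, in first-seen order
    merged = {}      # key value -> combined row (dicts keep insertion order)
    for text in texts:
        reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        for row in reader:
            target = merged.setdefault(row[key], {})
            for name, value in row.items():
                # Keep a value only if this column is still empty for the key
                if value and not target.get(name):
                    target[name] = value
    rows = [
        {name: row.get(name, "") for name in fieldnames}
        for row in merged.values()
    ]
    return fieldnames, rows


# The three sample inputs from above, as in-memory strings (illustrative only)
csv1 = "sku;col1;col2;test\n123;456;99;A\n234;786;99;\n345;678;99;A\n"
csv2 = "sku;col3;col4;test\n123;18-123;9999;A\n234;18-786;9999;\n345;12-678;9999;A\n"
csv3 = "sku;col5;col6;test\n123;18-123;9999;\n234;18-786;9999;A\n345;12-678;9999;\n"

fieldnames, rows = merge_by_key([csv1, csv2, csv3])

# Print the merged result in the same semicolon-delimited layout
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter=";", lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue(), end="")
```

In this sample, sku 234 has an empty `test` value in the first two files, so its `test` value comes from the third file.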