@arunkarnann
Created January 28, 2019 04:32
A Gist in Python using pandas that merges all CSV files in a given directory, removes duplicate rows, and filters rows based on content matching.
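import glob
import os

import pandas as pd

# Folder that contains the CSV files
files = "D:\\Projects\\CSVMERGE\\muse\\"
# Combine all CSVs into a single DataFrame
csv_paths = glob.glob(os.path.join(files, '*.csv'))
combined_csv = pd.concat([pd.read_csv(f, encoding="ISO-8859-1") for f in csv_paths], ignore_index=True)
# Remove rows whose 'Name' duplicates an earlier row (the first occurrence is kept)
combined_csv = combined_csv.drop_duplicates(subset='Name')
# Keep only rows whose 'Address' contains a specific text; rows with a missing address are dropped.
# NOTE: `texts` is not defined in this gist; it must be a list of search strings defined before this line.
combined_csv = combined_csv[combined_csv['Address'].str.contains(texts[6], na=False)]
print(combined_csv)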
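The same merge, de-duplicate, and filter steps can be wrapped in a function and tried on throwaway data. This is only a sketch under stated assumptions: the function name `merge_csvs`, its parameters, and the demo rows are hypothetical and not part of the original gist, and the search text is matched as a literal substring (`regex=False`) rather than as a regular expression.

```python
import glob
import os
import tempfile

import pandas as pd


def merge_csvs(folder, dedupe_col, filter_col, text, encoding="ISO-8859-1"):
    """Merge all CSVs in `folder`, drop repeated `dedupe_col` values,
    then keep rows whose `filter_col` contains `text` (hypothetical helper)."""
    # Sort the paths so the merge order (and therefore which duplicate is kept) is stable.
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not paths:
        raise FileNotFoundError(f"no CSV files found in {folder!r}")
    merged = pd.concat(
        [pd.read_csv(p, encoding=encoding) for p in paths],
        ignore_index=True,
    )
    # Keep the first row seen for each value of `dedupe_col`.
    merged = merged.drop_duplicates(subset=dedupe_col, keep="first")
    # Literal substring match; rows with a missing value count as non-matching.
    mask = merged[filter_col].str.contains(text, regex=False, na=False)
    return merged[mask].reset_index(drop=True)


# Made-up demo data written to a temporary folder (assumption, for illustration only).
demo_dir = tempfile.mkdtemp()
pd.DataFrame(
    {"Name": ["Ann", "Bob"], "Address": ["1 Main St, Chennai", "2 Park Rd, Delhi"]}
).to_csv(os.path.join(demo_dir, "a.csv"), index=False)
pd.DataFrame(
    {"Name": ["Bob", "Cid"], "Address": ["9 Hill Rd, Chennai", None]}
).to_csv(os.path.join(demo_dir, "b.csv"), index=False)

result = merge_csvs(demo_dir, "Name", "Address", "Chennai")
print(result)
```

Because duplicates are removed before filtering, as in the gist, only the first "Bob" row (the Delhi one) survives de-duplication, so the later "Bob" row with a Chennai address never reaches the filter. Swap the two steps if the filter should be applied first.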