@pe7er
Created November 22, 2023 12:31
Remove duplicate records from CSV
import pandas as pd
# Read the data from the exported CSV file
df = pd.read_csv('exported_data.csv')
# Remove duplicates based on user_id, mail_id, and send_date
df_cleaned = df.drop_duplicates(subset=['user_id', 'mail_id', 'send_date'])
# Write the cleaned data to a new file
df_cleaned.to_csv('cleaned_data.csv', index=False)
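
Before overwriting anything, it can help to check how many rows would be dropped and which copy of each duplicate survives. The following is a minimal sketch, not part of the original gist: it uses a small in-memory DataFrame with made-up rows in place of exported_data.csv, and the `status` column is a hypothetical extra field added only to show the difference between `keep="first"` (the pandas default) and `keep="last"`.

```python
import pandas as pd

# Hypothetical sample rows standing in for exported_data.csv
df = pd.DataFrame({
    "user_id": [1, 1, 2, 2],
    "mail_id": [10, 10, 10, 11],
    "send_date": ["2023-11-01", "2023-11-01", "2023-11-01", "2023-11-02"],
    "status": ["sent", "opened", "sent", "sent"],  # assumed extra column
})

# Same key columns as in the script above
key = ["user_id", "mail_id", "send_date"]

# Count rows that repeat an earlier key combination
n_dupes = int(df.duplicated(subset=key).sum())

# keep="first" is the default; keep="last" retains the final occurrence instead
first = df.drop_duplicates(subset=key, keep="first")
last = df.drop_duplicates(subset=key, keep="last")

print(f"{n_dupes} duplicate row(s) found, {len(first)} row(s) kept")
```

Which copy to keep depends on the export: if later rows carry more recent information for the same user, mail, and date, `keep="last"` may be the better choice.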