@ManojDatt
Last active April 10, 2024 11:12
We have three CSV files (qa, uat and prod) with the columns co_type, co_name, co_description, co_id and Source. We want to combine them based on co_type, with priority given to the prod file: if prod has the same co_type as qa and uat, use the prod row; if not, use the qa row, and then the uat row. Finally, save the combined data in a separate CSV file. Make use of pytho…
import pandas as pd
# Read the CSV files
df1 = pd.read_csv('uat.csv')
df2 = pd.read_csv('qa.csv')
df3 = pd.read_csv('prod.csv')
# Add source column
df1['source'] = 'UAT'
df2['source'] = 'QA'
df3['source'] = 'PROD'
# Drop co_id so it is excluded from the combined output
df1 = df1.drop('co_id', axis=1)
df2 = df2.drop('co_id', axis=1)
df3 = df3.drop('co_id', axis=1)
# Outer-merge the DataFrames on the shared columns
# combined_df = pd.concat([df3, df2[~df2[['co_type', 'co_name', 'co_description']].isin(df3[['co_type', 'co_name', 'co_description']]).any(axis=1)], df1[~df1[['co_type', 'co_name', 'co_description']].isin(df3[['co_type', 'co_name', 'co_description']]).any(axis=1)]])
combined_df = pd.merge(df3, df2, on = ['co_type', 'co_name', 'co_description'], how='outer', suffixes=('_prod', '_qa'))
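# Rename UAT's source column first: suffixes only apply to overlapping column names,
# and after the first merge no column named 'source' is left on the left side
combined_df = pd.merge(combined_df, df1.rename(columns={'source': 'source_uat'}), on = ['co_type', 'co_name', 'co_description'], how='outer')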
# Save the combined data to a new CSV file
combined_df.to_csv('combined_core_output.csv', index=False)
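
The merge above keeps every distinct (co_type, co_name, co_description) combination, so a co_type that appears in several files can still produce several rows. The description instead asks for one row per co_type, taken from prod first, then qa, then uat. A minimal sketch of that rule follows, assuming co_type alone identifies a record. The function name `combine_by_priority` and the in-memory sample frames are hypothetical stand-ins for the real CSV files, not part of the original script.

```python
import pandas as pd

KEY = "co_type"  # assumption: co_type alone identifies a record across files


def combine_by_priority(prod, qa, uat, key=KEY):
    """Return one row per key, preferring PROD, then QA, then UAT."""
    frames = [
        prod.assign(source="PROD"),
        qa.assign(source="QA"),
        uat.assign(source="UAT"),
    ]
    # Stack the frames in priority order: PROD rows come first
    stacked = pd.concat(frames, ignore_index=True)
    # keep="first" retains the highest-priority row for each key
    deduped = stacked.drop_duplicates(subset=key, keep="first")
    return deduped.reset_index(drop=True)


# Made-up sample data standing in for prod.csv, qa.csv and uat.csv
prod = pd.DataFrame({"co_type": ["A"], "co_name": ["prod-a"]})
qa = pd.DataFrame({"co_type": ["A", "B"], "co_name": ["qa-a", "qa-b"]})
uat = pd.DataFrame({"co_type": ["B", "C"], "co_name": ["uat-b", "uat-c"]})

combined = combine_by_priority(prod, qa, uat)
print(combined)
```

With real files, the three frames would come from `pd.read_csv(...)` as in the script above, and the result could be written out with `combined.to_csv(..., index=False)`.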