@anaclumos
Created October 23, 2019 02:57
For Blog: Anonymize.py @ MinsaPay
import pandas as pd

# Load the raw transaction log; read_csv gives a default 0..n-1 index,
# which the row loop below relies on.
Dataframe = pd.read_csv('raw.csv')


def anonymize(df, targetColumn):
    """Replace each distinct value in targetColumn with a pseudonym, in place."""
    anon = {}      # original value -> pseudonym
    next_id = 0    # counter for the next unseen value (avoids shadowing built-in id)
    for x in range(len(df)):
        user = df.loc[x, targetColumn]
        if user not in anon:
            # First appearance: build e.g. "user#007" (zero-padded to 3 digits).
            anon[user] = targetColumn + "#" + str(next_id).zfill(3)
            next_id += 1
        df.loc[x, targetColumn] = anon[user]


anonymize(Dataframe, 'user')
anonymize(Dataframe, 'booth')
Dataframe.to_csv("anonymized.csv", mode='w')
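The row-by-row loop above boils down to one idea: give every distinct value a numbered pseudonym in order of first appearance. Below is a minimal sketch of that mapping step on its own, using only the standard library. The helper name `build_pseudonyms`, its `width` parameter, and the sample names are hypothetical and not part of the original gist.

```python
def build_pseudonyms(values, prefix, width=3):
    """Map each distinct value to '<prefix>#<zero-padded counter>'.

    Counters are assigned in order of first appearance, matching the
    scheme used by anonymize() (e.g. 'user#000', 'user#001', ...).
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            # len(mapping) is the number of distinct values seen so far,
            # so it doubles as the next free counter.
            mapping[value] = f"{prefix}#{len(mapping):0{width}d}"
    return mapping


# Hypothetical sample data standing in for the 'user' column.
names = ["kim", "lee", "kim", "park"]
mapping = build_pseudonyms(names, "user")
anonymized = [mapping[name] for name in names]
print(anonymized)  # ['user#000', 'user#001', 'user#000', 'user#002']
```

With a dictionary like this in hand, a pandas column could probably be rewritten in one step with `df[col] = df[col].map(mapping)` instead of assigning cell by cell through `df.loc`; treat that as a suggestion to try, not as what the original script does.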