@alexrutherford
Last active April 18, 2022 09:14
bigSmallDict = (df['crime_category'].value_counts(normalize=True) > 0.05).to_dict()
# Make a dictionary saying if a category is > 5% of total or not
def assignOther(t):
    if bigSmallDict[t]:
        return t
    else:
        return 'other'
# Define a mini function that looks in the dictionary
df['crime_category'] = df['crime_category'].apply(assignOther)
# Apply the function
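df['crime_category'] = df['crime_category'].apply(lambda x: x if bigSmallDict.get(x, False) else 'other')
# As a one liner with an inline lambda function (an alternative to assignOther, not an extra step;
# .get avoids a KeyError on 'other' if the column has already been recoded)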
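The same recoding can probably be done without `apply`, using `Series.isin` and `Series.where`. The sketch below is one possible variant, not the gist author's method; the sample DataFrame and the names `shares` and `big` are hypothetical, made up so the snippet runs on its own.

```python
import pandas as pd

# Hypothetical sample data (assumption: the real df is not shown in the gist)
df = pd.DataFrame(
    {'crime_category': ['theft'] * 60 + ['assault'] * 37 + ['arson'] * 3}
)

# Fraction of rows falling in each category
shares = df['crime_category'].value_counts(normalize=True)

# Labels of the categories that make up more than 5% of the total
big = shares[shares > 0.05].index

# Keep values that are in a big category, replace everything else with 'other'
df['crime_category'] = df['crime_category'].where(
    df['crime_category'].isin(big), 'other'
)
```

Because `isin` returns False for values it has never seen, this version can be re-run on an already recoded column without raising an error.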