@ecdedios
Created May 25, 2020 19:05
An early attempt at using fuzzywuzzy.
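from fuzzywuzzy import process

# `articles` is assumed to be a list of lists of entity strings, defined elsewhere.
# Pool every distinct entity across all articles as the candidate choices.
choices = {item for sublist in articles for item in sublist}

cleaned_articles = []
for article in articles:
    article_entities = []
    for entity in set(article):
        # extractOne returns a (match, score) tuple; keep only the matched string.
        article_entities.append(process.extractOne(entity, choices)[0])
    cleaned_articles.append(article_entities)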
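One thing to watch in the attempt above: every entity is itself a member of choices, so the best match tends to be the entity itself and near-duplicate spellings may survive unchanged. Below is a minimal sketch of one possible variation, not the gist author's method. It matches each entity only against spellings already accepted as canonical. To stay runnable without third-party packages it swaps in the standard library's difflib.get_close_matches in place of fuzzywuzzy. The function name canonicalize, the 0.8 cutoff, and the sample data are all hypothetical.

```python
from difflib import get_close_matches


def canonicalize(articles, cutoff=0.8):
    """Map each entity to the first previously seen spelling that is similar enough."""
    canonical = []  # accepted spellings, in first-seen order
    cleaned_articles = []
    for article in articles:
        article_entities = []
        # dict.fromkeys drops exact duplicates while keeping the original order.
        for entity in dict.fromkeys(article):
            match = get_close_matches(entity, canonical, n=1, cutoff=cutoff)
            if match:
                # A close enough spelling was seen earlier: reuse it.
                article_entities.append(match[0])
            else:
                # Nothing similar yet: this spelling becomes canonical.
                canonical.append(entity)
                article_entities.append(entity)
        cleaned_articles.append(article_entities)
    return cleaned_articles


# Hypothetical sample data: one misspelling and one exact duplicate.
articles = [
    ["Barack Obama", "White House"],
    ["Barak Obama", "Congress", "Congress"],
]
cleaned = canonicalize(articles)
print(cleaned)
```

The result depends on processing order, since the first spelling encountered wins. The cutoff is a tuning knob: too low merges distinct entities, too high leaves variants unmerged.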