@cydal
Created March 17, 2021 22:18
import pandas as pd
import numpy as np
import nltk
import string
from cleantext import clean
nltk.download('stopwords')
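
# `keywords` (group -> words) and `keycount` (group -> word -> count) are not
# defined in this snippet and must exist before clean_text is called.
def clean_text(text):
    # clean() returns the cleaned string, so keep its result.
    text = clean(text, all=False, extra_spaces=True, lowercase=True, numbers=True, punct=True)
    # Tally each word that belongs to one of the keyword groups.
    for word in text.split():
        for eachkey in keywords:
            if word in keywords[eachkey]:
                keycount[eachkey][word] += 1
    return text

# Write back to the column so core_df remains a DataFrame.
core_df["abstract"] = core_df["abstract"].apply(clean_text)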
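
The snippet above depends on `keywords`, `keycount`, and `core_df`, none of which it defines. The following is a minimal, standard-library-only sketch of the same counting idea, under stated assumptions: the keyword groups and sample abstracts are hypothetical, `simple_clean` is a rough stand-in for `cleantext.clean` (not its actual implementation), and a plain list replaces the DataFrame column.

```python
import string
from collections import Counter

# Hypothetical keyword groups, for illustration only (not from the gist).
keywords = {
    "methods": {"regression", "clustering"},
    "data": {"dataset", "corpus"},
}
# One Counter per group, so unseen words start at zero.
keycount = {group: Counter() for group in keywords}

# Stand-in for cleantext.clean: strip punctuation and digits.
_DROP = str.maketrans("", "", string.punctuation + string.digits)

def simple_clean(text):
    # Lowercase, drop punctuation/digits, collapse repeated whitespace.
    return " ".join(text.lower().translate(_DROP).split())

def clean_text(text):
    cleaned = simple_clean(text)
    # Count every cleaned word that belongs to a keyword group.
    for word in cleaned.split():
        for group, words in keywords.items():
            if word in words:
                keycount[group][word] += 1
    return cleaned

# Hypothetical abstracts standing in for core_df["abstract"].
abstracts = [
    "Clustering 3 datasets; regression on a corpus.",
    "Regression, regression, and a new DATASET!",
]
cleaned_abstracts = [clean_text(a) for a in abstracts]
```

Matching here is exact, so a plural such as "datasets" is not counted under "dataset"; stemming or lemmatizing before the lookup would be one way to change that.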