@abhishek-shrm
Created October 4, 2020 14:32
import re

# Function for pre-processing keywords
def preprocess_keys(text):
    # Case normalization
    text = text.lower()
    # Collapse each run of whitespace and/or hyphens into a single space
    text = re.sub(r'(\s|-)+', ' ', text)
    # Remove any leading and trailing spaces
    text = text.strip()
    return text

# Pre-processing keywords
# (keys is defined elsewhere; it is expected to map each ID to a list of keyword strings)
for i in keys:
    keys[i] = [preprocess_keys(key) for key in keys[i]]

# Printing some sample keywords
dict(list(keys.items())[:4])
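
The snippet above depends on a `keys` dictionary built elsewhere. As a minimal, self-contained sketch of what the pre-processing does, the example below runs the same function on a small made-up dictionary. The name `sample_keys` and its contents are hypothetical and are not taken from the original data.

```python
import re

def preprocess_keys(text):
    # Same steps as above: lowercase, collapse whitespace/hyphen runs, strip
    text = text.lower()
    text = re.sub(r'(\s|-)+', ' ', text)
    return text.strip()

# Hypothetical sample data: ID -> list of raw keyword strings
sample_keys = {
    1: ["  Machine-Learning  ", "Deep   - Neural--Nets"],
    2: ["Text\tMining"],
}

# Build a cleaned copy instead of modifying the dictionary in place
cleaned = {i: [preprocess_keys(k) for k in kws] for i, kws in sample_keys.items()}
print(cleaned)
```

Building a new dictionary with a comprehension leaves the raw keywords untouched, which can be handy when comparing the text before and after cleaning.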