@hamletbatista
Created February 27, 2019 22:06
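import re
from collections import Counter
from nltk.corpus import stopwords  # needs a one-time nltk.download('stopwords')

# Count the most frequent words in URL paths; df is assumed to be a pandas DataFrame with a 'path' column
cnt = Counter()
english_stopwords = set(stopwords.words('english'))
for path in df.path:
    # Split each path into tokens on hyphens and slashes
    words = re.split("[-/]", path)
    for word in words:
        # Skip empty tokens, stopwords, and purely numeric tokens
        if len(word) > 0 and word not in english_stopwords and not word.isdigit():
            cnt[word] += 1
cnt.most_common(25)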
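
The snippet above depends on a `df` that is defined elsewhere and on the NLTK stopword corpus. As a minimal sketch for trying the same counting logic in isolation, the version below swaps in a small hand-written list of paths and a tiny stopword set. The names `sample_paths`, `sample_stopwords`, and `count_path_words` are hypothetical and not part of the gist.

```python
import re
from collections import Counter

# Hypothetical sample paths standing in for df.path
sample_paths = [
    "/blog/how-to-bake-bread-2019/",
    "/blog/bread-recipes/",
    "/shop/bread-makers/123/",
]

# Tiny stand-in stopword set (the gist uses NLTK's English stopwords instead)
sample_stopwords = {"how", "to", "the", "a", "of"}


def count_path_words(paths, stop_words):
    """Count tokens in URL paths, skipping empty, stopword, and numeric tokens."""
    counts = Counter()
    for path in paths:
        # Split on hyphens and slashes, as in the gist
        for word in re.split("[-/]", path):
            if word and word not in stop_words and not word.isdigit():
                counts[word] += 1
    return counts


counts = count_path_words(sample_paths, sample_stopwords)
print(counts.most_common(3))
```

With real data, passing `df.path` and `set(stopwords.words('english'))` to the function should reproduce the gist's behavior, assuming the paths are lowercase, since the stopword comparison is case-sensitive.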