@HeenaR17
Created October 25, 2020 19:25
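import re
from nltk.corpus import stopwords
from nltk.stem.porter import PorterStemmer

# Assumes `news` is a pandas DataFrame with a `title` column and a default
# 0..n-1 index, and that nltk.download('stopwords') has already been run.
ps = PorterStemmer()
stop_words = set(stopwords.words('english'))  # build once, not once per word
corpus = []
for i in range(news.shape[0]):
    # Replace every non-letter with a space, then lowercase and tokenize.
    title = re.sub(pattern='[^a-zA-Z]', repl=' ', string=news.title[i])
    words = title.lower().split()
    # Drop English stopwords, then stem the remaining words.
    words = [ps.stem(word) for word in words if word not in stop_words]
    corpus.append(' '.join(words))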