@AyishaR
Created January 30, 2021 17:40
import nltk

# Assumes df, class_names, removeURL, removeHTML and onlyAlphabets are defined earlier.
sno = nltk.stem.SnowballStemmer('english')  # Initializing stemmer
wordcloud = [[], [], [], [], [], [], []]  # One word list per category, indexed like class_names
all_sentences = []  # All cleaned sentences

for x in range(len(df['Questions'].values)):
    question = df['Questions'].values[x]
    classname = df['Category0'].values[x]
    cleaned_sentence = []
    # Strip URLs, HTML tags and non-alphabetic characters, then lowercase
    sentence = removeURL(question)
    sentence = removeHTML(sentence)
    sentence = onlyAlphabets(sentence)
    sentence = sentence.lower()
    for word in sentence.split():
        # if word not in stop:  (stop-word filtering is disabled)
        stemmed = sno.stem(word)
        cleaned_sentence.append(stemmed)
        # The word cloud keeps the unstemmed word, grouped by category
        wordcloud[class_names.index(classname)].append(word)
    all_sentences.append(' '.join(cleaned_sentence))

# add as column in dataframe
X = all_sentences
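
The snippet above calls removeURL, removeHTML and onlyAlphabets without defining them. The gist does not show how they are implemented, so the versions below are only a minimal regex-based sketch of what such helpers might look like; the patterns are assumptions, not the author's code.

```python
import re

# Hypothetical helper implementations; the original gist does not include them.

def removeURL(text):
    """Replace http(s):// and www. links with a space."""
    # Stop at whitespace, quotes and angle brackets so surrounding HTML stays intact.
    return re.sub(r'(https?://|www\.)[^\s<>"\']+', ' ', text)

def removeHTML(text):
    """Replace HTML tags such as <p> or <br/> with a space."""
    return re.sub(r'<[^>]+>', ' ', text)

def onlyAlphabets(text):
    """Replace every run of non-ASCII-letter characters with a single space."""
    return re.sub(r'[^a-zA-Z]+', ' ', text)

def clean(text):
    """Apply the helpers in the same order as the loop above and lowercase."""
    return onlyAlphabets(removeHTML(removeURL(text))).lower()
```

Applied in the order used in the loop, these helpers reduce a question to lowercase alphabetic tokens that can be split on whitespace and passed to the stemmer.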