@mlai-demo
Last active January 19, 2020 22:23
Vectorize text files, then create a sparse document-term matrix, a dense NumPy array, and a vocabulary array
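import numpy as np
import sklearn.feature_extraction.text as text

# Assumes texts (a list of file paths), my_stop_words, and text_number are defined earlier.
# input='filename' makes the vectorizer read each path in texts from disk.
vectorizer = text.CountVectorizer(input='filename', stop_words=my_stop_words, min_df=text_number)
tm_sparse = vectorizer.fit_transform(texts)  # sparse document-term matrix
tm_array = tm_sparse.toarray()  # dense NumPy array; reuses the fit instead of fitting twice
# get_feature_names() was removed in scikit-learn 1.2; on versions before 1.0 use get_feature_names().
vocab = np.array(vectorizer.get_feature_names_out())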
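To make the shape of the result concrete, here is a minimal pure-Python sketch of what count vectorization with stop words and a minimum document frequency produces. It is a simplified illustration, not scikit-learn's implementation: the helper `count_matrix` and the sample documents are hypothetical, it works on in-memory strings rather than file names, and it returns a dense list of lists instead of a sparse matrix.

```python
import re
from collections import Counter


def count_matrix(docs, stop_words=(), min_df=1):
    """Build a dense document-term count matrix from in-memory strings.

    Hypothetical helper for illustration only; it mimics lowercasing,
    stop-word removal, and a minimum document frequency cutoff.
    """
    stop = set(stop_words)
    # Lowercase each document and keep tokens of two or more word characters.
    tokenized = [
        [tok for tok in re.findall(r"\b\w\w+\b", doc.lower()) if tok not in stop]
        for doc in docs
    ]
    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Keep terms that appear in at least min_df documents, sorted alphabetically.
    vocab = sorted(term for term, df in doc_freq.items() if df >= min_df)
    index = {term: col for col, term in enumerate(vocab)}
    matrix = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok in toks:
            if tok in index:
                row[index[tok]] += 1
        matrix.append(row)
    return matrix, vocab


# Made-up sample data: one row per document, one column per vocabulary term.
sample_docs = ["the cat sat on the mat", "the dog sat", "a cat and a dog"]
matrix, vocab = count_matrix(sample_docs, stop_words=["the", "on", "and"], min_df=2)
print(vocab)
for row in matrix:
    print(row)
```

Each row corresponds to one document and each column to one vocabulary term, which is the same layout as tm_array and vocab in the snippet above. Terms that appear in fewer than min_df documents ("mat" in this sample) are dropped from the vocabulary.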