@ashokc
Created January 26, 2019 18:04
Tf-Idf vectors from tokens
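# Build Tf-Idf vectors from pre-tokenized documents
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# X: list of documents, each already split into a list of tokens.
# dtype=object keeps rows of unequal length valid on recent NumPy versions.
X = np.array([np.array(xi) for xi in X], dtype=object)  # rows: docs
# The identity analyzer tells the vectorizer the input is already tokenized.
vectorizer = TfidfVectorizer(analyzer=lambda x: x, min_df=1).fit(X)
word_index = vectorizer.vocabulary_  # token -> column index
Xencoded = vectorizer.transform(X)  # sparse matrix; rows: docs, columns: words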
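
To make the numbers in the encoded matrix less opaque, here is a minimal pure-Python sketch of the weighting that `TfidfVectorizer` applies with its default settings, as far as I understand them: raw term counts, smoothed idf `ln((1 + n) / (1 + df)) + 1`, and L2 normalization of each row. The function name `tfidf_from_tokens` and the toy corpus are hypothetical, introduced only for illustration. It returns dense lists rather than a sparse matrix, so it is a way to check small cases by hand, not a replacement for the vectorizer.

```python
import math
from collections import Counter


def tfidf_from_tokens(docs):
    """Dense tf-idf rows for pre-tokenized docs (smoothed idf, L2-normalized)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Sorted vocabulary mapped to column indices (assumption: mirrors vocabulary_).
    word_index = {tok: i for i, tok in enumerate(sorted(df))}
    # Smoothed idf, as if one extra document contained every token.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in df}

    rows = []
    for doc in docs:
        row = [0.0] * len(word_index)
        for tok, count in Counter(doc).items():
            row[word_index[tok]] = count * idf[tok]
        # Scale each document vector to unit Euclidean length.
        norm = math.sqrt(sum(v * v for v in row))
        rows.append([v / norm for v in row] if norm > 0 else row)
    return word_index, rows


# Hypothetical toy corpus, already tokenized.
docs = [["cat", "sat", "mat"], ["dog", "sat", "log", "sat"]]
word_index, rows = tfidf_from_tokens(docs)
```

In this toy corpus "sat" appears in both documents, so within the first document it gets a lower weight than "cat" or "mat", which appear in only one.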