@Navjotbians
Last active May 22, 2021 20:03
Text to vector conversion
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer


def get_embeddings(X_train, X_test, max_feature=1000, embedding_type="tfidf"):
    """Convert text to bag-of-words ("bow") or TF-IDF ("tfidf") feature vectors.

    Returns (train_feat, test_feat, vectorizer); the features are dense arrays.
    """
    if embedding_type == "bow":
        vectorizer = CountVectorizer(max_features=max_feature)
    elif embedding_type == "tfidf":
        vectorizer = TfidfVectorizer(max_features=max_feature)
    else:
        raise ValueError(f"Unknown embedding_type: {embedding_type!r}")
    # Fit the vocabulary on the training text only, then reuse it for the test text.
    train_feat = vectorizer.fit_transform(X_train).toarray()
    test_feat = vectorizer.transform(X_test).toarray()
    return train_feat, test_feat, vectorizer
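
A minimal sketch of the fit-on-train, transform-on-test pattern that the tfidf branch follows, assuming scikit-learn is installed. The toy sentences and the names `train_texts` and `test_texts` are made up for illustration and are not part of the gist.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical toy corpus, used only for illustration.
train_texts = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]
test_texts = ["the bird sat on the mat"]

vectorizer = TfidfVectorizer(max_features=1000)

# Learn the vocabulary and IDF weights from the training text only.
train_feat = vectorizer.fit_transform(train_texts).toarray()

# Reuse the fitted vocabulary for the test text; words not seen during
# fitting (here "bird") are ignored rather than added as new columns.
test_feat = vectorizer.transform(test_texts).toarray()

print(train_feat.shape, test_feat.shape)
print(sorted(vectorizer.vocabulary_))
```

Because the vectorizer is fitted once and then reused, both arrays have the same number of columns, so a model trained on `train_feat` can score `test_feat` directly.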