@amankharwal
Created Apr 1, 2021
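from sklearn.feature_extraction import text
from sklearn.metrics.pairwise import cosine_similarity

# `data` is assumed to be a DataFrame loaded earlier, with a "transcript" column.
ted_talks = data["transcript"].tolist()

# Unigram + bigram TF-IDF. Documents are passed to fit_transform();
# `input` keeps its default ("content"), since it does not accept a list.
bi_tfidf = text.TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
bi_matrix = bi_tfidf.fit_transform(ted_talks)

# Unigram-only TF-IDF for comparison.
uni_tfidf = text.TfidfVectorizer(stop_words="english")
uni_matrix = uni_tfidf.fit_transform(ted_talks)

# Pairwise cosine similarity between all talks (n_talks x n_talks).
bi_sim = cosine_similarity(bi_matrix)
uni_sim = cosine_similarity(uni_matrix)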
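
One way a similarity matrix like `bi_sim` could be used is to look up the talks closest to a given one. The sketch below is only an illustration under assumptions: the four short strings stand in for the real transcripts, and `most_similar` is a hypothetical helper, not part of the gist or of scikit-learn.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in for data["transcript"].tolist()
talks = [
    "machine learning models learn patterns from data",
    "deep learning models learn patterns from large data",
    "the ocean is home to whales and coral reefs",
    "coral reefs and whales need a healthy ocean",
]


def most_similar(sim_matrix, index, top_n=1):
    """Return indices of the top_n rows most similar to `index`, excluding itself."""
    scores = sim_matrix[index].copy()
    scores[index] = -1.0  # mask the talk's similarity with itself
    # argsort is ascending, so reverse it to get the highest scores first
    return np.argsort(scores)[::-1][:top_n].tolist()


# Same vectorizer settings as the bigram variant above
vectorizer = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
sim = cosine_similarity(vectorizer.fit_transform(talks))

print(most_similar(sim, 0))
print(most_similar(sim, 2))
```

With the real data, the same helper could be called as `most_similar(bi_sim, i, top_n=5)` or with `uni_sim`, to compare how the bigram and unigram features rank neighbours.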