Created
August 7, 2021 16:11
Creating TF-IDF in Gensim
import numpy as np
from gensim import corpora, models
from gensim.utils import simple_preprocess

text = ["The food is excellent but the service can be better",
        "The food is always delicious and loved the service",
        "The food was mediocre and the service was terrible"]

# Map each token to an integer id, then build bag-of-words vectors.
g_dict = corpora.Dictionary([simple_preprocess(line) for line in text])
g_bow = [g_dict.doc2bow(simple_preprocess(line)) for line in text]

print("Dictionary : ")
for item in g_bow:
    print([[g_dict[token_id], freq] for token_id, freq in item])

# smartirs='ntc': raw term frequency, idf weighting, cosine normalization.
g_tfidf = models.TfidfModel(g_bow, smartirs='ntc')

print("TF-IDF Vector:")
for item in g_tfidf[g_bow]:
    print([[g_dict[token_id], np.around(weight, decimals=2)] for token_id, weight in item])
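
To see what the 'ntc' weighting is doing, the same scores can be approximated without Gensim. The following is a minimal sketch, not Gensim's implementation. It assumes 'ntc' means raw term frequency, idf computed as log2(N / df), and cosine (L2) normalization, and that terms with zero weight are dropped. The names `docs`, `tokenize` and `tfidf_ntc` are hypothetical helpers introduced only for this illustration, and `tokenize` is a rough stand-in for `simple_preprocess`.

```python
import math
from collections import Counter

docs = [
    "The food is excellent but the service can be better",
    "The food is always delicious and loved the service",
    "The food was mediocre and the service was terrible",
]


def tokenize(line):
    # Rough stand-in for simple_preprocess: lowercase, split on whitespace.
    return line.lower().split()


def tfidf_ntc(documents):
    """Return one {term: weight} dict per document (assumed 'ntc' scheme)."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # n: raw term frequency, t: idf = log2(N / df).
        weights = {t: c * math.log2(n_docs / df[t]) for t, c in tf.items()}
        # Terms present in every document get idf 0; drop them (assumption).
        weights = {t: w for t, w in weights.items() if w > 0}
        # c: cosine normalization, so each vector has unit length.
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()} if norm else {})
    return vectors


vectors = tfidf_ntc(docs)
for vec in vectors:
    print({t: round(w, 2) for t, w in vec.items()})
```

Under these assumptions, words shared by all three reviews ("the", "food", "service") carry no weight, while words unique to one review dominate its vector.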