Cristhian Boujon CristhianBoujon

CristhianBoujon / get_top_n_words.py
Last active May 20, 2021 09:14
List the words in a vocabulary by number of occurrences in a text corpus, using Scikit-Learn
def get_top_n_words(corpus, n=None):
    """
    List the top n words in a vocabulary according to occurrence in a text corpus.
    get_top_n_words(["I love Python", "Python is a language programming", "Hello world", "I love the world"]) ->
    [('python', 2),
     ('world', 2),
     ('love', 2),
     ('hello', 1),
     ('is', 1),