Solution: Select top nouns
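# Volume-wide token counts (pages=False folds the per-page counts together)
tl = vol.tokenlist(pages=False)
# Keep only rows whose part-of-speech tag is NN (singular noun) or NNS (plural noun)
just_nouns = tl.loc[(slice(None), slice(None), ["NN", "NNS"]),]
# Most frequent nouns first
top_nouns = just_nouns.sort_values('count', ascending=False)
top_nouns.head(5)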
# OUTPUT:
#                      count
# section token  pos
# body    doctor NN       83
#         time   NN       80
#         day    NN       73
#         eyes   NNS      61
#         way    NN       57
# NOTE
# Because each step returns a DataFrame, it is possible to `chain` methods.
# Though inadvisably dense in this case, the solution above can also be written like this:
vol.tokenlist(pages=False).loc[(slice(None), slice(None), ["NN", "NNS"]),].sort_values('count', ascending=False).head(5)
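
The nested `slice(None)` tuple in the solution can be hard to read. As a rough sketch, the same selection can be tried on a small stand-in DataFrame without loading a volume. The frame below is hypothetical: it only mimics the (section, token, pos) index and `count` column shown in the output, and the non-noun rows are invented for illustration. It uses `pd.IndexSlice` and a boolean mask as two alternative spellings of the noun filter.

```python
import pandas as pd

# Hypothetical stand-in for vol.tokenlist(pages=False); not real volume data.
rows = [
    ("body", "doctor", "NN", 83),
    ("body", "eyes", "NNS", 61),
    ("body", "said", "VBD", 120),  # invented non-noun row
    ("body", "the", "DT", 400),    # invented non-noun row
    ("body", "time", "NN", 80),
]
tl = pd.DataFrame(rows, columns=["section", "token", "pos", "count"])
# MultiIndex slicing with .loc expects a sorted index
tl = tl.set_index(["section", "token", "pos"]).sort_index()

# Spelling 1: pd.IndexSlice instead of a tuple of slice(None) objects
idx = pd.IndexSlice
just_nouns = tl.loc[idx[:, :, ["NN", "NNS"]], :]
top_nouns = just_nouns.sort_values("count", ascending=False)

# Spelling 2: a boolean mask built from the 'pos' index level
is_noun = tl.index.get_level_values("pos").isin(["NN", "NNS"])
top_nouns_mask = tl[is_noun].sort_values("count", ascending=False)

print(top_nouns.head(5))
```

Both spellings keep the same rows; which one reads better is largely a matter of taste, and the mask version does not depend on the index being sorted.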