@purva91
Last active Aug 24, 2020
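import nltk
import numpy as np
from nltk.tokenize import word_tokenize

# word_tokenize relies on the Punkt tokenizer models
# (newer NLTK releases may ask for 'punkt_tab' instead)
nltk.download('punkt')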
sentences = ["I ate dinner.",
             "We had a three-course meal.",
             "Brad came to dinner with us.",
             "He loves fish tacos.",
             "In the end, we all felt like we ate too much.",
             "We all agreed; it was a magnificent evening."]
# Tokenization of each document
tokenized_sent = []
for s in sentences:
    tokenized_sent.append(word_tokenize(s.lower()))
tokenized_sent
def cosine(u, v):
    # Cosine similarity of two vectors; undefined (division by zero) if either is all zeros
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
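The snippet defines `cosine` but never calls it. Below is a minimal sketch of one way it could be applied, assuming simple bag-of-words count vectors; the gist does not say which vectorization it intends. The names `docs`, `vocab`, `index`, and `bow_vector` are hypothetical, and `docs` hard-codes the tokens that `word_tokenize` produces for the first and third sentences so the sketch runs without NLTK.

```python
import numpy as np

def cosine(u, v):
    # Same definition as in the snippet above
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Hypothetical stand-ins for two entries of tokenized_sent
docs = [["i", "ate", "dinner", "."],
        ["brad", "came", "to", "dinner", "with", "us", "."]]

# Shared vocabulary: one vector dimension per distinct token
vocab = sorted({tok for doc in docs for tok in doc})
index = {tok: i for i, tok in enumerate(vocab)}

def bow_vector(tokens):
    # Count how often each vocabulary token occurs in the document
    vec = np.zeros(len(vocab))
    for tok in tokens:
        vec[index[tok]] += 1
    return vec

vectors = [bow_vector(d) for d in docs]
similarity = cosine(vectors[0], vectors[1])
print(similarity)
```

The two sentences share only "dinner" and the final period, so the score is well below 1. Dropping punctuation tokens before counting would likely give a more meaningful comparison.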