@purva91
Last active August 24, 2020 16:37
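import nltk
import numpy as np
from nltk.tokenize import word_tokenize

# Fetch the Punkt tokenizer models that word_tokenize relies on
# (recent NLTK releases may ask for 'punkt_tab' instead).
nltk.download('punkt')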
sentences = ["I ate dinner.",
             "We had a three-course meal.",
             "Brad came to dinner with us.",
             "He loves fish tacos.",
             "In the end, we all felt like we ate too much.",
             "We all agreed; it was a magnificent evening."]
# Tokenize each sentence into lowercase word tokens
tokenized_sent = []
for s in sentences:
    tokenized_sent.append(word_tokenize(s.lower()))
print(tokenized_sent)
def cosine(u, v):
    """Return the cosine similarity of vectors u and v."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
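
# The snippet defines cosine() but never calls it. The block below is a minimal
# sketch of one possible use, not necessarily what the author had in mind: it
# assumes simple bag-of-words count vectors, and bow_vector, docs, vocab and
# scores are hypothetical names introduced only for illustration. The token
# lists are hand-written stand-ins for word_tokenize output so that the sketch
# runs without downloading NLTK data, and cosine() is repeated so the block is
# self-contained.

```python
import numpy as np


def cosine(u, v):
    """Return the cosine similarity of vectors u and v."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))


def bow_vector(tokens, vocab):
    """Hypothetical helper: count how often each vocabulary word occurs in tokens."""
    return np.array([tokens.count(w) for w in vocab], dtype=float)


# Assumed stand-in for tokenized_sent (lowercased tokens, punctuation kept)
docs = [["i", "ate", "dinner", "."],
        ["brad", "came", "to", "dinner", "with", "us", "."],
        ["he", "loves", "fish", "tacos", "."]]

# Shared vocabulary across all documents, in a fixed (sorted) order
vocab = sorted({w for d in docs for w in d})
vectors = [bow_vector(d, vocab) for d in docs]

# Compare the first sentence against every sentence, itself included
scores = [cosine(vectors[0], v) for v in vectors]
for d, s in zip(docs, scores):
    print(" ".join(d), "->", round(float(s), 3))
```

# With count vectors every score falls between 0 and 1, and a sentence compared
# with itself scores 1. Sentences that share more tokens with the query score
# higher; dense embeddings could be swapped in for bow_vector without changing
# cosine().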