@varunchitale
Created October 11, 2018 08:07
Python + spaCy to generate and store key phrases
import psycopg2
from collections import Counter
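# Upsert: insert a new n-gram, or add to the stored frequency when it already exists.
# Replace <schema_name> with the target schema; `ngram` needs a unique constraint for ON CONFLICT.
query_populate = """
INSERT INTO <schema_name>.prompts (ngram, freq)
VALUES (%(ngram)s, %(freq)s)
ON CONFLICT (ngram) DO UPDATE
SET freq = prompts.freq + EXCLUDED.freq
"""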
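# `prompts` is assumed to be a list of noun-chunk strings already extracted with spaCy,
# and `cursor` an open psycopg2 cursor; neither is defined in this snippet.
prompts_count = Counter(prompts)
# 100 documents are taken together and passed to spaCy.
# Keep the text_size // 5 = 20 most frequent noun chunks.
text_size = 100
for ngram, freq in prompts_count.most_common(text_size // 5):
    # Run the query for this n-gram with its count in the current batch.
    data = {'ngram': ngram, 'freq': freq}
    cursor.execute(query_populate, data)
# Commit on the connection that owns `cursor` afterwards so the rows are persisted.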
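
The snippet above needs a live PostgreSQL connection, so here is a rough, self-contained sketch of the same count-then-upsert idea that runs without one. It swaps PostgreSQL for an in-memory SQLite database from the standard library (SQLite 3.24 or newer is needed for `ON CONFLICT ... DO UPDATE`). The `store_top_chunks` helper, the table definition, and the sample noun chunks are all made up for illustration and are not part of the original gist.

```python
import sqlite3
from collections import Counter

# Assumption: an in-memory SQLite table stands in for <schema_name>.prompts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prompts (ngram TEXT PRIMARY KEY, freq INTEGER NOT NULL)")

# Same upsert idea as query_populate, in SQLite syntax
# (named :params instead of psycopg2's %(name)s placeholders).
upsert = """
INSERT INTO prompts (ngram, freq)
VALUES (:ngram, :freq)
ON CONFLICT (ngram) DO UPDATE
SET freq = freq + excluded.freq
"""

def store_top_chunks(chunks, top_n):
    """Hypothetical helper: count the chunks and upsert the top_n most frequent."""
    counts = Counter(chunks)
    rows = [{"ngram": n, "freq": f} for n, f in counts.most_common(top_n)]
    conn.executemany(upsert, rows)
    conn.commit()

# Two made-up batches of noun chunks (illustrative data only).
store_top_chunks(["neural network", "neural network", "loss function"], top_n=2)
store_top_chunks(["neural network", "learning rate", "learning rate"], top_n=2)

# Read the accumulated frequencies back as {ngram: freq}.
result = dict(conn.execute("SELECT ngram, freq FROM prompts"))
print(result)
```

Because "neural network" appears in both batches, its stored frequency accumulates to 3 rather than being overwritten, which is the behaviour the `ON CONFLICT` clause is there to provide.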