('en', 9.061840057373047)
('en', -645.5460562705994)
('en', 9.061840057373047)
('en', -8731.488015413284)
('en', -10009.011488676071)
('en', -1105.587391614914)
('en', -13068.071085691452)
('en', -5555.750634670258)
('en', -2213.5811071395874)
('en', -12848.46113562584)
#### Informative URLs/Articles ####
{u'num_results': 1,
u'results': [{u'classification_probs': {u'comparison': 0.3269718612530041,
u'information': 0.5433481579959143,
u'local': 0.015455451127369114,
u'purchase': 0.11422452962371252},
u'overall_classification': u'information',
u'url': u'http://whatis.techtarget.com/definition/long-tail-keywords'}]}
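In the output above, `overall_classification` matches the label with the highest value in `classification_probs`. The classifier's actual decision rule is not shown, so the sketch below assumes a plain argmax; `overall_classification` as a function name is hypothetical, and the code is Python 3 rather than the Python 2 of the original output.

```python
def overall_classification(probs):
    """Return the label whose probability is largest (assumed argmax rule)."""
    return max(probs, key=probs.get)


# Probabilities copied from the result shown above.
classification_probs = {
    "comparison": 0.3269718612530041,
    "information": 0.5433481579959143,
    "local": 0.015455451127369114,
    "purchase": 0.11422452962371252,
}

label = overall_classification(classification_probs)
print(label)
```

If two labels ever tied, `max` would return whichever comes first in the dict, which may or may not be what the real service does.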
Potential feeds:
http://feeds.reuters.com/reuters/companyNews
http://feeds.reuters.com/reuters/businessNews
http://www.economist.com/feeds/print-sections/75/europe.xml
http://feeds.nytimes.com/nyt/rss/Business
http://feeds.bbci.co.uk/news/business/rss.xml
http://www.telegraph.co.uk/finance/rssfeeds/ (full list of potential feeds)
In [39]: intent_counts['purchase'].most_common(20)
Out[39]:
[(u'www.amazon.com', 418),
(u'www.walmart.com', 222),
(u'www.target.com', 85),
(u'www.google.com', 31),
(u'www.walgreens.com', 31),
(u'www.ebay.com', 26),
(u'www.samsclub.com', 18),
(u'www.mccormick.com', 16),
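The session above calls `most_common(20)` on `intent_counts['purchase']`, which suggests a mapping from intent label to a `collections.Counter` of host names. How `intent_counts` was actually built is not shown; the following is a minimal Python 3 sketch under that assumption. The function name `count_domains_by_intent` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from urllib.parse import urlparse


def count_domains_by_intent(records):
    """Tally host names per intent label.

    `records` is an iterable of (intent, url) pairs (an assumed input shape).
    """
    intent_counts = defaultdict(Counter)
    for intent, url in records:
        host = urlparse(url).netloc  # e.g. "www.amazon.com"
        intent_counts[intent][host] += 1
    return intent_counts


# Hypothetical sample data, for illustration only.
records = [
    ("purchase", "http://www.amazon.com/item/1"),
    ("purchase", "http://www.amazon.com/item/2"),
    ("purchase", "http://www.walmart.com/item/3"),
    ("information", "http://whatis.techtarget.com/definition/long-tail-keywords"),
]

intent_counts = count_domains_by_intent(records)
top_purchase = intent_counts["purchase"].most_common(20)
```

`most_common(n)` returns at most `n` `(host, count)` pairs sorted by descending count, which is the shape of the `Out[39]` list above.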
@soeffing
soeffing / Stoplist
Created February 20, 2017 16:12
Stoplist
#stop word list from SMART (Salton,1971). Available at ftp://ftp.cs.cornell.edu/pub/smart/english.stop
a
a's
able
about
above
according
accordingly
across
actually
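A stop list in this format (one word per line, with a leading `#` comment line) can be read into a set for fast membership tests. This is a minimal sketch, not the loader used by the gist; `load_stop_words` is a hypothetical name, and the sample lines are the first few entries shown above.

```python
def load_stop_words(lines):
    """Build a set of stop words, skipping blank lines and '#' comments."""
    stop_words = set()
    for line in lines:
        word = line.strip().lower()
        if word and not word.startswith("#"):
            stop_words.add(word)
    return stop_words


# First entries of the list above; a real run would iterate over the file.
sample_lines = [
    "#stop word list from SMART (Salton,1971).",
    "a",
    "a's",
    "able",
    "about",
    "",
]

stop_words = load_stop_words(sample_lines)
```

To load from disk, pass an open file object instead of the list, for example `load_stop_words(open(path))`, where `path` points at a local copy of the stop list.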
soeffing / rake-pos.py
Created February 20, 2017 16:08
POS RAKE - RAKE keyword extraction with POS tagging
# Implementation of RAKE - Rapid Automatic Keyword Extraction algorithm
# as described in:
# Rose, S., D. Engel, N. Cramer, and W. Cowley (2010).
# Automatic keyword extraction from individual documents.
# In M. W. Berry and J. Kogan (Eds.), Text Mining: Applications and Theory. John Wiley and Sons, Ltd.
# Modified RAKE for filtering out verbs, adverbs, etc. from returned keywords
import re
import operator
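For orientation, the core RAKE scoring from Rose et al. can be sketched as follows: split the text into candidate phrases at stop words and punctuation, score each word as degree divided by frequency, and score each phrase as the sum of its word scores. This is a simplified Python 3 sketch, not the gist's code. It omits the POS-tag filtering that the gist adds, and the function names and tokenizing regexes are assumptions.

```python
import re
from collections import defaultdict


def candidate_phrases(text, stop_words):
    """Split text into lists of words, breaking at punctuation and stop words."""
    phrases = []
    for fragment in re.split(r'[.!?,;:()\n\t"]+', text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in stop_words:
                if current:
                    phrases.append(current)
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_scores(text, stop_words):
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text, stop_words)
    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # co-occurrence count, including the word itself
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


ranked = rake_scores("private cloud strategy and private cloud", {"and"})
```

Longer phrases tend to score higher because every word's degree grows with the length of the phrases it appears in, which is consistent with multi-word keywords ranking at the top of RAKE output.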
soeffing / gist:5a255a13500eff1ad8e5a8b74d55c211
Created February 16, 2017 20:16
Article keyword extraction TM vs POS-RAKE
Keywords extracted by RAKE, allowing verbs:
[(u'aws api compatible private cloud', 15.5),
(u'build cloud aware applications', 12.166666666666666),
(u'meet elastic demand', 9.0),
(u'build private clouds', 8.666666666666666),
(u'private cloud strategy', 8.5),
Keywords extracted by RAKE, not allowing verbs:
import requests
from flask import Flask
from flask import request
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.serp_v2
import requests
import csv
from pymongo import MongoClient
client = MongoClient('localhost', 27017)
db = client.serp_v2
{'learning': {1: {0: {'keywords': ['revenue_models_for_online_learning',
'online_learning',
'online_learning_companies',
'experts_in_online_learning',
'online_learning_services',
'online_learning_service_provider',
'online_learning_solutions',
'online_learning_providers',
'best_online_learning_companies',
'online_learning_for_business',