@glamp glamp/spider_with_sql.py
Last active Dec 16, 2015

import psycopg2
import requests
# replace the placeholder with your own connection string
conn = psycopg2.connect("{YOUR CONNECTION}")
cur = conn.cursor()
q = """
select
'http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=' || name as url
from
restaurants
limit 3;"""
cur.execute(q)
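# fetchall() returns a list of 1-tuples; take the first column to get plain URL strings
urls = [row[0] for row in cur.fetchall()]
print(urls)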
#["http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=ray's pizza",
# 'http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=shake shack',
# 'http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=rubirosa']
# count how many times "pizza" appears in each Yelp search results page
for url in urls:
    html = requests.get(url).text
    print("%s %d" % (url, html.count("pizza")))
#http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=ray's pizza 212
#http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=shake shack 0
#http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc=rubirosa 63
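
The query above concatenates the raw restaurant name onto the search URL, so names with spaces or apostrophes (such as "ray's pizza") end up unencoded. A minimal sketch of building the URL in Python instead, assuming the names are selected on their own; `SEARCH_BASE` and `build_search_url` are hypothetical names introduced here for illustration, not part of the original script:

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2

# Same base URL as in the SQL query above (hypothetical constant name).
SEARCH_BASE = "http://www.yelp.com/search?find_loc=Manhattan%2C+NY&ns=1&find_desc="


def build_search_url(name):
    """Return the search URL for a restaurant name, percent-encoding the name."""
    return SEARCH_BASE + quote_plus(name)


if __name__ == "__main__":
    for name in ["ray's pizza", "shake shack", "rubirosa"]:
        print(build_search_url(name))
```

With this approach the SQL would only need `select name from restaurants limit 3;`, and each name would be passed through `build_search_url` before calling `requests.get`.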