@fadere
Created December 3, 2015 07:21
python code to scrape ebay historical auction results
import csv
import requests
import bs4
import argparse

# parser = argparse.ArgumentParser(description='Process a list of search terms.')
# parser.add_argument('terms', metavar='N', type=str, nargs='+',
#                     help='comma separated list of terms to search for')
# args = parser.parse_args()

# enter multiple phrases separated by commas, e.g. ['nixon autograph', 'ford letter']
phrases = ['nixon autograph']

for phrase in phrases:
    # build the sold-auctions search URL; spaces in the phrase must be URL-encoded
    site = ('http://www.ebay.com/sch/i.html?_from=R40&_nkw=' + phrase.replace(' ', '+')
            + '&_in_kw=1&_ex_kw=&_sacat=0&LH_Sold=1&_udlo=&_udhi=&LH_Auction=1'
              '&_samilow=&_samihi=&_sadis=15&_stpos=90278-4805&_sargn=-1%26saslc%3D1'
              '&_salic=1&_sop=13&_dmd=1&_ipg=200&LH_Complete=1')
    res = requests.get(site)
    res.raise_for_status()
    soup = bs4.BeautifulSoup(res.text, "lxml")
    # grab the date/time stamp of each auction listing
    dte = [e.span.contents[0].split(' ')[0] for e in soup.find_all(class_="tme")]
    # grab the title text of each listing link
    titles = [e.contents[0] for e in soup.find_all(class_="vip")]
    # grab each listing link and store its href destination in a list
    links = [e['href'] for e in soup.find_all(class_="vip")]
    # grab all the bid spans and split their contents to get the number only
    bids = [e.span.contents[0].split(' ')[0] for e in soup.find_all("li", "lvformat")]
    # grab all the sold prices and store them in a list
    prices = [e.contents[0] for e in soup.find_all("span", "bold bidsold")]
    # zip the parallel lists so the entries belonging to each listing stay together
    rows = list(zip(dte, titles, links, prices, bids))
    # write each row to the csv output file (newline='' avoids blank rows on Windows;
    # text mode 'w' replaces the Python 2 'wb' idiom)
    with open('%s.csv' % phrase, 'w', newline='') as csvfile:
        w = csv.writer(csvfile)
        for row in rows:
            w.writerow(row)
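If the same listing appears more than once on a results page (the duplicate-data problem a commenter below also hit), the zipped rows can be de-duplicated before the csv-writing loop. A minimal sketch, assuming rows are `(date, title, link, price, bids)` tuples as built above and that the listing link uniquely identifies an auction; `dedupe_rows` is a hypothetical helper, not part of the original gist:

```python
def dedupe_rows(rows, key_index=2):
    """Drop repeated rows, keeping the first occurrence of each listing link."""
    seen = set()
    unique = []
    for row in rows:
        key = row[key_index]  # index 2 is the href, assumed unique per listing
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Usage: call `rows = dedupe_rows(rows)` right after the `zip`, before writing the csv.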
Ross24 commented Oct 16, 2019

Did you ever run into the issue of the web scraper scraping data twice? I'm currently trying to figure out what is going wrong with my code.
