Proof of Concept task to get ticket prices and event info using StubHub's API with Python
# Quick intro to accessing Stubhub API with Python
# Ozzie Liu (ozzie@ozzieliu.com)
# Related blog post: http://ozzieliu.com/2016/06/21/scraping-ticket-data-with-stubhub-api/
# Updated 3/5/2017 for Python 3 and Stubhub's InventorySearchAPI - v2
import requests
import base64
import json
import pprint
import pandas as pd
import datetime
#### Step 1 - Obtaining StubHub User Access Token ####
## Enter user's API key, secret, and Stubhub login
app_token = input('Enter app token: ')
consumer_key = input('Enter consumer key: ')
consumer_secret = input('Enter consumer secret: ')
stubhub_username = input('Enter Stubhub username (email): ')
stubhub_password = input('Enter Stubhub password: ')
## Generating basic authorization token
combo = consumer_key + ':' + consumer_secret
basic_authorization_token = base64.b64encode(combo.encode('utf-8'))
# print(basic_authorization_token)
## POST parameters for API call
headers = {
    'Content-Type': 'application/x-www-form-urlencoded',
    'Authorization': 'Basic ' + basic_authorization_token.decode('utf-8'),
}
body = {
    'grant_type': 'password',
    'username': stubhub_username,
    'password': stubhub_password,
    'scope': 'PRODUCTION',
}
## Making the call
url = 'https://api.stubhub.com/login'
r = requests.post(url, headers=headers, data=body)
token_response = r.json()
access_token = token_response['access_token']
user_GUID = r.headers['X-StubHub-User-GUID']
#### Step 2 - Searching inventory for an event ####
inventory_url = 'https://api.stubhub.com/search/inventory/v2'
headers['Authorization'] = 'Bearer ' + access_token
headers['Accept'] = 'application/json'
headers['Accept-Encoding'] = 'application/json'
## Enter event ID
eventid = '9837052'
data = {'eventid':eventid}
## GET request and change to Pandas dataframe
inventory = requests.get(inventory_url, headers=headers, params=data)
inv = inventory.json()
listing_df = pd.DataFrame(inv['listing'])
## Getting ticket prices out of the currentPrice column of dicts - assuming they're all in USD
listing_df['amount'] = listing_df.apply(lambda x: x['currentPrice']['amount'], axis=1)
## Export to a csv file
listing_df.to_csv('tickets_listing.csv', index=False)
#### Step 3 - Adding Event and Venue Info ####
## Calling the eventsearch api
info_url = 'https://api.stubhub.com/catalog/events/v2/' + eventid
info = requests.get(info_url, headers=headers)
# pprint.pprint(info.json())
info_dict = info.json()
event_name = info_dict['title']
event_date = info_dict['eventDateLocal'][:10]  # keep only the YYYY-MM-DD part
venue = info_dict['venue']['name']
snapshot_date = datetime.datetime.today().strftime('%m/%d/%Y')
listing_df['snapshotDate'] = snapshot_date
listing_df['eventName'] = event_name
listing_df['eventDate'] = event_date
listing_df['venue'] = venue
my_col = ['snapshotDate', 'eventName', 'eventDate', 'venue', 'sectionName', 'row',
          'seatNumbers', 'quantity', 'deliveryTypeList', 'amount']
final_df = listing_df[my_col]
## Exporting final report
final_df.to_csv('tickets_listing_with_info.csv', index=False)

Cool concept, interested in playing with this. I'm getting stuck here -- any thoughts?

Traceback (most recent call last):
  File "calling_stubhub_api.py", line 50, in <module>
    inv = inventory.json()
  File "/usr/local/lib/python2.7/dist-packages/requests/models.py", line 812, in json
    return complexjson.loads(self.text, **kwargs)
  File "/usr/lib/python2.7/dist-packages/simplejson/__init__.py", line 488, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "/usr/lib/python2.7/dist-packages/simplejson/decoder.py", line 389, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I had a similar problem with https://api.stubhub.com/search/inventory/v1. I got a 404 response from this API.
It seems they updated to https://api.stubhub.com/search/inventory/v2; however, I couldn't figure out how to subscribe to that.
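
The traceback above is what `.json()` raises when the response body is not JSON, which is typically the case for a 404 error page. Here is a minimal sketch of a guard, assuming a hypothetical helper named `decode_json_body` (it is not part of requests or of the StubHub API) that receives the status code and body text of the response:

```python
import json


def decode_json_body(status_code, text):
    """Return the decoded JSON body, or raise a readable error.

    Hypothetical helper: pass it `inventory.status_code` and `inventory.text`
    from the requests call instead of calling `inventory.json()` directly.
    """
    if status_code != 200:
        # Show the status and the start of the body instead of a JSON traceback.
        raise RuntimeError("HTTP %s: %s" % (status_code, text[:200]))
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        raise RuntimeError("Response was not JSON: %r" % text[:200])
```

In the script this would be used as `inv = decode_json_body(inventory.status_code, inventory.text)`, so a 404 shows up as "HTTP 404: ..." rather than as a decoder error.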

gllisch commented Dec 27, 2016

Has anyone figured out the ../search/inventory/v1 issue? I'm getting a 404 as well and can't find a way to return valid data.

Having the same issue with a 404... I think we need to be subscribed to v2, but I can't figure out how.

jeremycolon24 commented Jan 9, 2017

Found this SO thread. Seems like you have to email their API support to get access.

http://stackoverflow.com/questions/40616679/stubhub-api-url-for-inventory-listings

UPDATE:
Emailed API support - they got back to me pretty quickly and subscribed me to their new API.

Owner

ozzieliu commented Mar 5, 2017

Hey folks, glad you found this gist somewhat helpful. As some people have pointed out, Stubhub updated their InventorySearchAPI to v2 and deprecated v1, which my original code was based on.

I just updated the gist using Python 3 and the InventorySearchAPI-v2

If you're having trouble subscribing to InventorySearchAPI-v2, try this link after you've logged in: https://developer.stubhub.com/store/apis/info?name=InventorySearchAPI&version=v2&provider=runiu&category=Search&api=InventorySearchAPI.

It looks like subscribing to "All APIs" currently does not include the new InventorySearch API, so I had to request it manually.

I'll try and monitor this gist more often as well.

cliftonbaker81 commented May 31, 2017

Thanks for this code! I was able to use it fairly successfully to look at a few specific events. However, one thing I've noticed is that it doesn't seem to be picking up all tickets for sale. For instance, on Stubhub, there is a concert scheduled for next week, and the site claims there are about 1,226 tickets available. I ran this routine on this specific concert, and I'm only getting back a quantity of around 836. I updated the 'rows' parameter on line 52 to be something like 5,000, so data shouldn't be getting cut off. I do notice that the listing price in my output tops out at $225 for this particular concert, but with a cursory glance, there are definitely more expensive tickets for sale still. Have you run into this issue?

Edit: I just ran this on another event with over 1,000 tickets for sale, and no matter how many rows I put as a parameter on line 52, my output is still the same at about 250 lines of data. It essentially starts with the lowest priced ticket and just keeps climbing until line 250, and then stops, leaving out any data on higher priced tickets. Any way to get around this and increase the output?

Edit: Sorry, one more thing. I was able to get most/all of the listings by using the 'pricemin' parameter on line 52, which allows me to get higher priced listings. Unfortunately, for now, I think I have to do multiple pulls to get all data on an event. I used the API documentation to find this out.
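
The "multiple pulls" approach can be sketched roughly as below. This is only an illustration under stated assumptions: `fetch_listings(pricemin)` is a hypothetical callable that makes one inventory request with the given `pricemin` and returns `inv['listing']`, results are assumed to come back cheapest first (as observed above), and duplicates from overlapping pulls are dropped by `listingId`.

```python
def pull_all_by_price(fetch_listings, max_pulls=50):
    """Collect listings over several pulls, raising the price floor each time.

    `fetch_listings(pricemin)` is a hypothetical callable that performs one
    inventory request with the given 'pricemin' and returns inv['listing'].
    Assumes each pull returns the cheapest matching listings first.
    """
    seen = {}      # listingId -> listing, so overlapping pulls don't duplicate
    pricemin = 0
    for _ in range(max_pulls):
        page = fetch_listings(pricemin)
        new = [item for item in page if item['listingId'] not in seen]
        if not new:
            break  # nothing new at or above this price: finished (or stuck)
        for item in new:
            seen[item['listingId']] = item
        # Start the next pull at the highest price seen in this one.
        pricemin = max(item['currentPrice']['amount'] for item in page)
    return list(seen.values())
```

If more listings share a single price than one pull can return, this loop stops early at that price, so the count is worth comparing against what the site reports.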

Yes, I had the same issue. It appears to max out at 250 individual listings. Is there a way to get around this and retrieve all tickets at once?

Owner

ozzieliu commented Jun 28, 2017

@cliftonbaker81 @fgscivittaro From what I've seen, their API imposes a maximum of 250 listings per query. One workaround is to make several queries and page through all the listings: look at totalListings, then vary the start, end or rows parameters to get all the listings, 250 at a time.

I've also found that making the same GET query with a browser actually returns all the results, so it might be worth exploring changing the user-agent in requests. Try it at your own risk, of course.
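
As a rough sketch of the paging workaround, the query parameters for each request could be generated up front from `totalListings`. The helper name `page_params` is hypothetical, and it assumes `start` is a zero-based offset and that 250 is the largest `rows` value the API honors:

```python
def page_params(total_listings, rows=250, eventid=''):
    """Build one params dict per request needed to cover all listings.

    Hypothetical helper: assumes 'start' is a zero-based offset into the
    result set and 'rows' is the page size.
    """
    return [
        {'eventid': eventid, 'start': start, 'rows': rows}
        for start in range(0, total_listings, rows)
    ]
```

Each dict would then be passed as `params=` to `requests.get(inventory_url, headers=headers, params=...)`, and the `inv['listing']` lists from all the responses concatenated before building the dataframe.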

@ozzieliu @fgscivittaro: I added a crude while loop that increments the starting point of the listings based on the total number of listings. This worked for me, retrieving all 1955 listings into a single spreadsheet. Your mileage may vary.

import requests
import base64
import json
import pprint
import pandas as pd
import datetime
import logging

logger = logging.getLogger("root")
logging.basicConfig(
    format = "\033[1;36m%(levelname)s: %(filename)s (def %(funcName)s %(lineno)s): \033[1;37m %(message)s",
    level=logging.DEBUG
)

APP_TOKEN = ''

CONSUMER_KEY = ''

CONSUMER_SECRET = ''

STUBHUB_USERNAME = ''

STUBHUB_PASSWORD = ''

# generating basic authorization token
COMBO = CONSUMER_KEY + ':' + CONSUMER_SECRET

BASIC_AUTHORIZATION_TOKEN = base64.b64encode(COMBO.encode('utf-8'))

HEADERS = {
    'Content-Type':'application/x-www-form-urlencoded',
    'Authorization':'Basic '+BASIC_AUTHORIZATION_TOKEN.decode('utf-8'),
}

BODY = {
    'grant_type': 'password',
    'username': STUBHUB_USERNAME,
    'password': STUBHUB_PASSWORD,
    'scope': 'PRODUCTION'
}

DATA = {
    'eventid': '',
    'sort': 'quantity desc',
    'start': 0,
    'rows': 100
}


class StuHubEventSearch(object):

    ## making the call
    url = 'https://api.stubhub.com/login'

    inventory_url = 'https://api.stubhub.com/search/inventory/v2'

    def _init(self, *args, **kwargs):
        r = requests.post(self.url, headers=HEADERS, data=BODY)
        token_response = r.json()
        user_GUID = r.headers['X-StubHub-User-GUID']
        HEADERS['Authorization'] = 'Bearer ' + token_response['access_token']
        HEADERS['Accept'] = 'application/json'
        HEADERS['Accept-Encoding'] = 'application/json'
        HEADERS['User-agent'] = "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/525.19 (KHTML, like Gecko) Chrome/1.0.154.53 Safari/525.19"
        inventory = requests.get(self.inventory_url, headers=HEADERS, params=DATA)
        inv = inventory.json()
        rows = inv['rows']
        start = inv['start']
        total_listings = inv['totalListings']
        list_of_tickets = self.make_request_to_url(self.inventory_url, rows, start, total_listings)
        final_df = self.create_and_return_data_frame(list_of_tickets)
        csv_file_name = "_%s_%s_tickets_listing.csv" % (datetime.datetime.today().strftime('%Y_%m_%d'), total_listings)
        final_df.to_csv(csv_file_name, index=False)


    def make_request_to_url(self, url, rows, start, total):
        all_listings = []
        DATA = {
            'eventid': '103128027',
            'sort': 'quantity desc',
            'start': start,
            'rows': 100
        }
        while DATA['start'] < total:
            inventory = requests.get(self.inventory_url, headers=HEADERS, params=DATA)
            inv = inventory.json()
            all_listings = all_listings + inv['listing']
            logger.debug("Retrieving listings starting at %s" % (DATA['start']))
            DATA['start'] = DATA['start'] + inv['rows']
        return all_listings


    def create_and_return_data_frame(self, list_of_tickets):
        listing_df = pd.DataFrame(list_of_tickets)
        listing_df['current_price'] = listing_df.apply(lambda x: x['currentPrice']['amount'], axis=1)
        listing_df['listed_price'] = listing_df.apply(lambda x: x['listingPrice']['amount'], axis=1)
        listing_df['snapshotDate'] = datetime.datetime.today().strftime('%Y-%m-%d %H:%M')  # %M = minutes (%m is the month)
        listing_df['eventName'] = 'World Series'
        listing_df['eventDate'] = '2017-10-24'
        listing_df['venue'] = 'Dodger Stadium'
        my_col = [
            'snapshotDate',
            'eventName',
            'eventDate',
            'venue',
            'sectionName',
            'row',
            'seatNumbers',
            'quantity',
            'sellerOwnInd',
            'current_price',
            'listed_price',
            'listingId',
        ]
        final_df = listing_df[my_col]
        return final_df


if __name__ == '__main__':
    task_run = StuHubEventSearch()
    task_run._init()
    print("\nTask finished at %s\n" % str(datetime.datetime.now()))

Owner

ozzieliu commented Nov 8, 2017

@chrislkeller that's great! Thanks!

sasiegel commented Dec 9, 2017

This has been very helpful, thank you.
