@gastrodon
Last active October 12, 2019 20:17

Walmart seems to use a few primary APIs. No idea why. They are:

  • https://walmart.com/
  • http://mobile.walmart.com/
  • http://quimby.mobile.walmart.com/

Searching for items by keyword

If you're looking for the price of some item, you can start with the search API.

The minimum headers should look like this

GET                /preso/search HTTP/2.0
Host	           www.walmart.com

The minimum URL parameters should look like this

  • query=item to search for
  • instore=true for instore, false for online
  • page=page index for paginated searches

To search for a Nintendo Switch, a Python example might look like this

import requests

params = {
    'query': 'Nintendo Switch',
    # requests would encode Python's True as "True", so pass the lowercase string
    'instore': 'true',
    'page': 1
}

# response will contain a JSON object with the call results
response = requests.get('https://www.walmart.com/preso/search', params = params)
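
Since results are paginated through the page parameter, it can help to build the query string in one place. The following is a minimal sketch using only the standard library. The helper name build_search_url is hypothetical, and it assumes the API wants lowercase true/false for instore, as written in the parameter list above.

```python
from urllib.parse import urlencode

SEARCH_URL = 'https://www.walmart.com/preso/search'

def build_search_url(query, instore=True, page=1):
    """Return the full search URL for one page of results (hypothetical helper)."""
    params = {
        'query': query,
        # assumption: the API expects lowercase 'true' / 'false'
        'instore': 'true' if instore else 'false',
        'page': page,
    }
    return f'{SEARCH_URL}?{urlencode(params)}'

# URLs for the first three pages of results
urls = [build_search_url('Nintendo Switch', page=n) for n in range(1, 4)]
for url in urls:
    print(url)
```

Each URL can then be fetched with requests.get(url) in the same way as the example above.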

Specifying store id and zip

If you already know the location or id number of a store, you can specify those in your search URL parameters.

For example, to search within store 4132, add this URL parameter

  • pref_store=4132

To search within stores 4132, 5603, and 2110, add this parameter

  • pref_store=4132,5603,2110

To search within all stores near the zip code 01834, add this parameter

  • zipcode=01834
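
These location parameters can be assembled with a small helper. This is only a sketch: the function name store_params is made up for illustration, and the parameter names pref_store and zipcode are taken from the lists above.

```python
def store_params(store_ids=None, zipcode=None):
    """Build the optional location parameters described above (hypothetical helper)."""
    params = {}
    if store_ids:
        # several stores are passed as one comma-separated value
        params['pref_store'] = ','.join(str(s) for s in store_ids)
    if zipcode:
        # keep zip codes as strings so a leading zero (01834) is not lost
        params['zipcode'] = str(zipcode)
    return params

params = {'query': 'Nintendo Switch', 'page': 1}
params.update(store_params(store_ids=[4132, 5603, 2110]))
print(params)
```

Passing the zip code as a string matters here, since the integer 1834 would drop the leading zero.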

Initializing a device instance

If you search with instore=true, the store that it will search in is determined by your IP address. If you want to search in stores in some other location, you need to set up a few things.

First, generate 4 new UUIDs. The spec for that can be found here, but most languages will have libraries to generate one for you. Basically, it's a unique ID.

One of them we'll call device_id, one will be session_id, one will be vid, and one will be request_id. Also keep your zip code handy, which will be referred to as your zip.

When those are ready, you need to initialize it in Walmart's database with their API.

headers:

POST            /init
Host            quimby.mobile.walmart.com
did             device_id
cookie:         sid=session_id
cookie:         vid=vid

POST data:

{
    "expo.sid": "session_id",
    "tempo.pageType": "MobileHomescreenV2",
    "deviceVersion": "19.37.2",
    "deviceType": "android",
    "tempo.location": {
        "zipCode": "your zip",
        "isZipLocated": true
    },
    "tempo.targeting": {
        "userAccountStatus": "loggedOut",
        "userLocation": "outOfStore"
    },
    "tempo.pl3n": {
        "reqId": "request_id",
        "pageOffset": 0
    },
    "tempo.userInfo": {
        "vid": "vid"
    }
}

To generate 4 random UUIDs and initialize them, a Python example might look like this

import requests, uuid

# Generate 4 random UUID strings, and unpack them into 4 variables
session_id, vid, device_id, request_id = [str(uuid.uuid4()) for _ in range(4)]
zip_code = '01834'

data = {
    'expo.sid': session_id,
    'tempo.pageType': 'MobileHomescreenV2',
    'deviceVersion': '19.37.2',
    'deviceType': 'android',
    'tempo.location': {
        'zipCode': zip_code,
        'isZipLocated': True
    },
    'tempo.targeting': {
        'userAccountStatus': 'loggedOut',
        'userLocation': 'outOfStore'
    },
    'tempo.pl3n': {
        'reqId': request_id,
        'pageOffset': 0
    },
    'tempo.userInfo': {
        'vid': vid
    }
}

headers = {
    'did': device_id,
    'cookie': f'vid={vid};sid={session_id}'
}

# response will contain a lot of data that we don't care about.
# As long as the status_code is 200, your "device" should be ready.
# json= sends the nested dict as a JSON body; data= would form-encode it
response = requests.post('https://quimby.mobile.walmart.com/init', json = data, headers = headers)
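
The did and cookie values have to be sent on every later request, so it may be convenient to keep the four IDs together and derive the headers from them. Here is a rough sketch. The helper names new_device and device_headers are hypothetical and not part of any Walmart API.

```python
import uuid

def new_device():
    """Generate the four UUID strings that identify a "device" (hypothetical helper)."""
    names = ('device_id', 'session_id', 'vid', 'request_id')
    return {name: str(uuid.uuid4()) for name in names}

def device_headers(device):
    """Build the did and cookie headers used in the request above."""
    return {
        'did': device['device_id'],
        'cookie': f"vid={device['vid']};sid={device['session_id']}",
    }

device = new_device()
headers = device_headers(device)
print(headers)
```

The device dict can be saved (for example as JSON) and reused, so that later searches keep the same store settings.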

Setting preferred store location

You can now use your device_id, session_id, and vid together in requests to set your location.

headers:

POST            /account/api/location
Host            www.walmart.com
did             device_id
cookie:         sid=session_id
cookie:         vid=vid

POST data:

{
	"responseGroup": "STOREMETAPLUS",
	"includePickUpLocation": true,
	"postalCode": "your zip"
}

You'll also want to include skipCache=true in your URL parameters to force updated info.

The response to this POST request will contain information about the city of that zip code, as well as a list of nearby stores. Each payload of store data will look similar to this

{
    "storeId": "1930",
    "types": [
        "gsf_store"
    ],
    "distance": 7.04,
    "buId": "0",
    "id": 1930,
    "displayName": "Plaistow Store",
    "storeType": {
        "id": 2,
        "name": "Walmart",
        "displayName": "Store"
    },
    "address": {
        "postalCode": "03865",
        "address1": "58 Plaistow Rd",
        "city": "Plaistow",
        "state": "NH",
        "country": "US"
    },
    "phone": "603-382-2839",
    "operationalHours": {
        "saturdayHrs": {
            "startHr": "07:00",
            "endHr": "22:00"
        },
        "sundayHrs": {
            "startHr": "07:00",
            "endHr": "22:00"
        },
        "monToFriHrs": {
            "startHr": "07:00",
            "endHr": "22:00"
        }
    },
    "geoPoint": {
        "latitude": 42.8255173,
        "longitude": -71.1111482
    },
    "timeZone": "EST",
    "openDate": "02/02/1993 00:00",
    "detailsPageURL": "http://www.walmart.com/store/1930",
    "kiosk": false,
    "deleted": false,
    "gecOrgId": "9b4590a5-994b-4a61-a597-f227295a760b",
    "market": "184",
    "rdcNo": 6030,
    "newRdcNo": 0
}

The most important data here is the store's id and its address.
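
Pulling those two values out of a store payload could look something like this. It is a minimal sketch against a trimmed copy of the sample payload above, and summarize_store is a hypothetical helper name.

```python
def summarize_store(store):
    """Return the id and a one-line address for one store payload (hypothetical helper)."""
    address = store.get('address', {})
    line = '{address1}, {city}, {state} {postalCode}'.format(
        address1=address.get('address1', ''),
        city=address.get('city', ''),
        state=address.get('state', ''),
        postalCode=address.get('postalCode', ''),
    )
    return store['id'], line

# trimmed copy of the sample payload above
sample = {
    'id': 1930,
    'displayName': 'Plaistow Store',
    'address': {
        'postalCode': '03865',
        'address1': '58 Plaistow Rd',
        'city': 'Plaistow',
        'state': 'NH',
        'country': 'US',
    },
}

store_id, address = summarize_store(sample)
print(store_id, address)  # 1930 58 Plaistow Rd, Plaistow, NH 03865
```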

Finally, you should update your cart with acceptable store locations. This will tell Walmart search what stores to search for items in.

headers:

POST            /api/v3/cart/{device_id}
Host            www.walmart.com
did             device_id
cookie:         sid=session_id
cookie:         vid=vid

The POST data will contain a key storeIds. This will be an array (or list) of store IDs. In this example I've included 3, but this list can be of any length (as far as I've seen). It will also contain a key for location, the format of which is the same as what's returned in the previous call (albeit with less information).

{
	"currencyCode": "USD",
	"location": {
		"postalCode": "01834",
		"state": "MA",
		"country": "USA",
		"isZipLocated": true
	},
	"storeIds": [1930, 3491, 2142]
}

To set the preferred store IDs to the 3 most relevant stores, a Python example might look like this (including values from the last example)

data = {
    'responseGroup': 'STOREMETAPLUS',
    'includePickUpLocation': True,
    'postalCode': zip_code
}

# skipCache=true forces updated info; json= sends the body as JSON
response = requests.post('https://www.walmart.com/account/api/location', params = {'skipCache': 'true'}, json = data, headers = headers)

# store each returned store id in a list
store_ids = [store['id'] for store in response.json()['stores']]

data = {
    'currencyCode': 'USD',
    'location': {
        'postalCode': zip_code,
        'state': 'MA',
        'country': 'USA',
        'isZipLocated': True
    },
    'storeIds': store_ids[:3]
}

# POST the constructed data (as JSON) to the cart of this device_id
requests.post(f'https://www.walmart.com/api/v3/cart/{device_id}', json = data, headers = headers)
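
The example above simply takes the first three stores in the order the API lists them. If you'd rather choose by distance, one option is to sort on the distance field seen in the sample store payload. This is a sketch under the assumption that each store carries that field. The helper name and the distances below are made up for illustration.

```python
def nearest_store_ids(stores, count=3):
    """Return the ids of the `count` closest stores, nearest first (hypothetical helper)."""
    # stores without a distance field sort last
    ranked = sorted(stores, key=lambda s: s.get('distance', float('inf')))
    return [s['id'] for s in ranked[:count]]

# illustrative data only; the distances are invented
stores = [
    {'id': 3491, 'distance': 9.8},
    {'id': 1930, 'distance': 7.04},
    {'id': 2142, 'distance': 12.3},
    {'id': 4132},
]

print(nearest_store_ids(stores))  # [1930, 3491, 2142]
```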

You can also POST a preferred store location

headers:

POST            /account/api/location/preferredstore
Host            www.walmart.com
did             device_id
cookie:         sid=session_id
cookie:         vid=vid

POST data:

{
	"postalCode": "01834",
	"preferredStoreId": "3491",
	"responseGroup": "STOREMETAPLUS",
	"includePickUpLocation": true
}

To do this in Python, an example might look like this (again, with previous data set already):

data = {
    'responseGroup': 'STOREMETAPLUS',
    'postalCode': zip_code,
    'preferredStoreId' : str(store_ids[2]), # keep in mind that store ids are returned as integers, so here we cast to a string
    'includePickUpLocation': True
}

requests.post('https://www.walmart.com/account/api/location/preferredstore', json = data, headers = headers)

Now that you've done that, your www.walmart.com/preso/search instore results will be based on the stores that you've chosen, so long as you include the appropriate did and cookie values in your headers.


Getting prices from search results

Let's go back to search. Now that you're searching for items in a store or region of stores, the returned data will be relevant to you. When you search for an item, you will get a number of results. Every result will look similar to this

{
    "productId": "5T46E4NG6PS1",
    "usItemId": "709776123",
    "productType": "REGULAR",
    "title": "<mark>Nintendo</mark> <mark>Switch</mark> Console with Neon Blue & Red Joy-Con",
    "description": "<li><mark>Nintendo</mark> <mark>Switch</mark> is a unique hybrid system that blurs the line between console gaming and mobile play</li><li>Play on your TV while docked or as a handheld</li><li>Innovative Joy-Con controllers</li>",
    "esrb": "Unrated",
    "imageUrl": "https://i5.walmartimages.com/asr/afdb71df-4810-4e3e-9c3c-187e88a98619_1.9abc0f91d776fcbf0b8b580e875ed6c0.jpeg",
    "productPageUrl": "/ip/Nintendo-Switch-Console-with-Neon-Blue-Red-Joy-Con/709776123",
    "department": "Video Games",
    "customerRating": 4.6,
    "numReviews": 751,
    "specialOfferBadge": "bestseller",
    "specialOfferText": "Best Seller",
    "specialOfferLink": "query=nintendo%20switch&sort=best_seller&cat_id=2636_4646529_2002476&stores=7181&prg=mWeb",
    "sellerId": "F55CDC31AB754BB68FE0B39041159D63",
    "sellerName": "Walmart.com",
    "preOrderAvailableDate": "1564272000000",
    "launchDate": "28-JUL-2019",
    "enableAddToCart": true,
    "canAddToCart": false,
    "showPriceAsAvailable": true,
    "highlightedTitleTerms": [
        "Nintendo",
        "Switch"
    ],
    "highlightedDescriptionTerms": [
        "Nintendo",
        "Switch"
    ],
    "seeAllName": "Nintendo Switch Consoles",
    "seeAllLink": "query=nintendo%20switch&cat_id=2636_4646529_2002476&stores=7181&prg=mWeb",
    "itemClassId": "1",
    "primaryOffer": {
        "offerId": "C912398974614966BBEE368B7AD005A8",
        "offerPrice": 299,
        "currencyCode": "USD"
    },
    "fulfillment": {
        "isSOI": false,
        "isPUT": false
    },
    "inventory": {
        "status": "In Stock",
        "available": true
    },
    "quantity": 9999,
    "brand": [
        "Nintendo"
    ],
    "location": {
        "aisle": [
            "K.38"
        ],
        "detailed": [
            {
                "zone": "K",
                "aisle": 38,
                "section": 2
            }
        ]
    },
    "blitzItem": false,
    "marketPlaceItem": false,
    "shippingPassEligible": false,
    "pickupDiscountEligible": false,
    "preOrderAvailable": false,
    "virtualPack": false,
    "is_limited_qty": false
}

The price of this item is returned in one of two fields, either

  • primaryOffer, which contains the current price of the item in that store

or

  • prices, which contains the current price

I don't know the reason for the inconsistency.

The product number is returned in the fields

  • productId, which is an international id to look that item up
  • usItemId, which is the US id to look that item up
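
To keep track of which product a result refers to, the IDs can be collected along with a readable title and a full link. This is a small sketch using fields from the sample result above. The helper name identify_item is hypothetical, and joining productPageUrl onto https://www.walmart.com is an assumption based on the path being relative.

```python
import re
from urllib.parse import urljoin

BASE_URL = 'https://www.walmart.com'

def identify_item(item):
    """Collect the ids, a plain-text title, and a full product URL (hypothetical helper)."""
    return {
        'productId': item.get('productId'),
        'usItemId': item.get('usItemId'),
        # search results wrap matched terms in <mark> tags; strip them
        'title': re.sub(r'</?mark>', '', item.get('title', '')),
        # productPageUrl is a relative path, so join it onto the site root
        'url': urljoin(BASE_URL, item.get('productPageUrl', '')),
    }

sample = {
    'productId': '5T46E4NG6PS1',
    'usItemId': '709776123',
    'title': '<mark>Nintendo</mark> <mark>Switch</mark> Console with Neon Blue & Red Joy-Con',
    'productPageUrl': '/ip/Nintendo-Switch-Console-with-Neon-Blue-Red-Joy-Con/709776123',
}

info = identify_item(sample)
print(info['title'])
print(info['url'])
```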

To get the price of the first search result for the item Nintendo Switch from store number 4132, a Python example might look like this

params = {
    'stores': '4132',
    'query': 'Nintendo Switch',
    # lowercase string, since requests would encode True as "True"
    'instore': 'true'
}

response = requests.get('https://www.walmart.com/preso/search', params = params)

first = response.json()['items'][0]

# the fallback values are plain strings (the original passed one-element sets)
if 'prices' in first:
    print(first['prices'].get('current', 'no current price'))
elif 'primaryOffer' in first:
    print(first['primaryOffer'].get('offerPrice', 'no current primary offer'))
else:
    print('no price was found')
  • If there is a field prices, the current price will be returned
  • If instead there is a field primaryOffer, the current primaryOffer will be returned
  • If neither field is present, there is no listed price
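
The same fallback logic can be wrapped in a function so it can be applied to every result rather than just the first. This is only a sketch: extract_price and extract_aisles are hypothetical names, and the key current under prices is an assumption carried over from the example above, since the sample result only shows primaryOffer.

```python
def extract_price(item):
    """Return the listed price of a search result, or None if there is none."""
    if 'prices' in item:
        # assumption: the price sits under the key 'current'
        return item['prices'].get('current')
    if 'primaryOffer' in item:
        return item['primaryOffer'].get('offerPrice')
    return None

def extract_aisles(item):
    """Return the list of aisle labels (such as 'K.38'), or an empty list."""
    return item.get('location', {}).get('aisle', [])

# trimmed copy of the sample search result
sample = {
    'usItemId': '709776123',
    'primaryOffer': {'offerPrice': 299, 'currencyCode': 'USD'},
    'location': {'aisle': ['K.38']},
}

print(extract_price(sample), extract_aisles(sample))  # 299 ['K.38']
```

Returning None instead of a message string makes it easier to filter out results with no price.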

You can also search for items by their SKU or UPC number, which will return a single result with that item. That being said, a store will return no results for a query if it does not sell that specific item.


Getting online item data

Walmart exposes item data through GraphQL. We can fetch it with a POST request. Keep in mind that Content-Type is an important header in this request.

headers:

POST            /terra-firma/graphql
Host            www.walmart.com
Content-Type    application/json; charset=utf8

URL parameters

  • id=ProductSubQuery-android

To look up an item by the SKU number 124619757, POST data might look like this. Keep in mind that the variables field is a string containing JSON data, so the inner object must be dumped into a string before the outer body is serialized.

{
    "variables": "{\"productId\":\"124619757\"}"
}
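
The double encoding is easy to get wrong, so here is a small round-trip sketch using only the json module. The helper name build_body is hypothetical. It shows that variables comes back as a string after the first decode and only becomes an object after the second.

```python
import json

def build_body(product_id):
    """Serialize the request body; `variables` is itself a JSON string."""
    variables = json.dumps({'productId': product_id})
    return json.dumps({'variables': variables})

body = build_body('124619757')
print(body)

# decoding takes two passes, one per layer of encoding
outer = json.loads(body)
inner = json.loads(outer['variables'])
print(type(outer['variables']).__name__, inner['productId'])
```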

To get an item with the SKU 124619757, a Python example might look like this

import json
import requests

sku = '124619757'

variables = json.dumps({
    'productId': sku
})

data = {
    'variables': variables
}

params = {
    'id': 'ProductSubQuery-android'
}

headers = {
    'Content-Type': 'application/json; charset=utf8'
}

response = requests.post('https://www.walmart.com/terra-firma/graphql', data = json.dumps(data), params = params, headers = headers)

# store the important data returned in the variable item
item = response.json()['data']['idmlByProductId']

The variable item now contains JSON data about this item, which can be used to search for items in specific Walmart stores. The field specifications contains most of the searchable metadata of this item, while shortDescription contains a short description of this item that may be queried.


Summary

Brickseek appears to be using Walmart's /ip/ endpoint in tandem with its search API. The /ip/ endpoint is used to find a specific product's name, and the search API can be used to find in-store availability for that item. This is the information needed to find the price of an item at a specific Walmart location.

To implement this for automation, it's going to take a lot of time to tweak and fine-tune scraping to get accurate results. It might also be worth considering scraping Brickseek's site, as they've already done the hard part.
