@JonLim
Created Mar 18, 2014
A simple Python script I use to scrape the Steam & Game Stats page (http://store.steampowered.com/stats/). It parses the HTML served by the server, extracts the app ID, game name, and current and peak concurrent player counts, and appends timestamped rows to two separate CSV files: one for the top 100 games and one for Steam overall.
#!/usr/bin/env python3
# Requires Python 3 and the beautifulsoup4 package.
import time
from datetime import datetime
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Fetch the stats page and parse it with the standard-library HTML parser.
html = urlopen('http://store.steampowered.com/stats/?l=english').read()
steampage = BeautifulSoup(html, 'html.parser')

# One timestamp per run, shared by every row written below.
timestamp = time.time()
currentTime = datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')

# Append one row per game listed in the top-100 table.
with open('SteamTop100byTime.csv', 'a', encoding='utf-8') as top100CSV:
    for row in steampage('tr', {'class': 'player_count_row'}):
        # The app ID is the fifth '/'-separated piece of the game's link.
        steamAppID = row.a.get('href').split('/')[4]
        steamGameName = row.a.get_text()
        spans = row.find_all('span')
        currentConcurrent = spans[0].get_text()
        maxConcurrent = spans[1].get_text()
        top100CSV.write('{0},{1},"{2}","{3}","{4}"\n'.format(currentTime, steamAppID, steamGameName, currentConcurrent, maxConcurrent))

# Append one row with the overall Steam concurrent-user figures.
with open('SteamOverallbyTime.csv', 'a', encoding='utf-8') as steamOverallCSV:
    for row in steampage('div', {'class': 'statsTop'}):
        spans = row.find_all('span')
        steamOverallCurrentConcurrent = spans[0].get_text()
        steamOverallMaxConcurrent = spans[1].get_text()
        steamOverallCSV.write('{0},"{1}","{2}"\n'.format(currentTime, steamOverallCurrentConcurrent, steamOverallMaxConcurrent))
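For analysis, the rows in SteamTop100byTime.csv can be read back with the standard csv module. The following is only a sketch: the sample rows are made up, the helper names `to_int` and `read_top100` are hypothetical, and it assumes the counts are written with comma thousands separators (which would explain why the script quotes them).

```python
import csv
import io

# Made-up sample rows in the layout the script writes:
# timestamp, app ID, "game name", "current players", "peak players"
SAMPLE = (
    '2014-03-18 12:00:00,570,"Dota 2","512,345","702,110"\n'
    '2014-03-18 12:00:00,730,"Counter-Strike: Global Offensive","98,765","120,001"\n'
)


def to_int(text):
    """Convert a count such as '512,345' to 512345.

    Assumes commas are thousands separators (an assumption about the page).
    """
    return int(text.replace(',', ''))


def read_top100(fileobj):
    """Parse rows from a file-like object into a list of dicts."""
    rows = []
    for record in csv.reader(fileobj):
        if len(record) != 5:
            continue  # skip blank or malformed lines
        timestamp, app_id, name, current, peak = record
        rows.append({
            'timestamp': timestamp,
            'app_id': app_id,
            'name': name,
            'current': to_int(current),
            'peak': to_int(peak),
        })
    return rows


rows = read_top100(io.StringIO(SAMPLE))
for r in rows:
    print(r['timestamp'], r['name'], r['current'], r['peak'])
```

With a real file, `read_top100(open('SteamTop100byTime.csv', newline='', encoding='utf-8'))` would take the place of the `io.StringIO` sample.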
@vcasadei vcasadei commented Oct 18, 2014

JonLim, this is really useful. Now I want to collect Steam data over a period of time, say one or two days.

I tried to adapt your code, but couldn't get the result I wanted. Do you know how I could monitor the Steam data and store it for some time, so I can run an analysis on it?

Thanks.
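One possible approach to the question above, as a rough sketch rather than a tested solution: wrap the body of the script in a function (called `scrape_once` here, a hypothetical name) and call it at a fixed interval for the desired duration. Because the script already opens its CSVs in append mode, each run adds new timestamped rows. The `collect_for` helper and its parameters are assumptions for illustration.

```python
import time


def collect_for(scrape_once, duration_seconds, interval_seconds,
                clock=time.monotonic, sleep=time.sleep):
    """Call scrape_once() every interval_seconds until duration_seconds pass.

    clock and sleep are injectable so the loop can be tried without waiting.
    Returns the number of times scrape_once was called.
    """
    start = clock()
    runs = 0
    while clock() - start < duration_seconds:
        try:
            scrape_once()
        except Exception as exc:
            # A failed fetch should not end a multi-day collection run.
            print('scrape failed, will retry at the next interval:', exc)
        runs += 1
        # Schedule against the start time so delays do not accumulate.
        delay = start + runs * interval_seconds - clock()
        if delay > 0:
            sleep(delay)
    return runs
```

For example, `collect_for(scrape_once, duration_seconds=2 * 24 * 3600, interval_seconds=300)` would sample every five minutes for two days. Scheduling the unmodified script with cron or a similar scheduler would be another option.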
