A simple Python script I use to scrape the Steam & Game Stats page (http://store.steampowered.com/stats/). It parses the HTML served by the server, assigns the required data to the right variables, and appends rows to two separate CSVs that hold the data.
#!/usr/bin/env python
# Written for Python 2: urllib.urlopen moved to urllib.request.urlopen in Python 3.
import urllib
import time
from datetime import datetime
from bs4 import BeautifulSoup

# Fetch and parse the stats page.
steampage = BeautifulSoup(urllib.urlopen('http://store.steampowered.com/stats/?l=english').read())

# One timestamp shared by every row written in this run.
timestamp = time.time()
currentTime = datetime.fromtimestamp(timestamp).strftime('%Y-%m-%d %H:%M:%S')

# Append one row per game listed in the player count table.
top100CSV = open('SteamTop100byTime.csv', 'a')
for row in steampage('tr', {'class': 'player_count_row'}):
    steamAppID = row.a.get('href').split("/")[4]
    steamGameName = row.a.get_text().encode('utf-8')
    currentConcurrent = row.find_all('span')[0].get_text()
    maxConcurrent = row.find_all('span')[1].get_text()
    top100CSV.write('{0},{1},"{2}","{3}","{4}"\n'.format(currentTime, steamAppID, steamGameName, currentConcurrent, maxConcurrent))
top100CSV.close()

# Append the overall current and peak concurrent user counts.
steamOverallCSV = open('SteamOverallbyTime.csv', 'a')
for row in steampage('div', {'class': 'statsTop'}):
    steamOverallCurrentConcurrent = row.find_all('span')[0].get_text()
    steamOverallMaxConcurrent = row.find_all('span')[1].get_text()
    steamOverallCSV.write('{0},"{1}","{2}"\n'.format(currentTime, steamOverallCurrentConcurrent, steamOverallMaxConcurrent))
steamOverallCSV.close()
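
The script builds each CSV line by hand with `str.format`, so a game name that itself contains a double quote would produce a malformed row. Here is a minimal sketch of the same append step using the standard library's `csv` module, which handles quoting and escaping. It assumes Python 3, and the helper name `append_rows` is hypothetical, not part of the original script.

```python
import csv


def append_rows(path, rows):
    """Append rows to a CSV file, letting csv.writer handle the quoting.

    `rows` is an iterable of sequences, one sequence per CSV line.
    """
    # newline='' is what the csv documentation recommends for file objects,
    # and mode 'a' keeps the append behaviour of the original script.
    with open(path, 'a', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerows(rows)
```

In the loop over `player_count_row`, the values could be collected into a list of tuples such as `(currentTime, steamAppID, steamGameName, currentConcurrent, maxConcurrent)` and passed to `append_rows('SteamTop100byTime.csv', rows)` once per run. Under Python 3 the `.encode('utf-8')` call on the game name would no longer be needed, since the file is opened with an explicit encoding.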
JonLim, this is really useful. Now I want to collect Steam data over a period of time, say one or two days.
I tried to adapt your code, but couldn't get the result I wanted. Do you know how I could monitor the Steam data and store it for some time, so I can run an analysis on it?
Thanks.
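
One possible approach, sketched under assumptions: because the script opens both CSVs in append mode, running it repeatedly accumulates rows over time. The sketch below assumes Python 3 and that the body of the script has been wrapped in a function, hypothetically named `scrape_once`. The `collect` helper and its parameters are illustrative names too, not something from the original gist.

```python
import time


def collect(scrape_once, interval_seconds, duration_seconds,
            sleep=time.sleep, clock=time.monotonic):
    """Call scrape_once() every interval_seconds until duration_seconds elapse.

    Returns the number of successful samples taken.
    """
    deadline = clock() + duration_seconds
    samples = 0
    while clock() < deadline:
        try:
            scrape_once()
            samples += 1
        except Exception as exc:
            # A single failed request should not end a multi-day run.
            print('scrape failed:', exc)
        sleep(interval_seconds)
    return samples
```

For example, `collect(scrape_once, 15 * 60, 2 * 24 * 60 * 60)` would sample roughly every 15 minutes for two days. The interval drifts slightly because the time spent scraping is not subtracted from the sleep. An alternative that needs no code changes is to have a scheduler such as cron run the unmodified script at a fixed interval.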