@ahmedshahriar
Created January 26, 2021 11:22
This snippet parses NBA per-game player statistics from the www.basketball-reference.com website using pandas.
# Sample URL: https://www.basketball-reference.com/leagues/NBA_2021_per_game.html
import pandas as pd


def parse_data(year: str) -> pd.DataFrame:
    url = "https://www.basketball-reference.com/leagues/NBA_" + year + "_per_game.html"
    # pd.read_html returns a list of DataFrames; the first table on the page is used
    parsed_df = pd.read_html(url, header=0)[0]
    # Remove the header rows that are repeated inside the table body
    parsed_df = parsed_df.drop(parsed_df[parsed_df['Age'] == 'Age'].index)
    parsed_df = parsed_df.fillna(0)
    # Drop the rank column (Rk), which only duplicates the row index
    parsed_df = parsed_df.drop(['Rk'], axis=1)
    return parsed_df


selected_year = 2011
df_player_stat_dataset = parse_data(str(selected_year))
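
The cleanup steps can be tried offline on a small hand-made frame. The sketch below is not part of the original gist: `raw` is made-up data standing in for what `pd.read_html` might return, and `clean_per_game` is a hypothetical helper. It also adds one step the snippet does not perform, converting the statistic columns to numeric types, on the assumption that the repeated header rows leave every column as strings.

```python
import pandas as pd

# Made-up stand-in for a scraped table: all values are strings, and one row
# is a repeated header, as happens inside long HTML tables.
raw = pd.DataFrame(
    {
        "Rk": ["1", "2", "Rk", "3"],
        "Player": ["Player A", "Player B", "Player", "Player C"],
        "Age": ["25", "31", "Age", "22"],
        "PTS": ["10.5", None, "PTS", "7.2"],
    }
)


def clean_per_game(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: same cleanup as parse_data, plus numeric conversion."""
    # Drop repeated header rows and the rank column, then renumber the rows.
    df = df[df["Age"] != "Age"].drop(columns=["Rk"]).reset_index(drop=True)
    # Convert every column except the player name to numbers, then fill gaps with 0.
    for col in df.columns.drop("Player"):
        df[col] = pd.to_numeric(df[col]).fillna(0)
    return df


cleaned = clean_per_game(raw)
print(cleaned.dtypes)
```

Converting before filling keeps each statistic column a single numeric dtype, so aggregations such as `cleaned["PTS"].mean()` work without further casting.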