@alaakh42
Created October 2, 2018 22:16
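import pandas as pd
import wikipedia

# Assemble the scraped columns into one DataFrame, one row per club.
# The lists (first_col, summary, history, ...) are assumed to be built earlier in the scraping script.
df = pd.DataFrame({'Team': first_col,
                   'Summary': summary,
                   'History': history,
                   'Team_Page': teams_links,
                   'Location': locations,
                   'Stadium': stadiums,
                   'Stadiums_Capcity': stadiums_capcity
                   })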
# Crawling the history of these three clubs was tricky, so I had to hard-code the section names myself.
# The values are assigned positionally: their order must match the order of the rows whose History is missing.
df.loc[df['History'].isnull(), 'History'] = [
    wikipedia.WikipediaPage(u'FC Barcelona').section(u'1899–1922: Beginnings'),
    wikipedia.WikipediaPage(u'CD Leganés').section(u'History'),
    wikipedia.WikipediaPage(u'Valencia CF').section(u'History'),
]
# Construct an alternative variant of each team/club name
Team_alt = pd.Series(['Deportivo Alaves',
                      'Athletic Bilbao',
                      'Atletico Madrid',
                      'Barca',
                      'RC Celta de Vigo',
                      'SD Eibar',
                      'RCD Espanyol',
                      'Getafe CF',
                      'Girona FC',
                      'SD Huesca',
                      'Leganes',
                      'Levante UD',
                      'Rayo Vallecano',
                      'Real Betis',
                      'Real Madrid CF',
                      'Real Sociedad',
                      'Sevilla FC',
                      'Valencia CF',
                      'Real Valladolid',
                      'Villarreal CF'])
df['Team_alt'] = Team_alt.values
df.to_csv("data/Teams_data.csv", encoding="utf-8", index=False)
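The hard-coded history fill in the snippet relies on the missing rows appearing in a fixed order. A minimal sketch of a name-keyed alternative follows. It assumes the `Team` column holds the club names exactly as written in the mapping keys, which the gist does not confirm. The small `df` and the placeholder strings in `manual_history` are hypothetical stand-ins; in the real script the values would come from the `wikipedia.WikipediaPage(...).section(...)` calls.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table; the real df has more columns.
df = pd.DataFrame({
    'Team': ['FC Barcelona', 'CD Leganés', 'Valencia CF', 'Sevilla FC'],
    'History': [None, None, None, 'already scraped text'],
})

# Hypothetical mapping from team name to manually retrieved history text.
# The keys are assumed to match the values in the 'Team' column exactly.
manual_history = {
    'FC Barcelona': 'Barcelona history text',
    'CD Leganés': 'Leganés history text',
    'Valencia CF': 'Valencia history text',
}

# Fill only the missing entries, matching on team name rather than row order.
# Series.map returns NaN for teams without a manual entry, so existing
# History values and unmapped teams are left untouched by fillna.
df['History'] = df['History'].fillna(df['Team'].map(manual_history))
```

Matching on the name keeps the fill correct if the set or order of rows with a missing history changes between scraping runs.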