@pastenague
Last active September 26, 2019 04:08

Introduction

If you follow other leagues apart from the Premier League, I'm sure you've wondered what it would be like to play a Fantasy Premier League-esque game for those leagues. Fantasy games for other leagues do exist: La Liga and the Bundesliga have official fantasy games, while the draft-style fantacalcio (invented by Italian journalist Riccardo Albini, who was inspired by NFL fantasy football) is particularly popular among Serie A fans. However, to the best of my knowledge, none of these fantasy equivalents uses exactly the same scoring scheme as Fantasy Premier League.


Interpretation

The spreadsheet linked above contains estimates of FPL-style fantasy points for every player who started at least one match in at least one season of at least one of the top 5 leagues from the 2014-15 season to the 2018-19 season (12,297 players in total). Calculation of points follows the FPL scheme, as detailed in the "Scoring" section of FPL's rules, with a few exceptions detailed below.
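As a rough illustration of the per-match calculation, here is a minimal sketch of FPL-style scoring with the simplifications listed in the Notes section (no bonus points, saves, or penalty misses, and goals conceded counted over the whole match). The function name and arguments are hypothetical, not the code actually used to build the spreadsheet.

```python
# Points per goal and per clean sheet by fantasy position (FPL scheme).
GOAL_POINTS = {"GK": 6, "DEF": 6, "MID": 5, "FWD": 4}
CLEAN_SHEET_POINTS = {"GK": 4, "DEF": 4, "MID": 1, "FWD": 0}


def match_points(position, minutes, goals, assists, yellows, reds,
                 own_goals, team_goals_conceded):
    """Estimate FPL-style points for one player in one match.

    Assumption: team_goals_conceded is the team's total for the whole
    match, because goal times are not available per player.
    """
    if minutes <= 0:
        return 0  # did not play
    points = 2 if minutes >= 60 else 1          # appearance points
    points += GOAL_POINTS[position] * goals
    points += 3 * assists
    points -= yellows + 3 * reds + 2 * own_goals
    if minutes >= 60 and team_goals_conceded == 0:
        points += CLEAN_SHEET_POINTS[position]
    if position in ("GK", "DEF"):
        points -= team_goals_conceded // 2      # -1 per 2 goals conceded
    return points
```

Under this sketch, a defender who comes on for the last 10 minutes of a match in which the team concedes 4 goals ends up on -1 for that match, which is the effect described for substitutes in the Notes.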

I included some filters for convenience in viewing and interpreting the data. These can be found in the Data > Filter views section of the toolbar. You can create your own filter (for example, Bundesliga MIDs in 16-17) by navigating to: Data > Filter views > Create new temporary filter view.


Method

For another project, I gathered match-by-match data for all top-5-league matches in Understat's database from 2014-15 to 2018-19. I realized that this collection of data could be used to calculate fantasy points using an FPL-style scheme, so I did just that!

Predicted Costs

In the spreadsheet, you may have noticed the columns Start Cost, End Cost, and ΔCost (Cols. O, P, and Q). Start Cost and End Cost are predicted starting and ending costs based on historical FPL cost data (more on that coming). ΔCost is the difference between ending and starting costs.

--

Here's how I calculated the starting and ending costs for each player (feel free to skip this section if you'd like):

First, I obtained historical FPL data from Vaastav's fantastic FPL data repo (full credit to him for that!). Next, I used this data to train 3 simple neural networks:

  1. A NN that, given a player's end-of-season stats, predicts what price the player was most likely to have been assigned at the beginning of that season (i.e., the player's Start Cost).

  2. A NN that, given (1) a player's end-of-season stats and (2) the player's predicted Start Cost, predicts what cost the player is most likely to have at the end of that season (i.e., the player's End Cost).

  3. A NN that, given a player's end-of-season stats, the player's predicted Start Cost, and the player's predicted End Cost, predicts what cost the player is most likely to have at the start of the next season (i.e., the player's Start Cost for the next season).

Here, the "stats" used in the neural network prediction/training were: Position, Minutes, Goals, Assists, Yellows, Reds, Own Goals, Clean Sheets, and Total Points.

--

For every player in the database, here's the process I followed to calculate their predicted costs:

  1. For the player's first season S0 in the database, feed the player's stats for season S0 into NN #1 to predict the player's starting cost for season S0.

  2. Feed the player's stats for season S0 and the player's starting cost for season S0 into NN #2 to predict the player's ending cost for season S0.

  3. If the player played in the next season (S1): feed the player's stats for season S0, the player's starting cost for season S0, and the player's ending cost for season S0 into NN #3 to predict the player's starting cost for season S1.

  4. Repeat steps 1-3 for season S1 and any subsequent seasons.
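The chaining in the steps above can be sketched as follows. The three models are passed in as plain callables (`nn1`, `nn2`, `nn3` are placeholder names), seasons are keyed by their starting year, and falling back to NN #1 after a gap between seasons is my reading of step 3 rather than something stated explicitly.

```python
def predict_costs(stats_by_season, nn1, nn2, nn3):
    """Return {season: (start_cost, end_cost)} for one player.

    stats_by_season maps a season's starting year (e.g. 2014 for 2014-15)
    to that season's stats. nn1, nn2 and nn3 stand in for the trained models.
    """
    costs = {}
    prev = None
    for season in sorted(stats_by_season):
        stats = stats_by_season[season]
        if prev is not None and season == prev + 1:
            # Step 3: the previous season's stats and costs give this
            # season's starting cost.
            prev_start, prev_end = costs[prev]
            start = nn3(stats_by_season[prev], prev_start, prev_end)
        else:
            # Step 1: first season in the database (assumed to apply
            # after a gap between seasons as well).
            start = nn1(stats)
        # Step 2: ending cost from this season's stats and starting cost.
        end = nn2(stats, start)
        costs[season] = (start, end)
        prev = season
    return costs
```

Because each prediction feeds the next one, any error in a player's first predicted Start Cost carries through to later seasons.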

On the whole, I found these neural networks to be pretty decent at predicting the prices. There are a few cases (for example, van Dijk and Robertson in 18-19) where they predicted prices way lower than the actual FPL price assigned to the player, but these are mainly because the NNs were blind to the strength of each team. Since van Dijk and Robertson had mediocre/average points totals in prior seasons, the NNs saw no reason to price them at £6M last season, even though in real life the fact that Liverpool are a top-6 team influenced their starting prices.

What do you think? I encourage you to have a look for yourself. As far as I'm aware, predicting prices like this hasn't been done before, so I'd be delighted to hear your thoughts on the accuracy of my methods!


Notes

Here's what this data does NOT contain:

  • Bonus Points. I tried a rudimentary bonus points calculation using FPL's scheme with the data I had (which was possible since I could allocate bonus points on a match-by-match basis). However, since Understat only supplies offensive stats, the bonus points were weighted extremely heavily (roughly 5 times more) towards forwards. There were also tons of ties that I couldn't break, because apart from goals and assists there weren't enough underlying stats (e.g., pass completion, tackles, errors) to distinguish performances.
  • Goalkeeper Stats. Understat does not supply any defensive stats, so goalkeepers' points are only a function of their goals, assists, minutes played, cards, and clean sheets. Saves (including penalty saves) are not included in the data.
  • Penalty Misses. In the Match Events section of each match in Understat's database, penalty goals/misses are specified, but penalty misses are not included in their player data for each match. 15-16 Messi rejoices!
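  • "FPL Assists". FPL awards assists for winning a penalty or free-kick, and rebounds off the post to a goalscorer, among other occasions. These extra assists are not included here; only the assists recorded in Understat's data are counted.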

A few other important notes about the data:

  • Player position for each season is based on their position in that season, not the season beforehand. The fantasy position for each player in a season is assigned based on how often they played in each position in the same season. You might have noticed that Mohamed Salah (Liverpool, 2017-18) is listed as a FWD even though he was actually a MID in FPL 17-18; this was because he played more as a FWD in 17-18 than he did as a MID.
  • With regard to goals conceded, each player effectively plays the whole match (regardless of whether they were substituted in/out). Since the times of each goal scored are not included in Understat's match player data, each player is penalized for every 2 goals their team conceded in the match, even if they came on as a substitute after those goals were scored. Case in point: Diego Rico (AFC Bournemouth, 18-19) ended up with a total score of -1 because Bournemouth conceded so many goals (19) in the 12 appearances he made, even though he was only on the pitch for a handful of them. This also means that players who were substituted off after the 60th minute of a match with no goals conceded lost their clean sheet if their team conceded a goal afterwards.
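As a small illustration of the position rule in the first note, here is a minimal sketch that picks the fantasy position a player appeared in most often during a season. Counting appearances rather than minutes, and the tie-breaking behaviour, are assumptions; the input is assumed to be already mapped from Understat's position labels to GK/DEF/MID/FWD.

```python
from collections import Counter


def season_position(match_positions):
    """Return the most frequent fantasy position in a season.

    match_positions holds one entry per appearance, e.g. ["MID", "FWD", "FWD"].
    Ties go to whichever position appears first in the list (an assumption).
    """
    counts = Counter(match_positions)
    return counts.most_common(1)[0][0]
```

With this rule, a player listed as a MID in FPL can still come out as a FWD here if most of their appearances that season were up front.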

"Dream Teams"

The tables below contain images of the "dream teams" (i.e., teams that score the maximum possible points) for all the seasons of all the leagues examined in the spreadsheet. These work similarly to the FPL overall dream team. Each value in the table below is the total points scored by that dream team.

I've listed 3 types of dream teams for each season/league. First, a dream team where the price of the players selected doesn't matter and we're only looking to maximize points scored (this is how the FPL dream teams work). Second, a dream team where the total starting cost of all the players selected is no more than €83.0 (since €17.0 is required to afford the cheapest possible bench players). Third, a dream team where the total ending cost of all the players selected is no more than €83.0. I think it's interesting to see the variations across all the leagues and seasons.
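For anyone curious how a budget-limited dream team can be found, here is a minimal sketch that treats it as a knapsack-style search. It assumes the usual FPL formation limits for a starting XI (1 GK, 3-5 DEF, 2-5 MID, 1-3 FWD) and costs stored in tenths (so 83.0 becomes 830). The `Player` type and `dream_team` function are hypothetical names, and this is not necessarily how the teams in the tables were computed.

```python
from typing import NamedTuple


class Player(NamedTuple):
    name: str
    position: str  # "GK", "DEF", "MID" or "FWD"
    cost: int      # cost in tenths, e.g. 55 means 5.5
    points: int


ORDER = ("GK", "DEF", "MID", "FWD")
MIN_COUNT = {"GK": 1, "DEF": 3, "MID": 2, "FWD": 1}
MAX_COUNT = {"GK": 1, "DEF": 5, "MID": 5, "FWD": 3}


def dream_team(players, budget=None, size=11):
    """Return (points, cost, names) of the best XI, or None if impossible.

    budget is in tenths; None means cost is ignored entirely.
    """
    # State: (players picked per position, total cost) -> (points, team).
    best = {((0, 0, 0, 0), 0): (0, ())}
    for p in players:
        idx = ORDER.index(p.position)
        updates = {}
        for (counts, cost), (pts, team) in best.items():
            if sum(counts) >= size or counts[idx] >= MAX_COUNT[p.position]:
                continue
            # Without a budget, collapse the cost dimension to keep states few.
            new_cost = cost + p.cost if budget is not None else 0
            if budget is not None and new_cost > budget:
                continue
            new_counts = counts[:idx] + (counts[idx] + 1,) + counts[idx + 1:]
            key = (new_counts, new_cost)
            new_pts = pts + p.points
            if new_pts > best.get(key, (float("-inf"), ()))[0]:
                updates[key] = (new_pts, team + (p,))
        best.update(updates)

    # Keep only full teams that satisfy the formation minimums.
    complete = [
        (pts, team)
        for (counts, _), (pts, team) in best.items()
        if sum(counts) == size
        and all(c >= MIN_COUNT[pos] for c, pos in zip(counts, ORDER))
    ]
    if not complete:
        return None
    pts, team = max(complete, key=lambda item: item[0])
    return pts, sum(p.cost for p in team), [p.name for p in team]
```

Calling `dream_team(players)` gives the unlimited-budget team, while `dream_team(players, budget=830)` applies the 83.0 cap to whichever cost (starting or ending) is stored in `cost`. The search is exact but gets slow on thousands of players, so an integer programming solver would likely be a better fit for the full dataset.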

Unlimited Budget:

League          2014-15  2015-16  2016-17  2017-18  2018-19  All Seasons
Bundesliga         1563     1631     1587     1481     1660         1873
La Liga            1939     1905     1691     1686     1706         2164
Ligue 1            1677     1717     1681     1767     1734         2125
Premier League     1714     1738     1847     1823     1848         2058
Serie A            1579     1674     1769     1823     1602         1959
All Leagues        2141     2136     2000     2093     2052         2432

Maximum Starting Budget €83.0:

League          2014-15  2015-16  2016-17  2017-18  2018-19  All Seasons
Bundesliga         1563     1631     1586     1481     1660         1873
La Liga            1922     1872     1673     1676     1706         2149
Ligue 1            1677     1717     1681     1767     1734         2125
Premier League     1708     1738     1847     1823     1848         2058
Serie A            1579     1674     1769     1823     1602         1959
All Leagues        2090     2136     1996     2092     2052         2432

Maximum Ending Budget €83.0:

League          2014-15  2015-16  2016-17  2017-18  2018-19  All Seasons
Bundesliga         1555     1631     1573     1481     1660         1848
La Liga            1880     1839     1660     1676     1706         2084
Ligue 1            1672     1717     1681     1767     1734         2125
Premier League     1702     1738     1841     1809     1848         2047
Serie A            1579     1674     1769     1823     1602         1959
All Leagues        2014     2098     1976     2049     2052         2340

Thanks for reading! Hope you enjoyed browsing the spreadsheet and the dream teams. Let me know if you have any questions.

I drew some inspiration from some previous looks at how Lionel Messi would have fared in the Premier League so thanks to the users behind those posts as well.
