Just How Unpredictable Is The English Premier League?
With just over a month remaining in the 2013/14 season, there is still all to play for in the Premier League. The league title, European qualification and the relegation battle all look like going right down to the wire. Many commentators are calling this “the most unpredictable season ever” and we often hear the Premier League referred to as “the most unpredictable league in the world”.
Never being one to take a commentator’s word for something, I wanted to discover whether this is really the case.
Just how ‘unpredictable’ is the Premier League?
What do we even mean by ‘unpredictable’? Can we measure it?
Furthermore, is there an ideal level of ‘unpredictability’ or ‘competitiveness’ for a league?
How Can We Measure Unpredictability?
Fortunately, there are companies whose job it is to predict sporting events accurately: bookmakers. The Football Data website records match statistics and pre-match bookmaker odds for thousands of football matches across Europe every season.
How Accurate Are Bookmaker Predictions?
The website Kaggle runs competitions for predictive modelling of many scenarios including sporting events. Recently they ran a competition to predict the outcomes of US College Basketball matches during March Madness. Kaggle evaluated entries using the Binomial Deviance method and I will use the same scoring system here. Hopefully this isn't as complicated as it sounds. 'Binomial' just describes the way matches are evaluated on a scale from 0 to 1 (1 for a home win, 0 for an away win) and 'deviance' just means we will measure by how much our predicted outcome deviates from the actual match outcome.
The difference between the forecast outcome and the actual outcome is measured in terms of the log-loss between the two. The smaller the log-loss the more accurate the predictions are considered to be. The idea here is that a very confident prediction that is incorrect is ‘punished’ more than a less confident pick would be. This is perhaps best shown with an example:
Example: Liverpool vs Tottenham Hotspur (30th March 2014)
Liverpool were strongly favoured to win this match. The average bookmaker odds were:
Home Win - 1.45
Draw - 4.65
Away Win - 6.76
Bookmakers' odds represent the percentage chance that each game is expected to end in a home win, draw or away win, so they can be easily converted to the 0 to 1 scale (a drawn match is scored as 0.50). The expected 'score' for this match from the bookmakers' odds is therefore:
Expected 'match score': 0.757 [Please see comments section below for a full explanation of this calculation]
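The conversion from odds to an expected 'match score' can be sketched in a few lines of Python. This is only a sketch under one assumption: that the bookmaker's margin is removed by scaling the three implied probabilities proportionally so they sum to 1. The function name is my own. Under that assumption the quoted odds do reproduce the 0.757 figure.

```python
def expected_match_score(home_odds, draw_odds, away_odds):
    """Convert decimal odds into an expected 'match score' between 0 and 1."""
    # Raw implied probabilities: these sum to slightly more than 1
    # because of the bookmaker's margin.
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)
    # Assumption: remove the margin by scaling all three proportionally.
    p_home, p_draw, _p_away = (r / total for r in raw)
    # A home win scores 1, a draw 0.5 and an away win 0.
    return p_home + 0.5 * p_draw


score = expected_match_score(1.45, 4.65, 6.76)
print(round(score, 3))  # 0.757
```

Without the scaling step the same odds would give roughly 0.797, so some form of margin removal is needed to arrive at 0.757.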
Liverpool did win as expected (actual 'match score' of 1.000) so the resultant log-loss was small: 0.278
If the match had been drawn ('match score' 0.500) the log-loss would have been larger: 0.847
If Spurs had pulled off a shock win ('match score' 0.000) the log-loss would have been very large: 1.416
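The three figures above can be checked with a short Python sketch of the log-loss for a single match, using natural logarithms. The function name is my own, and 0.7573 is the 0.757 expected score carried to one more decimal place (recomputed from the quoted odds, assuming the bookmaker's margin is removed proportionally).

```python
import math


def log_loss(predicted, actual):
    """Log-loss (binomial deviance) for one match, using natural logs.

    predicted: expected match score, strictly between 0 and 1.
    actual: 1.0 for a home win, 0.5 for a draw, 0.0 for an away win.
    """
    return -(actual * math.log(predicted)
             + (1.0 - actual) * math.log(1.0 - predicted))


expected = 0.7573  # Liverpool vs Tottenham Hotspur, from the odds above
for label, actual in [("home win", 1.0), ("draw", 0.5), ("away win", 0.0)]:
    print(label, round(log_loss(expected, actual), 3))
```

This prints 0.278, 0.847 and 1.416 for the home win, draw and away win respectively. The away win is punished most heavily because the prediction was confident and wrong.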
How (In)Accurate Are Bookmaker Forecasts?
Now we have a method for evaluating predictions we can produce the following chart:
[All data correct up to and including 1st April 2014]
This chart shows the average per match log-loss of pre-match bookmaker odds for the last 5 seasons of the EPL (remember the smaller the number the more accurate the predictions). It actually seems that the ‘predictability’ of the Premier League has remained pretty consistent over this period.
If anything, this season has actually been the 2nd ‘easiest’ to predict in the last five years.
Further details are below:
2013/14 = 0.591 per match Biggest Upset: Man Utd 1-2 West Brom (1.724 log-loss)
2012/13 = 0.603 per match Biggest Upset: Chelsea 0-1 QPR (1.945)
2011/12 = 0.623 per match Biggest Upset: Man Utd 2-3 Blackburn (2.290)
2010/11 = 0.635 per match Biggest Upset: Arsenal 2-3 West Brom (1.948)
2009/10 = 0.583 per match Biggest Upset: Tottenham 0-1 Wolves (1.770)
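For anyone wanting to reproduce season figures like these, the calculation might look something like the Python sketch below. Several things here are assumptions rather than a record of what was actually done: the column names (FTR for the full-time result, BbAvH, BbAvD and BbAvA for the average odds) should be checked against the Football Data notes file, the margin is removed proportionally, and apart from the Liverpool match the sample rows are invented purely for illustration.

```python
import csv
import io
import math

# Illustrative sample only. The first row uses the odds quoted earlier;
# the other two rows are made up. Column names are assumed.
SAMPLE = """HomeTeam,AwayTeam,FTR,BbAvH,BbAvD,BbAvA
Liverpool,Tottenham,H,1.45,4.65,6.76
Team A,Team B,D,2.10,3.30,3.60
Team C,Team D,A,1.30,5.50,10.00
"""

# Actual 'match score' for each full-time result.
ACTUAL = {"H": 1.0, "D": 0.5, "A": 0.0}


def match_log_loss(row):
    """Log-loss of the bookmakers' expected score for one match."""
    raw = [1.0 / float(row[col]) for col in ("BbAvH", "BbAvD", "BbAvA")]
    # Assumption: scale out the bookmaker's margin proportionally.
    expected = (raw[0] + 0.5 * raw[1]) / sum(raw)
    actual = ACTUAL[row["FTR"]]
    return -(actual * math.log(expected)
             + (1.0 - actual) * math.log(1.0 - expected))


def season_summary(csv_text):
    """Return (average log-loss, biggest-upset row, its log-loss)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    losses = [match_log_loss(row) for row in rows]
    worst = max(range(len(rows)), key=losses.__getitem__)
    return sum(losses) / len(losses), rows[worst], losses[worst]


average, upset, upset_loss = season_summary(SAMPLE)
print(f"average log-loss per match: {average:.3f}")
print(f"biggest upset: {upset['HomeTeam']} vs {upset['AwayTeam']} ({upset_loss:.3f})")
```

With a real season file, the text of the downloaded CSV would replace SAMPLE. The 'biggest upset' is simply the match with the largest individual log-loss.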
What is Happening Here?
Technically our scoring system is a measure of how 'inaccurate' the bookmaker predictions are. The smallest log-loss scores result from very confident predictions that prove to be correct (i.e. heavy favourites that go on to win their matches). Although the 13/14 title race remains unpredictable, in reality there have actually been very few genuine ‘upsets’ this season. The top teams have all been very consistent and have largely beaten the teams they were expected to beat. The biggest upsets have been Manchester United losing at home to West Brom (log-loss 1.724), Everton losing at home to Sunderland (1.588) and Chelsea losing away at Crystal Palace (1.525).
Towards the end of the recent Liverpool against Sunderland match the Sky Sports co-commentator Alan Smith described Sunderland’s pretty disappointing (and ultimately unsuccessful) second half comeback as something along the lines of “What makes this league so great”.
Is this really the ideal level of unpredictability for a league?
How Does The Premier League Compare To Other Leagues?
This table represents the same measure for the current 13/14 season for every league that is covered by Football Data (again, the smaller the number the more ‘predictable’ the league).
[All data correct up to and including 1st April 2014]
This table suggests that the Premier League is actually one of the more ‘predictable’ leagues around Europe. What might be causing this?
Is it possible that it is actually easier for bookmakers to set odds on some leagues than on others? There may well be some truth in this. Several of the leagues with the most accurate odds are also those that receive the most media coverage (EPL, Serie A, La Liga) and have the most information available. In contrast, I don't think there are too many odds compilers who specialise in the Scottish lower leagues. Does this mean we should all start betting on the Bundesliga Two? I won't be rushing to do so just yet. I think any differences here are still very small and that this method should rather be considered an interesting way to highlight differences in the competitive shape and balance of competitions.
For many of the leagues studied there appears to be an inverse relationship between how predictable the matches are and how competitive the league is. For example, the leagues with the lowest average log-loss include the SPL and Scottish Division One, where Celtic and Rangers have already clinched their respective titles with a month to spare. The most predictable league is the Greek Superleague, which has been won by the same team for the last 4 seasons. This method is still the best we have for evaluating competition ‘predictability’.
If we consider this a useful measure of predictability then it is surely also a useful measure of the ‘competitiveness’ of a competition.
Why might the Premier League have a lower score than the Bundesliga? Although Bayern Munich has romped clear in Germany, below them the league has been very competitive. As mentioned, in the Premier League the top 4 teams have all been consistently excellent (the top 5 have only 4 home defeats between them all season). The title race remains open but it is widely accepted that it will probably be decided by the two games Liverpool play against Manchester City and Chelsea.
Does this mean commentators should be more careful what they describe as unpredictable? For the EPL it seems fair to say the title race is unpredictable, but in general it is not actually one of the more unpredictable leagues.
Is the Premier League actually not competitive enough?
Is There An Ideal Level Of Predictability For A League?
The question of how competitive we might want the league to be is an important one and has implications for a wide range of decisions, in particular with regard to revenue distribution from the league's lucrative media contracts. Many of the leagues that we have seen to be the most ‘predictable’ are also those that have very uneven financial structures. In contrast, the major US sports leagues such as the NFL and MLB openly engineer greater competition through the use of salary caps and draft systems.
Yet is it really desirable to have a league where ‘anyone can beat anyone’? Does this mean every team is as good as each other? Or does this just mean every team is as bad as each other?
Before we get too excited and start speculating about revenue redistribution, it is important to remember that the best Premier League clubs are also those that represent English football in UEFA competitions such as the Champions League. This is not a consideration for any of the major US sports as they do not have to compete with other leagues overseas. This season only 2 English teams have made the quarter finals and neither is a favourite to progress. Interestingly, the favourites to win the Champions League (Bayern Munich, Barcelona, Real Madrid, PSG) are all sides who compete in seemingly lop-sided domestic competitions (see above).
Is there an optimal balance to be sought between the competitiveness of a league competition and the opportunity it affords its best teams to build squads to rival the best in Europe?
I admit my premise was a little facetious – I do not actually think the EPL is too predictable and actually think this has been the most interesting Premier League season for a long time. I am sure plenty of football fans in other leagues are envious of such a close finish in prospect. Also, I noted that only two of our sides are in the quarter finals but Manchester City and Arsenal didn’t exactly disgrace themselves – coming up against the 2 best sides in Europe and some unfortunate refereeing decisions.
Yet I do think there are some important issues to look at in terms of what it actually means to have a competitive league. Should competitiveness be ‘engineered’? What if this is to be at the expense of the performance of our sides in Europe? If this season is representative of the future then I think the current balance between the league and European performance is about right but this doesn’t mean we should be complacent.
And it definitely doesn't mean the Premier League is ‘the most unpredictable league in the world’.