Just How Unpredictable Is The English Premier League?

With just over a month remaining in the 2013/14 season there is still all to play for in the Premier League. The league title, European qualification and the relegation battle all look like going right down to the wire. Many commentators are calling this the most unpredictable season ever and we often hear the Premier League referred to as “the most unpredictable league in the world”.

Never being one to take a commentator’s word for something, I wanted to discover whether this is really the case.

Just how ‘unpredictable’ is the Premier League?

What do we even mean by ‘unpredictable’? Can we measure it?

Furthermore, is there an ideal level of ‘unpredictability’ or ‘competitiveness’ for a league?

How Can We Measure Unpredictability?

Fortunately there are companies whose job it is to predict sporting events accurately: bookmakers. The Football Data website records match statistics and pre-match bookmaker odds for thousands of football matches across Europe every season.

How Accurate Are Bookmaker Predictions?

The website Kaggle runs competitions for predictive modelling of many scenarios including sporting events. Recently they ran a competition to predict the outcomes of US College Basketball matches during March Madness. Kaggle evaluated entries using the Binomial Deviance method and I will use the same scoring system here. Hopefully this isn’t as complicated as it sounds. ‘Binomial’ just describes the way matches are evaluated on a scale from 0 to 1 (1 for a home win, 0 for an away win) and ‘deviance’ just means we will measure by how much our predicted outcome deviates from the actual match outcome.

The difference between the forecast outcome and the actual outcome is measured in terms of the log-loss between the two. The smaller the log-loss the more accurate the predictions are considered to be. The idea here is that a very confident prediction that is incorrect is ‘punished’ more than a less confident pick would be. This is perhaps best shown with an example:

Example: Liverpool vs Tottenham Hotspur (30th March 2014)

Liverpool were strongly favoured to win this match. The average bookmaker odds were:

Home Win – 1.45        Draw – 4.65      Away Win – 6.76

Bookmakers’ odds represent the percentage chance that each game will end in a home win, draw or away win, so they can easily be converted to the 0 to 1 scale (a drawn match is scored as 0.50). The expected ‘score’ for this match from the bookmakers’ odds is therefore:

Expected ‘match score’: 0.757

[Please see comments section below for a full explanation of this calculation]
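As a rough illustration, the conversion from odds to an expected ‘match score’ can be sketched in a few lines of Python. This is only a sketch of the steps described in the comments (invert the odds, scale the percentages back to 100%, then weight a home win as 1.0 and a draw as 0.5); the function names are my own and are not taken from any library.

```python
def implied_probabilities(home_odds, draw_odds, away_odds):
    """Convert decimal odds to probabilities that sum to 1.

    Each price is inverted, then the three values are scaled back
    to 100% to take out the bookmaker's margin.
    """
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)  # a little over 1.0 because of the margin
    return [p / total for p in raw]


def expected_match_score(home_odds, draw_odds, away_odds):
    """Expected score on the 0 to 1 scale: home win 1.0, draw 0.5, away win 0.0."""
    p_home, p_draw, p_away = implied_probabilities(home_odds, draw_odds, away_odds)
    return 1.0 * p_home + 0.5 * p_draw + 0.0 * p_away


# Average odds for Liverpool vs Tottenham Hotspur quoted above
score = expected_match_score(1.45, 4.65, 6.76)
print(round(score, 3))  # 0.757
```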

Liverpool did win as expected (actual ‘match score’ of 1.000) so the resultant log-loss was small: 0.278

If the match had been drawn (‘match score’ 0.500) the log-loss would have been larger: 0.847

If Spurs had pulled off a shock win (‘match score’ 0.000) the log-loss would have been very large: 1.416
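For anyone who wants to check these figures, here is a minimal Python sketch of the log-loss calculation, using the same formula I give for Excel in the comments below. The function name is my own. Note that it starts from the rounded forecast of 0.757, so the away win figure comes out a shade under the 1.416 quoted above, which was calculated from the unrounded forecast.

```python
import math


def log_loss(actual_score, forecast_score):
    """Log-loss between an actual match score (1.0, 0.5 or 0.0) and a forecast.

    The forecast must lie strictly between 0 and 1, otherwise log(0) is undefined.
    """
    return -(
        actual_score * math.log(forecast_score)
        + (1.0 - actual_score) * math.log(1.0 - forecast_score)
    )


forecast = 0.757  # expected 'match score' for Liverpool vs Tottenham
for label, actual in [("home win", 1.0), ("draw", 0.5), ("away win", 0.0)]:
    print(label, round(log_loss(actual, forecast), 3))
```

The confident forecast costs little when it is right and a great deal when it is wrong, which is exactly the ‘punishment’ described above.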

How (In)Accurate Are Bookmaker Forecasts?

Now we have a method for evaluating predictions we can produce the following chart:

[Chart: average per-match log-loss of bookmaker odds for the last five Premier League seasons]

[All data correct up to and including 1st April 2014]

This chart shows the average per-match log-loss of pre-match bookmaker odds for the last 5 seasons of the EPL (remember, the smaller the number the more accurate the predictions). It seems that the ‘predictability’ of the Premier League has remained pretty consistent over this period.

If anything, this season has actually been the 2nd ‘easiest’ to predict in the last five years.

Further details are below:

2013/14 = 0.591 per match. Biggest upset: Man Utd 1-2 West Brom (log-loss 1.724)

2012/13 = 0.603 per match. Biggest upset: Chelsea 0-1 QPR (log-loss 1.945)

2011/12 = 0.623 per match. Biggest upset: Man Utd 2-3 Blackburn (log-loss 2.290)

2010/11 = 0.635 per match. Biggest upset: Arsenal 2-3 West Brom (log-loss 1.948)

2009/10 = 0.583 per match. Biggest upset: Tottenham 0-1 Wolves (log-loss 1.770)
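The season figures above are simply the average log-loss over every match, with the biggest upset being the match with the largest individual log-loss. A hedged Python sketch of that aggregation is below. The function names and the layout of the match list are my own assumptions, and apart from the Liverpool match the fixtures and odds are made up purely for illustration; they are not real results.

```python
import math

# Score assigned to each full-time result on the 0 to 1 scale
ACTUAL_SCORE = {"H": 1.0, "D": 0.5, "A": 0.0}


def expected_score(home_odds, draw_odds, away_odds):
    """Expected match score from decimal odds, with the margin scaled out."""
    raw = [1.0 / home_odds, 1.0 / draw_odds, 1.0 / away_odds]
    total = sum(raw)
    p_home, p_draw = raw[0] / total, raw[1] / total
    return p_home + 0.5 * p_draw


def log_loss(actual, forecast):
    """Log-loss between the actual match score and the forecast score."""
    return -(actual * math.log(forecast) + (1.0 - actual) * math.log(1.0 - forecast))


def season_summary(matches):
    """Return (average log-loss, (largest log-loss, fixture)) for a list of matches."""
    losses = [
        (log_loss(ACTUAL_SCORE[result], expected_score(*odds)), fixture)
        for fixture, odds, result in matches
    ]
    average = sum(loss for loss, _ in losses) / len(losses)
    return average, max(losses)


matches = [
    ("Liverpool v Tottenham", (1.45, 4.65, 6.76), "H"),  # odds quoted in this article
    ("Team A v Team B", (2.10, 3.30, 3.60), "D"),        # hypothetical match
    ("Team C v Team D", (1.30, 5.50, 10.00), "A"),       # hypothetical match
]

average, (worst_loss, worst_fixture) = season_summary(matches)
print(f"average log-loss: {average:.3f}")
print(f"biggest upset: {worst_fixture} ({worst_loss:.3f})")
```

With real data you would build the match list from a full season of results and average odds rather than typing it in by hand.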

What is Happening Here?

Technically our scoring system is a measure of how ‘inaccurate’ the bookmaker predictions are. The smallest log-loss scores result from very confident predictions that prove to be correct (i.e. heavy favourites that go on to win their matches). Although the 13/14 title race remains unpredictable, in reality there have been very few genuine ‘upsets’ this season. The top teams have all been very consistent and have largely beaten the teams they were expected to beat. The biggest upsets have been Manchester United losing at home to West Brom (log-loss 1.724), Everton losing at home to Sunderland (1.588) and Chelsea losing away at Crystal Palace (1.525).

Towards the end of the recent Liverpool against Sunderland match the Sky Sports co-commentator Alan Smith described Sunderland’s pretty disappointing (and ultimately unsuccessful) second half comeback as something along the lines of “What makes this league so great”.

Is this really the ideal level of unpredictability for a league?

How Does The Premier League Compare To Other Leagues?

This table represents the same measure for the current 13/14 season for every league that is covered by Football Data (again, the smaller the number the more ‘predictable’ the league).

[Table: average per-match log-loss of bookmaker odds for every league covered by Football Data, 2013/14 season]

[All data correct up to and including 1st April 2014]

 

This table suggests that the Premier League is actually one of the more ‘predictable’ leagues around Europe. What might be causing this?

Is it possible that it is actually easier for bookmakers to set odds on some leagues than on others? There may well be some truth in this. Several of the leagues with the most accurate odds are also those that receive the most media coverage (EPL, Serie A, La Liga) and have the most information available. In contrast, I don’t think there are too many odds compilers who specialise in the Scottish lower leagues. Does this mean we should all start betting on the Bundesliga Two? I won’t be rushing to do so just yet. I think any differences here are still very small and that this method should rather be considered an interesting way to highlight differences in the competitive shape and balance of competitions.

For many of the leagues studied there appears to be an inverse relationship between how predictable the matches are and how competitive the league is. For example, the leagues with the lowest average log-loss include the SPL and Scottish Division One, where Celtic and Rangers have already clinched the respective titles with a month to spare. The most predictable league is the Greek Superleague, which has been won by the same team for the last 4 seasons. This method is still the best we have for evaluating competition ‘predictability’.

If we consider this a useful measure of predictability then it is surely also a useful measure of the ‘competitiveness’ of a competition.

Why might the Premier League have a lower score than the Bundesliga? Although Bayern Munich have romped clear in Germany, below them the league has been very competitive. As mentioned, in the Premier League the top 4 teams have all been consistently excellent (the top 5 have only 4 home defeats between them all season). The title race remains open but it is widely accepted that it will probably be decided by the two games Liverpool play against Manchester City and Chelsea.

Does this mean commentators should be more careful about what they describe as unpredictable? For the EPL it seems fair to say the title race is unpredictable, but in general it is not one of the more unpredictable leagues.

Is the Premier League actually not competitive enough?

Is There An Ideal Level Of Predictability For A League?

The question of how competitive we might want the league to be is an important one and has implications for a wide range of decisions, in particular with regard to revenue distribution from the league’s lucrative media contracts. Many of the leagues that we have seen to be the most ‘predictable’ are also those that have very uneven financial structures. In contrast, the major US sports leagues such as the NFL and MLB openly engineer greater competition through the use of salary caps and draft systems.

Yet is it really desirable to have a league where ‘anyone can beat anyone’? Does this mean every team is as good as each other? Or does this just mean every team is as bad as each other?

Before we get too excited and start speculating about revenue redistribution it is important to remember that the best Premier League clubs are also those that represent English football in UEFA competitions such as the Champions League. This is not a consideration for any of the major US sports as they do not have to compete with other leagues overseas. This season only 2 English teams have made the quarter finals and neither is among the favourites to progress. Interestingly, the favourites to win the Champions League (Bayern Munich, Barcelona, Real Madrid, PSG) are all sides who compete in seemingly lop-sided domestic competitions (see above).

Is there an optimal balance to be sought between the competitiveness of a league competition and the opportunity it affords its best teams to build squads to rival the best in Europe?

Conclusions

I admit my premise was a little facetious: I do not really think the EPL is too predictable, and I think this has been the most interesting Premier League season for a long time. I am sure plenty of football fans in other leagues are envious of such a close finish in prospect. Also, I noted that only two of our sides are in the quarter finals but Manchester City and Arsenal didn’t exactly disgrace themselves, coming up against the 2 best sides in Europe and some unfortunate refereeing decisions.

Yet I do think there are some important issues to look at in terms of what it actually means to have a competitive league. Should competitiveness be ‘engineered’? What if this is to be at the expense of the performance of our sides in Europe? If this season is representative of the future then I think the current balance between the league and European performance is about right but this doesn’t mean we should be complacent.

And it definitely doesn’t mean the Premier League is ‘the most unpredictable league in the world’.

 

 

  • Simon

    Hi Oliver,

    Nice article, but I was wondering one thing. How do you think the results will differ when looking at pre-season predictability? Your method examines the week to week predictability (CP recovery, Liverpool comeback etc), but I can imagine that differs from pre-season predictions. Man United were something like 3.25 and Liverpool more in the range of 34.00 for the title.

    • Oliver Page

      Hi Simon,
      Thank you for your comment. This is actually a conversation I have been having on Twitter this morning! Certainly we could look at pre-season expectations and compare to outcomes as well. For example spread betting companies produce forecasts for how many points each team will get each season. In my personal experience in that industry the Premier League was still considered one of the less volatile leagues to do forecasts for (for example compared to League One or League Two) although you are right that maybe that is less the case this season (e.g. Liverpool and Manchester United).

  • Paul Agius

    Presumably if the bookmakers think that a match is difficult to predict then this will be reflected in the odds.

    Perhaps a better analysis would be to look at how many matches have a team placed at evens or less

  • Quincy

    It seems your argument against revenue redistribution is the need to be competitive in the Champions League. Isn’t the obvious reply that the redistribution must be done all across Europe?

    The question is, do we want the same teams dominating their respective leagues, the same teams dominating in Europe, or do we want more competition?

    • Oliver Page

      Hi Quincy
      Yes absolutely it is a Europe wide issue. It will be interesting to see if the introduction of Financial Fair Play has any impact!

  • John Frasene

    Hi Oliver,
    Can you explain how the 0.757 figure is calculated from the bookmaker’s odds?

    • Oliver Page

      Hi John
      Sure, there are a couple of stages. If you divide any bookmaker odds into 1 then you get back a % value:
      Liverpool win => 1 / 1.45 = 69.0%
      Draw => 1 / 4.65 = 21.5%
      Tottenham win => 1 / 6.76 = 14.8%

      These percentages represent the bookmaker’s estimate of how likely each event is to happen. Any time you place a bet what you are really saying is that you think that event is more likely to happen than the odds suggest (something to remember for anyone betting on the Grand National!). Unfortunately these percentages add up to 105.3% (the bookies never give us fair odds!) so the next step is to scale them back to 100%.
      Liverpool => 69.0% / 105.3% = 65.5%
      Draw => 21.5% / 105.3% = 20.4%
      Tottenham => 14.8% / 105.3% = 14.1%
      Total = 100.0%

      Finally the 0 to 1 scale awards a ‘score’ of 1.0 for a home win, 0.5 for a draw and 0.0 for an away win.
      Home win => 65.5% * 1.0 = 0.655
      Draw => 20.4% * 0.5 = 0.102
      Away => 14.1% * 0.0 = 0.000
      Total = 0.757

      Hope this makes sense
      Thanks

  • John Frasene

    Hi Oliver,

    Thanks, this makes sense now. The original odds adding up to over 100% was throwing me. Great article on the predictability of the league and some of the consequences there regarding European competition. As a Liverpool fan, it’s great to be involved in the title race but I know that all of the top sides will look to strengthen next season and the table will be difficult to predict again. I prefer it over the Spanish model, though.

  • Tom

    Hi Oliver,

    Interesting article, can you explain a little more how you arrived at the log loss figures for the Liverpool-Tottenham example?

    My apologies if it’s obvious, I just can’t see it.

    • Oliver Page

      Hi Tom

      The scoring system is borrowed from here. It took me a couple of goes to get my head around it too. This is how my formula looks in Excel:

      = -( ActualScore * LN(ForecastScore) + (1-ActualScore) * LN(1-ForecastScore) )

      So for the example where Liverpool did actually win the match it is:

      = -( 1.000 * LN(0.757) + (1-1.000) * LN(1-0.757) ) = 0.278

      If the game was a draw you would replace the 1.000’s with 0.500’s and if it was an away win replace them with zeros.

      Hope this makes sense. I am glad you found the article interesting. This weekend just gone saw the new biggest upset of the season actually (Sunderland at Chelsea) so I may have to update it again soon!

      • Tom

        Hi Oliver,

        I’ve been coming back to this article scratching my head a few times, I’ve been looking to apply different methods on market prediction accuracy on different leagues for a while and this looks an excellent tool

        thanks for clearing it up, you’re very kind
