Coding Up History: The Man Utd 1999 Project

It’s been great to see the results of the 1999 Project come to fruition these last few days.

For those who missed it, StatsBomb teamed up with The Independent to run a series of articles looking at three key games in Manchester United’s 1999 treble season through the lens of modern statistics. Twenty years on, the classic Giggs semi-final against Arsenal, the FA Cup final against Newcastle and the crowning glory, the Champions League against Bayern Munich were all coded up with the current 2019 StatsBomb dataspec and Mark Critchley did a great job reframing the narrative in which stats sit alongside the memories.

The original articles can all be viewed here:

Man Utd 2-1 Arsenal

Man Utd 2-0 Newcastle

Man Utd 2-1 Bayern Munich

Mark has covered a lot of detail across the games so do give them a read if you missed them, but I thought I’d add in a few broader notes on top. It’s rare that we get to see historical football via any other method than our memory and the narratives that persist. Few people can think that luck didn’t play some part in Man Utd turning around a deficit against Bayern in injury time, but one comment we received was that as remembered, Bayern were better overall and United were lucky to even be in the game to stage a comeback. That was my own general recollection too. However, when you pick it apart, sure Carsten Jancker hit the bar with a speculative overhead kick, but Bayern essentially weren’t better. The shot count was fairly equal overall, but Bayern scarcely got decent chances close in and the pass volumes were 60:40 in United’s favour. United did create chances in good locations, and can perhaps be classed as unfortunate not to have drawn level earlier. It’s likely that our collective memory is influenced by the game state: Bayern leading for 83 minutes in a fairly even game to some extent gives the impression of control.

There is also the historical context of how football has changed. The obvious hook here is the formations the teams essentially lined up in. Man Utd started in a 4-4-2 in all three games we coded as did Arsenal in the FA Cup semi-final and Newcastle in the FA Cup final. Bayern played with Jancker in a more solitary role up front and Lothar Matthäus in a sweeper role, so more of a 5-4-1/5-2-3. Is it any wonder central midfielders from this era are revered? How much space was there in the centre of the pitch?

Pressure

Here are the pressure events from the three games:

It’s probably easy to think back and presume the FA Cup final was the least intense of these three games since Man Utd won fairly comfortably, but this is the one game in which pressure events veered away from the midfield, at least for United. For Newcastle, Gary Speed put in a heck of a shift as did Dietmar Hamann before he was replaced at half time. Elsewhere, Arsenal saw Emmanuel Petit and Patrick Vieira most active and Nicky Butt, David Beckham and Roy Keane were most active for United. Interestingly, Tony Adams and Jaap Stam score highly in this match (and Ronny Johnsen does also in the Champions League final). Their no-nonsense active defending often saw them move up the pitch to engage with attackers, and this is borne out to a degree here. Adams also recorded three fouls up near the halfway line. This style of defending on the front foot, stepping out of the back line to engage the opponent, is something that appears less common twenty years on, and nowadays when we evaluate pressure events, centre backs often rank low. Lastly, in the Champions League final, midfielders came to the fore again with Jens Jeremies and Stefan Effenberg patrolling midfield for Bayern, while Jancker did plenty of work up front. United, with significant possession, were less active off the ball. Their strikers were particularly inactive: see the totals of Dwight Yorke (6) and Andy Cole (5). This is a contrast to the FA Cup final just days before, which saw Teddy Sheringham (35) and Ole Gunnar Solskjaer (34) racking up the effort. Game state, tactics and respect for the quality of the opposition could all play a part here but such a difference in style and personnel is notable.

Quarterback Beckham

It’s quite fun to see peak era David Beckham filling in in central midfield in the Champions League final. As his career moved on he played there more frequently, but earlier on, the very reasonable idea that he wasn’t defensively robust enough meant he wasn’t deemed an ideal fit there. Add in how effective he was as a peerless crosser of the ball and he was most often deployed as a traditional right sided midfielder (he was never really a winger per se). Here’s his pass map from that Champions League final, as the right side of a CM two:

So many long passes! 38/40 ground and low passes, and 6/22 high passes but a real lack of the shorter passes you might expect from a central midfielder. Contrast that with Nicky Butt, a player who very much eschewed the extravagant throughout his career:

Kept it simple! I’m enamoured with the idea that in the most important game of his career, Beckham got the chance to fill in as a central midfielder and proceeded to attempt his full range of passing. Was this the basis of him thinking later that this was his ideal position? Was this the birth of “Quarterback Beckham”? I don’t recall, but much like Wayne Rooney’s latter day England career, in which he appeared to feel he was more of a central creative hub than the forward he had been before, Beckham’s late England career featured many questions about whether he could play in central midfield and was more than just a world class right midfielder with an insane work ethic and top tier delivery. Regardless, in this game it translated as 71% passing from central midfield, quite a lot of turned over ball and essentially a very direct strategy.

Pass completion

This directness also pans out to the bigger picture. In this match as a whole there were 659 open play passes attempted of which around 69% were completed. This was a game between two of the strongest teams in Europe yet neither team impressed with their completion percentage (Man Utd 72%, Bayern 68%). Check out how that shapes up against every single game in the Champions League of 2019:

 

Passes were completed at a higher rate in every single Champions League game this season and only Porto v Galatasaray featured fewer actual passes. Back in 1999, against Arsenal and Newcastle, the pass volumes were much higher, but the pass completion rates were still low (Arsenal 73% v Man Utd 72%, Newcastle 72% v Man Utd 64%). It’s worth considering that the Spanish teams of the late ’00s, the influence of Pep Guardiola and stylistic hooks to value possession, slow down attacks and be less direct, are all in the distant future here. By coding up a handful of historical games, we don’t necessarily create irrefutable proof but we do get a window into the past and more evidence to consider fun questions like: Would teams of yore match or exceed modern teams? It’s most likely a different game…
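For anyone wanting to reproduce this kind of comparison, here is a minimal Python sketch of how a completion percentage falls out of coded pass events. The `PassEvent` record and the toy data are assumptions made purely for illustration; they are not the real match events or the actual StatsBomb data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PassEvent:
    """Hypothetical, simplified record for one open play pass."""
    team: str
    completed: bool


def completion_rate(passes, team=None):
    """Share of attempted passes that were completed, optionally for one team."""
    selected = [p for p in passes if team is None or p.team == team]
    if not selected:
        raise ValueError("no passes to evaluate")
    return sum(p.completed for p in selected) / len(selected)


# Toy data only: 25 passes per side, invented to show the calculation.
sample = (
    [PassEvent("A", True)] * 18 + [PassEvent("A", False)] * 7
    + [PassEvent("B", True)] * 17 + [PassEvent("B", False)] * 8
)

rate_a = completion_rate(sample, "A")   # 18 of 25 completed
rate_b = completion_rate(sample, "B")   # 17 of 25 completed
rate_all = completion_rate(sample)      # 35 of 50 completed
```

Note that the match-level figure is weighted by how many passes each side attempts, so it will sit closer to the rate of whichever team saw more of the ball.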

____________________________________________________________________

Look out for more historical projects in the future.

Errors, Medhi Benatia And The Troubled Life Of The Centre Back

It is often said that the life of a goalkeeper is unfair. Repeated brilliance and octopus-like saving talent can be forgotten in an instant if a moment of ineptitude or bad luck costs his team the game. This is nothing new. Throughout the history of football, the goalkeeper has only rarely been the hero and has suffered more than any other player at the hands of critics. Heurelho Gomes, a keeper who switched from the sublime to the ridiculous with revolving door frequency, is my particular favourite example of the dual nature of goalkeeping efforts.

We can see the perilous nature of goalkeeping by looking at the rate of errors charged to players in the Premier League this year:

[Table: errors]

An excellent post from @shots_on_target last year looked at repeatability levels for errors, and found there was none. This follows simple logic: they are a non-regular event, and what defines an “error” is at least arguably subjective. None of this seems to have stopped players living or, more relevantly, dying by their error rates though, as parts of the table show.

Topping the table are two goalkeepers, Szczesny and Mannone, who have been cast out by their coaches. But as we see quickly, and given that it is a defensive measure, unsurprisingly, the centre back is particularly vulnerable too. Poster boy for “transfers that didn’t go as hoped” Dejan Lovren had a horrific start to the year, so much so that he has struggled for minutes since and has a very low reputation amongst fans. Is he so bad? I’d suggest he hit a bad patch trying to settle into a new team and is unlikely to repeat these levels.

This list is chock full of players that you can see have had difficult seasons: old players maybe losing their legs (Distin, Barry), players struggling to adapt to new teams or leagues (Lovren, Fazio, Dier, Duff, Moreno), the usual crowd of keepers (Krul, Green) and players hung out to dry by a failing system, but possibly not good enough (Mason).

Quite simply, I think errors inform selection decisions and on occasion time, patience and more context are probably required. Players do not become bad because they commit a series of errors in a short space of time, but in the cut-throat world of football, sometimes they are denied the opportunity to prove otherwise. And the centre back is vulnerable to the vagaries of their position.

So what am I focusing on errors for? Well, last night’s Champions League fixture between Bayern and Barcelona was a curate’s egg for Bayern centre back Medhi Benatia. It started encouragingly for him as he scored a delightfully placed header to give his team the lead but then events took a turn for the worse. After being left calling for offside as Messi threaded a ball to Suarez who fed Neymar to score, he made one genuine error in charging out to challenge an already marked Messi, leaving Suarez to dash clear and again feed Neymar. The piece de resistance was revealed via replays as Suarez outrageously spun him with a flick for the ages.

And this was all before half time! Poor Medhi Benatia! At half time, Paul Scholes on ITV rounded on Bayern’s defense and particularly Benatia and even many hours later, as I write, Alan McInally is on SSN telling me that Bayern are “defensively (…) nowhere near where they need to be to compete at the top end of European football”. I’d contend the 114 goals shared amongst Messi, Suarez and Neymar this season could also be included as counter evidence to this knee-jerk analysis.

So what if I tried to convince you that beyond the obvious error for the second goal, Benatia had a really good game, indeed that he made contributions that marked out his performance as an exceptionally rare feat? Benatia’s game brings to mind the quarterback who passes for 400 yards yet tosses up an interception to cost his team the game. It happens, it’s what gets remembered but it is rarely the whole story.

It was noted by site owner, Ted Knutson, that Benatia had attempted four shots, made eight interceptions and succeeded with four tackles and that this was likely a rare combination of feats. It was: you have to go back some years to find a comparable numerical performance. Now whilst the defensive measures noted here, much like errors, are not noted for their usefulness in predicting future performance, they do offer a descriptive measure of involvement. Benatia was facing probably the finest front three in recent memory, and he was working relentlessly hard to repel them. Meanwhile he was also a danger at the other end, availing himself of four shots and a goal. He also made ten of Bayern’s 17 clearances, suggesting his positional lapse was momentary not permanent. And I think we can forgive his role as victim in the first goal: the combination of Messi’s pass with the timing of Suarez’ run would beat any defense. However, Benatia’s legacy from this game is likely to be negative.

And this is the wider point. The speed of the modern game has damaged the perceived reliability of the centre back. Ten years ago, defensive systems involving two defensive midfielders offered superior centre back protection. The advent of high lines, a widespread abandonment of 4-4-2 and ever more focus on attacking full backs has left the centre back more isolated. They now appear more error prone than before. Just recently, @footyintheclouds expressed surprise to me that Gary Cahill reached the PFA team of the year alongside the one genuinely solid centre back in the Premier League, John Terry, and when pressed for solid worthy alternatives, I could think of none. The centre back is less happy with his lot than he once was.

Next time your centre back struggles to keep pace with Luis Suarez or falls at the feet of Messi, it’s worth remembering that their task may well be increasingly weighted against them from the start and that their qualities are being limited by systems and the evolution of the game. They are better than they might look. Medhi Benatia sure is.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Thanks for reading!

Follow me on twitter here: @jair1970

Are Mourinho’s Chelsea brilliant or riding a huge wave?

Every year some teams seem to click.  They embark on long undefeated runs and record impressive victories.  The media get excited and players win awards.  They might even win titles.  On some occasions this is an accurate reflection of a team’s quality, but on others it’s a comparatively brief flirtation with the desirable end of random variation.  So this year, are Chelsea a decisively good team or are they just riding a huge wave? Let’s find out.

How good are Chelsea?

It was easy to get irritated by some of the media fawning over Chelsea as they skated happily undefeated through Autumn.  Fabregas fed Costa repeatedly and the desire for a new team of ‘Invincibles’ led to a lack of wider critical analysis.  Despite Mourinho’s CV being full of changes effected at a pace akin to the Fonz’s ability with jukeboxes, this time it had taken longer and only the shrewd recruitment of top class players, a neat trick available to very rich clubs, had seemingly raised Chelsea’s performance level.  All hail Jose!

I had reservations. Chelsea were good, but despite making their annual meal of the Champions League and throwing in the odd duff performance, so were Man City. In fact, so are Man City. Most recently, their impressive limitation of Chelsea to three shots at Stamford Bridge was a clear indication that this Chelsea team were maybe not the second coming. Samir Nasri, ever the shrinking violet, agreed:

“I’m a big fan of Mourinho but I am not impressed with his team, not at all, we have played them twice, one time with 10 men, and we were better than them. Then we played them at the Bridge, and we were better than them. When I have seen some of their games, I don’t think they are that fantastic – they are just strong and have a good striker up front.”

So, where are we? Nasri’s biased slight isn’t provably untrue, yet Chelsea sit contentedly and clearly on top of the league. City follow at a respectable distance. Why?

Sometimes the glaringly obvious can stand clearly in front of you yet appear invisible.  A finely honed contrariness and a tendency to try and look for nuance can leave you peering around the sides and neglecting the emphatic truth.  Then you refocus and it hits you like a brick.  Such a wave of simple realisation occurred to me earlier this week as I was pondering this season’s Premier League title race.  Chelsea are that team.  Every year we get at least one, and this year it’s Chelsea.

What the hell am I on about?

Okay: a Premier League season consists of 38 games, which in the scheme of trends and variations is a small sample.  Each season, within that small sample it is possible to identify teams that have over or underachieved when compared to what might be expected from baseline metrics.  Most often, due to proven repeatability, these will be based around shots and their relationships with goals and/or points.  What we’re looking at here is the top of the league and each season, of the contending teams, one or more learns to fly.

It’s relatively straightforward to build a model or translate existing metrics into points totals that reflect these underlying truths, without delving into the complexities and extensive data collection required for expected goals (James Grayson is the flagbearer in this realm).  I’ve looked at the total shot ratios, adapted the same to incorporate conversion rates, then shots on target ratios and finally a version of Grayson’s Team Rating.  This creates subtly different results and rather than give a single points figure, given we’re dealing in estimation, we get a points range.  When comparing actual points to derived expected points totals, the most overachieving sides of the last 5 seasons are as follows:

[Figure: overachiever rank]

We have representatives from each season here, 4 of 5 title winners and 3 of 5 runners up.  Chelsea’s phenomenal 2009-10 title side ranks a clip behind (however, in raw terms, that team is way ahead of all other teams on the list).  Also close up is the 2012-13 City side that meandered as Mancini edged towards the exit whilst Ferguson’s last team flew.  Interestingly, the only top 2 team that has a mediocre ranking is Ancelotti’s 2010-11 Chelsea side, but with 71 points, it was a particularly low total for second place.  Man Utd had it easy that year. Otherwise we have:

  • Four Ferguson Man Utd teams, with little surprise that his final two head the list
  • The Man City team that memorably dueled and beat Utd in 2011-12
  • Three teams from last season.  Arsenal overachieved significantly against, for them, moderate underlying numbers and were nominally in the title mix for about two thirds of the year, Liverpool had a phenomenal attacking season but were pipped by Man City, who matched them going forward and held a decisive advantage defensively.
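The shot-based approach described earlier (several ratio metrics, each producing a slightly different expected points total, and so a range rather than a single figure) can be sketched roughly in Python. Everything below is an illustrative assumption: the function names, the shot counts and the per-model points estimates are hypothetical, and the step that turns a ratio into expected points is deliberately left out because the article does not spell it out.

```python
def share(for_count, against_count):
    """Ratio of a team's own events to all events in its games.

    With shots this is a total shots ratio; with shots on target it is
    a shots on target ratio.
    """
    total = for_count + against_count
    if total == 0:
        raise ValueError("no events recorded")
    return for_count / total


def points_range(estimates):
    """Lowest and highest expected points across a set of models."""
    values = list(estimates.values())
    return min(values), max(values)


def overachievement(actual_points, estimates):
    """Actual points minus the expected range, as a (low, high) gap."""
    low, high = points_range(estimates)
    return actual_points - high, actual_points - low


# Hypothetical season totals, purely to show the arithmetic.
tsr = share(600, 400)    # shots for vs shots against
sotr = share(210, 140)   # shots on target for vs against

# Hypothetical expected points from four models of the kind listed above.
example_estimates = {
    "tsr": 74.0,
    "tsr_with_conversion": 79.0,
    "sotr": 76.0,
    "team_rating": 78.0,
}
example_gap = overachievement(86, example_estimates)
```

Reporting the gap as a pair keeps the uncertainty visible: in this made-up case the team beat expectation by somewhere between 7 and 12 points, depending on which model you trust.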

What has this got to do with this season?

This year we have a functional duopoly of Chelsea and Man City, two teams well matched by skill but divided by small degrees of random variation. In a neat reflection of core performance indicators, after twenty games, Chelsea and Man City had accrued identical records.  Since then, their results have diverged and if we again look at projected points, we might be surprised that Man City’s totals bear greater resemblance to expectation than Chelsea’s.  Projecting forwards, Chelsea 2014-15 are quite likely to end up ranking 3rd, behind only Ferguson’s last two squads at the top of the original table.  This is an extremely high positive variance and when considering it is building on a contending base, it is highly likely to generate the title.

We can see the top 8 positive variances for this season here:

[Figure: overachiever 2015]

From this, and knowing how similar City and Chelsea’s underlying numbers are, we can estimate Chelsea’s advantage caused by random variation (or even luck depending on your inclination!) to be somewhere between 3 and 9 points. That they currently lead by 7 bears this out quite tidily.

A red-hot striker helps

This year Chelsea have Diego Costa finishing chances at a rate of 28%.  This is incredibly high, literally ‘Papiss Demba Cisse has hot feet’ high.  His last two seasons were 25% and 20% which implies he may be an above average finisher anyway, rather than a Cisse-esque streaky converter.  Liverpool and Man City last year had Suarez, Sturridge, Toure, Dzeko and Aguero all converting chances at extremely high rates. (In fact, Dzeko consistently appears high up in conversion rates, his long absences this term may well have impacted on City’s effectiveness.) Ferguson’s Man Utd teams had combinations of Berbatov, Rooney and Van Persie all scoring at high rates.

Having a hot striker contributes massively towards the ‘for’ end of ‘conversion dominance’: the rate at which a team converts minus the rate at which they concede. Here are the same teams from the original chart plus City, Utd and Chelsea from this season:

[Figure: conv dominance]

All of the teams in the initial chart converted chances at a far higher rate than their opposition. This season Chelsea have a conversion dominance of around 6 percentage points (14% for, 8% against); Man City’s is 2 percentage points (12% to 10%). Simply, Man City have not performed like Champions in this metric. Interestingly, part of Man Utd’s continuing success despite performances can be explained here. Van Gaal has followed Ferguson in this respect. Only time will tell if he’s able, like Ferguson, to sustain it over multiple seasons, a trend which possibly suggests tactical or systematic methods.
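As a quick worked example, conversion dominance as defined above is just one subtraction. The Python sketch below uses the rounded percentages quoted in the text; the function names are my own, and the results are only as precise as those rounded inputs.

```python
def conversion_rate(goals, shots):
    """Goals scored per shot taken."""
    if shots == 0:
        raise ValueError("no shots recorded")
    return goals / shots


def conversion_dominance(rate_for, rate_against):
    """Conversion rate for minus conversion rate against, in percentage points."""
    return round((rate_for - rate_against) * 100, 1)


# Rounded rates quoted in the text.
chelsea = conversion_dominance(0.14, 0.08)   # 6.0 percentage points
man_city = conversion_dominance(0.12, 0.10)  # 2.0 percentage points
```

The difference is expressed in percentage points rather than as a ratio, which matches how the figures are quoted here.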

Get shots on target

The top 4 teams in my database for turning their shots into shots on target are the top 4 teams from the initial graphic. If Chelsea maintain their levels (38%), they will rank 3rd. Chelsea’s current rate of dominance in this metric in relation to their opposition is 2% higher than anything else I’ve got recorded. They get their shots on target and they stop the opposition doing the same at a very high and likely unsustainable rate. Grayson’s Team Rating already cleverly incorporates a similar concept into its calculation and it’s clear that it is an important influence.

Again, we find Van Gaal’s Utd scoring highly here, at least on the plus side. They rank 2nd in the league for turning shots into shots on target (37%) but intriguingly give up shots on target at nearly the same rate (36%). One might expect this to indicate potential problems but when those same shots on target are converted at 37% For and 25% Against, it looks like a big thank-you to David De Gea.
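To see how those two layers combine, here is a short Python sketch that multiplies the share of shots on target by the share of shots on target that become goals. It uses only the rounded Man Utd percentages quoted above, so the outputs are rough illustrations rather than figures from the underlying database, and the function name is an assumption of mine.

```python
def goals_per_shot(on_target_rate, on_target_conversion):
    """Overall conversion: share of shots on target times share of those scored."""
    return on_target_rate * on_target_conversion


# Rounded Man Utd rates quoted in the text.
utd_for = goals_per_shot(0.37, 0.37)       # roughly 13.7% of shots scored
utd_against = goals_per_shot(0.36, 0.25)   # roughly 9% of shots conceded

# Share of opposition shots on target that did not end up as goals.
implied_stop_rate = 1 - 0.25

# Gap between the two overall rates, in percentage points.
utd_gap_points = round((utd_for - utd_against) * 100, 1)
```

Near-identical shots on target rates at both ends still leave a gap of several percentage points in overall conversion, and nearly all of it comes from what happens after the shot is on target.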

Any hope for Man City?

Man City are, again this year, top class. Their issues, which start with the seven point deficit, could well be unsolvable given the tiny sample of 13 games remaining. They simply haven’t turned their clear quality into enough points. In nearly every game in which they’ve dropped points, they’ve been the better team and have generated more shots and chances than the opposition. The random variation up until this point has given Chelsea what could well be a decisive lead and the only slight hope I can muster revolves around the Champions League. Mourinho seemed to prioritise the Champions League last year despite being in league contention; they lost to Villa, Palace and Sunderland in weekends prior to three of their six knockout fixtures and of course, famously won despite fielding a weakened team at Liverpool prior to another.

Whilst Man City also remain in the Champions League, their route is heavily blocked by Barcelona.  Chelsea’s competition nous gives them an edge over Paris Saint Germain and if they progress, their fixtures become interesting. A quarter final tie will give them a midweek/weekend run of:

Champions League Quarter Final Leg 1

(H) Man Utd

Champions League Quarter Final Leg 2

(A) Arsenal

whilst a semi final would create this:

Champions League Semi Final Leg 1

(H) Liverpool

Champions League Semi Final Leg 2

It is ironic that Man City’s hope of rechallenging for the title may rest on their own departure from the Champions League and the continued progression of Chelsea in the same competition.

Conclusion

The statistically best team of the enlightened stat era (2009-14) was, by some margin, Ancelotti’s 2009-10 Chelsea team. They scored the most goals, took a higher percentage of shots and shots on target, prevented the opposition doing anything much at all and only the sustained brilliance of Man Utd under Alex Ferguson stopped them winning the league with ease. This Chelsea team is a clear step below and maps very similarly to this year’s Man City side. Both are clearly a step above the opposition in the league this year but compare only adequately to former title winners. In a season in which shot totals and goals have reduced year on year, defenses have defined the main storylines in the league, Southampton and Man Utd particularly.  One can speculate as to factors involved; World Cup hangover, increased focus on the financial incentives or even Financial Fair Play but with the pragmatist’s pragmatist in charge of the leaders, it is no surprise that efficiency has come to the fore.

Thanks for reading!