Ten Weeks In The Premier League

HAIYANG, CHINA - JUNE 16: Trang D Huang of Vietnam misses a save during the Men's Beach Soccer Preliminary match between Vietnam and Palestine on Day 0 of the 3rd Asian Beach Games Haiyang 2012 at Fengxiang Beach on June 16, 2012 in Haiyang, China. (Photo by Ryan Pierse/Getty Images)

Ten weeks? Fair to say it has flown by. Some things have been very familiar: the usual weekly rotation of "this team could well be set for a title charge" each time a big gun records an impressive result; the bottom of the table becoming less a race to become good and more a desire to be least worst, in which few teams are managing to conceal their ineptitude; or Romelu Lukaku reliably scoring goals while grizzled bystanders look on, pipe in hand, sagely noting his inferior first touch.

Is anything different? This time last year I was writing about the "rise of the middle order" as usurpers like West Ham and Leicester did the usual thing a couple of mid-rangers do: run hot for a dozen games and get "talk of European places" bestowed upon them. Of course that...er... came to nothing as per usual, as in my imaginary world Leicester dropped back to a plucky 6th place finish and West Ham blew up completely. This year: nothing. It's too late to describe Southampton putting up decent shooting numbers as anything other than expected, and Everton, finally rid of the bizarre helming of Roberto Martinez, are tidily reverting to their best of the rest role. The ninth best team in the league could be Crystal Palace? Let that sink in.

Do or don't write off Leicester?

Back here in the odd version of reality christened "2016", it's painfully obvious that Leicester are prioritising the Champions League. Let THAT sink in. They are putting together some genuinely wretched numbers in the league; giving up an extra 2.5 shots on target per game is seriously weak.
Expected goals is kinder to them--for all that they are a couple or three goals behind pace--and it's likely that woeful accuracy (27% of all shots are on target, 3rd worst in league) will pick up over time, but it's less a title defence so far and more a farewell tour. And within the accuracy stats are further insights: Leicester's ability to land their own shots on target and prevent the opposition from doing the same was at "top five teams this decade" levels last year (+7.3%), whereas so far this year it's in the basement (-10%). This is the kind of stuff that bounces around the place and is subject to huge variance, and now, with some aplomb, they've landed on the wrong side of it. Their all shot conversion is flat too, which is novel for them, and the last beacon of positive variance to cling on to is that the few shots they are getting on target are flying in (40%) just like the good old days. Of course, in the bigger picture, this team will be fine and will find a safe place in mid table somewhere, but throw all that into the mixer and it's a different, far more familiar world. Apart from the part where they top their Champions League group.

West Ham

West Ham, so drunk on the spoils of 2015-16, are also experiencing a huge hangover that no amount of paracetamol, pre-sleep water consumption or fried breakfasts will fix. Like all good hangovers, time should do something to help matters, but something nagged me about Sofiane Feghouli and Simone Zaza jogging onto the pitch late on against Everton and providing as much impact as if I'd been out there. It felt like witnessing the West Ham of old, when Harry Redknapp used to fill his benches with seemingly random signings and loanees then throw them on late to "try something", and rarely did it work out. Is that where West Ham could end up? They must be better than that, right?
I digress: numbers-wise, they are offering a similarly diverse palette to Leicester, except their deficit is at least a handful of goals and their shot stats are plain wonky. They are outshooting their opponents around 15 to 13 per game, yet are registering only three on target opposed to five. Huh? This means that their rate of getting shots on target (19% of all shots) is league low--and comparatively very bad--while the rate their opponents are getting their shots on target (40%) is league high. That is a potent mix which feeds into some odd conversions: those on target are flat--which does you no good when you're outshot there so severely, while the all shot rates are terrible on the front end and also terrible in defence. There is so much that is extreme here for now that they will also inevitably land long term in some comfy middle ground. Their season will have been deemed "challenging" and the ground will get blamed for things it maybe shouldn't be; 14% shot accuracy at the Olympic Stadium means they are exploring every tier. The top Considering a point separates the top four and Tottenham are unbeaten and two points further back, it's only Manchester United who can be considered truly underachieving of the big six teams in the league. Eighth and with a 4-3-3 record  is all a bit Moyes van Gaal so far, but a wee bit of misfortune has been enough to derail them and the reality is not quite so grim. Home fixtures against Stoke and Burnley yielded 61 shots, with twenty on target and only fourteen with seven on target in reply. While I can accept that the huge volume created is at least in part a function of trying to break a deadlock, that kind of total dominance rarely ends up in two draws. Zlatan Ibrahimovic's recent drought has no doubt contributed to the early difficulties, as has a jumbled squad, but if United had turned those draws into wins, as must surely have been probable, they'd be a point off Tottenham and four off the lead. 
Jose Mourinho would still be smarting at the decency--or lack of-- that Chelsea showed in thumping them 4-0, but they would be in the mix rather than the forgotten ingredient they have become instead. If, but, if but... yes, I know, but the simple truth is performance metrics are far more content with Manchester United than a plus-one goal difference might imply. In fact we can take a tour around shots and expected goals metrics and ascertain quite a lot about all of the teams in the top four discussion. We cannot ascertain the title winner, and i'll leave the probability fiends the impossible task of rating and weighting the league this far out, but we can identify a few strengths and weaknesses so far. Shots All of the big six teams are putting up dominant shots numbers. Liverpool are over plus-ten per game, Man Utd, Chelsea, Man City and Tottenham are all between plus-seven and plus-nine, while Arsenal are on plus-five. None is conceding more than United's 10.5 per game and Liverpool, United and Tottenham are all taking over 18 per game. This is strong across the board. For City, Tottenham and Liverpool, this backs up good 2015-16 numbers. For Arsenal, this top layer--like 2015-16--only tells some of the story and for Chelsea and Man Utd, the need for them to bounce forward from last season was very necessary, and at week ten, they've managed it. Expected Goals There will always be differences across models here, but we can get a general guide to how the shots have manifested themselves. Here, in attack, I have Liverpool only very slightly ahead of a tight pack of Arsenal, Man Utd, Man City and Chelsea while Tottenham are a couple of goals behind them. In defence, Man City and Chelsea are clear best, followed by Liverpool, then Utd, Arsenal and Tottenham. These changes in position from the shot rankings come out if we look at expected goals per shot: Arsenal have a high expectancy (0.113 per shot)--very much like last season--and Tottenham do not (0.081). 
The other teams are oscillating around average in attack. In defence, the good end has Man City (0.080) lead Chelsea (0.084) and Man Utd (0.087) and the bad end has Liverpool, perennially vulnerable to any kind of ball to the centre of their box, on 0.112 and Tottenham, 0.105. This means, for good or for ill, overall Man City lead Chelsea, Liverpool, Man Utd and Arsenal with only very small differences between them, enough that one or two good or bad games could completely change the order. Tottenham are a clear sixth here with Everton and Southampton snapping at their heels. Of course schedule will have some effect on all these numbers but at ten games it's starting to shake out a little. Throw all that together and you get this: xg-delta           Again models will vary, but the trends should remain similar. Arsenal are running hot on both ends, while Utd are ice cold for both and the variation between a plus-13 goal difference and plus-one couldn't be more stark. Tottenham's defence is overachieving by a huge margin (5 goals conceded compared to an estimated 10.5) while Liverpool's defence is the opposite. As an aside, save percentage is similarly informative here with Liverpool at 57% and Tottenham at 83%. The two teams that have clear solid profiles and appear to be getting rewarded for this are Man City and Chelsea. Both are enjoying a small skew in attack and are on par in defence. It's a tricky estimation but by looking at all these aspects we can spot trends quite easily: Tottenham's attack is a little too "AVB" at the moment (shots from range, little penetration), Liverpool's profile is a bit "Rodgers 13/14" with the same strengths (attack) and weaknesses (defence). Arsenal look to be once more chasing the god of shot quality, Conte has made Chelsea's defence robust and City are typically good--a positive skew would serve them well, as it has done in previous titles. And Man Utd, well, even when the structure works, fate conspires... 
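For readers who want to see the arithmetic behind the running hot or cold comparison, here is a minimal sketch. The function names are illustrative shorthand rather than anything from a specific model, the save percentage uses the simple "shots on target that did not go in" definition as an assumption, and only the Tottenham defensive figures (5 conceded against an estimated 10.5) come from the text.

```python
def xg_delta(actual_goals: float, expected_goals: float) -> float:
    """Actual minus expected goals (positive = more goals than the model expected)."""
    return actual_goals - expected_goals


def xg_per_shot(expected_goals: float, shots: int) -> float:
    """Average chance quality: expected goals divided by shot count."""
    if shots <= 0:
        raise ValueError("shots must be positive")
    return expected_goals / shots


def save_percentage(shots_on_target_faced: int, goals_conceded: int) -> float:
    """Share of shots on target that did not go in (simple definition, assumed here)."""
    if shots_on_target_faced <= 0:
        raise ValueError("shots_on_target_faced must be positive")
    return 1 - goals_conceded / shots_on_target_faced


# Tottenham's defence, using the two figures quoted in the text.
tottenham_conceded_delta = xg_delta(actual_goals=5, expected_goals=10.5)
print(tottenham_conceded_delta)  # -5.5, i.e. 5.5 fewer goals conceded than estimated
```

On the defensive side a negative delta is the "overachieving" direction, which is why the sign needs reading with care when attack and defence are shown on the same chart.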
What it means is that, by some strange confluence of events, the table isn't actually lying much at the top end. Apart from United's woes, there's barely a cigarette paper between the rest and it is pretty much in line with expectations. After spending so much time behind expectation last season, it's even possible to give Arsenal a break for bouncing the other way this season, and they will always be strong. It's all set up quite nicely.

____________________

Thanks for reading. More to come this week on site, so stay tuned. I will attempt to write another of these somewhere along the line, but these days I'm a bit busy.

Ligue 1's Tale of the Tape

ligue-1

A glance at the Ligue 1 table reveals some surprises at the top. Nice are first with a sparkling GD of +13. Toulouse are 4th and since Pascal Dupraz took over late last season they’ve had the statistical profile of an above average side. PSG are meandering in a way that’s not been seen since segments of the 2014-15 season while Lyon have looked dominant at times yet find themselves with only 13 points in 10 games. In comparison to the cakewalk that was Ligue 1 last season, this season has been much more parity driven. While the logical conclusion is still PSG winning the title, the roadmap to that destination could be much bumpier than some expected. This is good news for a league that’s gained a reputation for being PSG’s playhouse. With that in mind, let’s look at how some of the bigger teams in France have fared so far.

Nice

There are three guarantees in life: death, taxes, and a Lucien Favre led team overperforming relative to expected goal models. He did it multiple times with Gladbach and now he’s doing it with an odd squad at Nice, who models project to finish in the final CL spot, something no one would’ve expected coming into the season.

It’s quite amusing to see Nice overachieve in a completely different way compared to last year. Last year, Nice managed it in attack with Hatem Ben Arfa and Valère Germain going on conversion benders and Ben Arfa in particular having by far the best season of his career. This year, they’re doing the opposite by having a save percentage over 80%. With the league having a defensive reputation, it’s not totally surprising to see teams conjure up such 10 game splits. Between 2009-14 alone, nearly 20 teams had 10 game splits of above 80% shot stopping. This isn’t to say it will last though, because the likelihood is that it won't.

Another thing that seems unsustainable is Mario Balotelli. And not because what Balotelli is doing isn’t good, because it is. He’s shooting over 5 times per 90 and several of them from good areas. It’s just that it’s been over three years since we saw this version of Balotelli: the shot hungry mercenary who scores enough goals to warrant the amount he attempts. Nice did rehabilitate Ben Arfa last year so I guess maybe something’s in the water over in the French Riviera for one-time football prodigies trying to reclaim their careers.

There is plenty of good stuff elsewhere in the team too. Their midfield trio has continued to be excellent despite the loss of Nampalys Mendy, Ricardo Pereira is one of Ligue 1’s best full/wing backs, and Alassane Plea has emerged from the shadows as one of Ligue 1’s better attacking players. I wrote about Plea last season and while I did like him, my fear with him was how he would cope without the assistance that his departed attacking partners gave him. So far, he’s been awesome and quelled any such fears of potential snags. Another good thing for the team is that while they allow a crap ton of shots, the shots they concede are among the lowest in quality per shot in the league.
Personally I’m still not totally convinced that Nice will finish in the top three, probably because I’m still clinging to the belief that a fully functional Lyon side will return sooner rather than later. But an eight-point gap over 4th place Toulouse and double Lyon’s current total leaves some room for error, and perhaps enough to sneak into the Champions League.

Monaco

Here are two things about Monaco so far this season:

  1. They are unquestionably in a better place currently than they were at any point last year. At this time last year, they were a team trying to incorporate a more proactive pressing style that basically just exposed how old and slow their backline was. Their stable of attacking players now is exciting, all their underlying numbers are considerably improved and they have the look of a team that will finish somewhere in the top three.
  2. They have a super high goals/SoT rate of over 50% that has assisted their sky-high GD of +15. No team can ever maintain such a beneficial conversion bump throughout the season and there will be an eventual regression to the mean. While they did beat PSG, they have also been stomped by both Nice and Toulouse, which doesn’t exactly scream genuine title contender.
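To put a rough size on what regression to the mean would do to a conversion bump like this, here is a small sketch. Every number in it is hypothetical: the goal and shot-on-target counts are invented to give a rate just above 50%, and the 30% league-average rate is an assumption for illustration, not a figure from this piece.

```python
def goals_per_shot_on_target(goals: int, shots_on_target: int) -> float:
    """Goals divided by shots on target (the goals/SoT rate)."""
    if shots_on_target <= 0:
        raise ValueError("shots_on_target must be positive")
    return goals / shots_on_target


def goals_at_rate(shots_on_target: int, rate: float) -> float:
    """Goals the same shot-on-target volume would yield at a different rate."""
    return shots_on_target * rate


# Hypothetical team: invented counts, chosen only to land above 50%.
goals, shots_on_target = 26, 50
ASSUMED_LEAGUE_RATE = 0.30  # assumption for illustration

current_rate = goals_per_shot_on_target(goals, shots_on_target)
regressed_goals = goals_at_rate(shots_on_target, ASSUMED_LEAGUE_RATE)

print(f"current rate: {current_rate:.0%}")
print(f"goals at the assumed average rate: {regressed_goals:.1f}")
print(f"goals owed to the conversion bump: {goals - regressed_goals:.1f}")
```

The point of the sketch is only the shape of the calculation: hold shot-on-target volume fixed, swap in a more ordinary rate, and see how much of the goal difference depends on the bump.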

PSG

Fundamentally, PSG are not broken, yet are obviously not quite as dominant as they were last year. That’s in part due to the rest of the league providing stiffer competition. They’re still the most talented team in the league, they create on average more high quality chances than anyone else and so far all their conversion rates on both ends skew quite favorably. While it is amusing to see them 6 points behind the top of the table, two years ago the same thing happened when Marseille built up a similar lead. In the long run this is still PSG’s league to lose.

There are some signs however that it’s not clicking yet. While nine times out of ten they would have beaten Marseille last Sunday with the type of shots they created, last year these were the type of matches that PSG would kill off. There were still noticeable problems with how PSG attacked space in behind OM:

https://twitter.com/flotoniutti/status/791003082117279744

Moreover, some of the ways they have tried to redistribute goals in the wake of Zlatan Ibrahimovic’s departure are suspect to say the least. Ben Arfa was good last year but it was an anomaly compared to the rest of his career. Jese produced very nicely in spot minutes in Madrid, but he played on a super team mainly as a sub, which we know gives a big boost to attacking performance. When those are your two biggest acquisitions in attack, it’s no wonder that the ceiling is lowered. Again, this is still the best team in the league, and nothing statistically screams that PSG will be usurped even with the acknowledgment that there’s slightly more fragility within them.

Lyon

They were my pick in preseason to give PSG their hardest test this season and through the first few weeks, they looked exactly as advertised. Alexandre Lacazette was scoring goals for fun while Nabil Fekir looked good coming back in his first full season since his knee injury. Then injuries happened, a switch to a back 3/5 formation occurred and Lyon haven’t been the same since.
It must be said; Lyon aren’t anywhere near as bad as the league table suggests. In fact, all their shot numbers suggest that they should be considerably higher up the table than currently present. They’re actually 1st or 2nd in regular shot numbers and expected goal numbers but the rub comes in where their conversion rates on both sides are considerably below average, particularly their save% at a brisk 61%. They just lost to Guingamp 3-1 despite only giving up 6 shots, albeit they were high quality opportunities. Variance has not gone their way. There is also a developing trend going on with Lyon though and it’s this: When they play a 4-3-3, they look considerably better visually in terms of fluidity and the ability to progress the ball into dangerous areas (their performance versus Caen being the best example). When they play that hybrid back 3/5, it’s considerably jaggier. Their performance in the CL against Sevilla was an example of that and slowly but surely, it’s starting to show statistically as well (albeit there’s a legitimate caveat in terms of sample size). https://twitter.com/Birdace/status/789822467393064960 Now here’s some good news: Almost everyone relevant to the team is healthy, and we saw during the first few games this season that a healthy firing Lyon side playing a 4-3-3 has the best attack going, spearheaded by arguably Ligue 1’s best striker in Alexandre Lacazette. Anthony Lopes should eventually stop shots at a higher proficiency considering his performances over the past two seasons and while they did play an easy schedule, they should still have enough to eke out Champions League qualification. But there’s a very real chance Lyon missed a golden opportunity at producing a genuine title challenge. Again, their schedule has been quite favourable as they haven’t played PSG or Monaco or even Toulouse yet. 
Plus, at this time last season it could very well be argued that Caen were the 2nd best team in Ligue 1, which said a lot about the strength of the league without PSG. This isn’t the case now but having to make up a 13-point gap at the top and 7-point gap for the final CL spot will be quite hard to do in a beefed up league this time around.

Honorable Mention

Pascal Dupraz has been Toulouse manager for 20 games. Not the greatest sample size I'll grant you but this is how Toulouse have done since then:

  • TSR% of 53.3%
  • SoTR% of 62%
  • Big Chance% of 73.1%
  • Only Nice (48 points), PSG (43 points) and Monaco (37 points) have secured more points than Toulouse (36) over the same time span

Translation: Toulouse are probably good? After years of searching, they stumbled upon a teenager in goal who can actually stop shots proficiently, while Martin Braithwaite has shown so far that he doesn’t have two cinder blocks masquerading as feet, and not before time:

Year (Age) | Non-Penalty Shots P90 | Non-Penalty Conv%
2014-15 (23) | 3.06 | 8.3%
2015-16 (24) | 3.22 | 8.5%
2016-17 (25) | 3.71 | 19.2%

Who knows how long this will last? While Toulouse have some genuine talent, including an entertaining Swiss army knife midfielder in Oscar Trejo, like a lot of Ligue 1 sides depth is an issue. The potential downgrade from Braithwaite to someone like Ola Toivonen is severe and if history is any indication, even when healthy Braithwaite will probably stop converting shots to goals at this rate. But I like this team, their performance versus Monaco was super impressive (not least Braithwaite’s work) and if health is on their side, there’s no reason why they can’t give it a go for a top six finish.
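For anyone unfamiliar with the ratios quoted for Toulouse, TSR and SoTR are shares of the combined total: a team's shots (or shots on target) divided by its own plus its opponents'. The sketch below shows that calculation alongside a per-90 rate. The raw counts are hypothetical, picked only so the TSR lands near the 53.3% in the list, and treating Big Chance% as the same kind of share is an assumption.

```python
def share_of_total(count_for: int, count_against: int) -> float:
    """A team's count as a share of the combined total (for + against)."""
    total = count_for + count_against
    if total <= 0:
        raise ValueError("need at least one event")
    return count_for / total


def tsr(shots_for: int, shots_against: int) -> float:
    """Total Shots Ratio: shots for / (shots for + shots against)."""
    return share_of_total(shots_for, shots_against)


def sotr(sot_for: int, sot_against: int) -> float:
    """Shots on Target Ratio: same idea, restricted to shots on target."""
    return share_of_total(sot_for, sot_against)


def per_90(count: float, minutes_played: float) -> float:
    """Scale a raw count to a per-90-minutes rate."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return count * 90 / minutes_played


# Hypothetical counts, not taken from the article.
print(f"TSR:  {tsr(160, 140):.1%}")
print(f"SoTR: {sotr(62, 38):.1%}")
print(f"Shots per 90: {per_90(30, 900):.2f}")
```

A ratio above 50% simply means the team is taking the larger share of that event, which is why these numbers are a quick read on territorial and chance dominance.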

How Spurs and Liverpool's Presses Differ Plus Other Buildup Bits

Last week I looked at how different Bundesliga teams built up their attacks. There were lots of interesting takeaways and it feels to me like a rich area to mine for takeaways for fans, analysts, and coaching staffs. This week I'm going to keep looking around these stats and expand a bit to look at if we can see if certain types of build up are better, check out a few stark differences in play between the big 3 leagues and point out a few interesting bits and examples involving teams and players including some differences in how the 3 teams pushing the pressing narrative in the PL have played. Now remember it's only been 6 or 7 games so opponent strength probably plays some role here, but sample size should not be as big an issue like it is with shot stats for the most part. There are hundreds of passes from each of these zones. Let's dive in.   The Zones These are self-selected zones, you can quibble with how I chose these but it's what we are using in these pieces. Might want to keep this open in another tab to refer back to, because all of these stats involve these zones. snip20160930_1     Differences Between the Leagues The first big difference comes from zone 7, basically where goalies start play from. In the Bundesliga, teams build up going one zone at a time a lot more than in the PL. BuLi teams pass to zone 6 22% more often than PL teams, while PL teams hit long balls more often. snip20161018_11 The Bundesliga is not afraid of long balls though, they are the most aggressive league in forward areas. BuLi teams play 34% more long balls from Zone 4 than teams in La Liga. Another big difference comes in zone 3 where BuLi teams hit it into the danger zone 22% more often than PL teams, who play it within zone 3 a lot more than BuLi teams. snip20161018_12 The difference shows up as Bundesliga teams as a whole pass it into zone 1 from zone 3 more often, while English teams rarely ever do. 
Anyone with ideas about why there is such a stark difference between all 3 leagues when it comes to zone 3 passing, let me know in comments or on twitter because it's hard to think of something simple that can explain such a wild difference in play.

snip20161021_19

Spanish and German teams play a much higher proportion of their passes from zone 3 to zone 1, while English teams dominate the bottom of that same list:

snip20161021_21

Spanish teams play on the sides of the pitch massively more than either German or English teams. This list of teams who allow the lowest % of passes to zone 3 ending in the center is absolutely dominated by Spanish sides.

snip20161020_10

I love that there are these differences to find as the sport has developed differently from league to league, but it's still hard to figure out why. Anyone with a lead, please chime in.

Which is the best way to build?

It's a hard question and one I don't have the chops to come close to answering right now, but I can at least attempt to light a match to get a glance at this yawning darkness of a question. I simply looked at how many passes after a certain pass it took for a team to complete a key pass. There are big problems in a lot of places with this: ignoring unassisted shots, dribbles, stretching over multiple possessions, etc, etc, but it still reveals a few trends.

  1. Entering higher up zones from really long balls seems to be worse overall than doing so via short passes: it takes on average 62 passes to get a key pass after a long completion from zone 6 or 7 into zone 3 compared to 52, 55, and 56 respectively for passes from zones 3, 4, and 5.
  2. Completions ending in zones 5, 6, and 7 are not too different when it comes to how many passes it then takes you to get a key pass off. You start seeing significant drops from zone 4 onward: zone 4 is an 8% drop from zone 5 (5 fewer passes), zone 3 a 12% drop from zone 4, zone 2 a 17% drop from zone 3, and zone 1 a 57% drop from zone 2.
  3. And this is probably common sense but the closer to goal, the more valuable the center of the pitch is. Passes into the center of zone 2 result in a key pass 30% quicker than passes into the wide areas of zone 2 (generally where short crosses are played from). In zone 3 it's about a 10% difference and just 7% in zone 4 before the difference is completely erased by zone 6.
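The "passes until the next key pass" measure behind the three points above can be sketched roughly as follows. This is not the code used for the piece: the `Pass` record, its field names and the choice to skip passes that are never followed by a key pass are all assumptions made for illustration, and like the original method it counts straight through changes of possession.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Pass:
    end_zone: int      # zone where the completed pass ended (1 = closest to goal)
    is_key_pass: bool  # True if this pass set up a shot


def mean_passes_to_key_pass(passes: list[Pass]) -> dict[int, float]:
    """Average number of passes from each completion until the team's next key pass,
    grouped by the zone the completion ended in. Input is one team's completed
    passes in chronological order."""
    totals: dict[int, int] = defaultdict(int)
    counts: dict[int, int] = defaultdict(int)
    next_key_index = None  # index of the nearest key pass later in the sequence

    # Walk backwards so the nearest later key pass is always known.
    for i in range(len(passes) - 1, -1, -1):
        current = passes[i]
        if next_key_index is not None:
            totals[current.end_zone] += next_key_index - i
            counts[current.end_zone] += 1
        if current.is_key_pass:
            next_key_index = i

    return {zone: totals[zone] / counts[zone] for zone in counts}


# Tiny made-up sequence: two key passes, one into zone 1 and one into zone 3.
sequence = [
    Pass(6, False), Pass(4, False), Pass(3, False), Pass(1, True),
    Pass(6, False), Pass(3, True),
]
print(mean_passes_to_key_pass(sequence))
```

Splitting the same calculation by pass length or by central versus wide end location only needs an extra field on the record and a different grouping key.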

      Different Types of Pressing: Tottenham, Manchester City, and Liverpool These three are probably the three teams who have earned the right for "pressing" to be a part of their narratives pretty much each and every game. There even was an entire podcast completely about Liverpool's pressing last year. There are some subtle differences in how they've pressed this season. Tottenham push up higher than the other two to take away options and force opponents into long balls. When opponents are in zones 7, 6, and 5 Tottenham are among the teams who force the highest proportion of long balls. Liverpool don't force more long balls than average, City more than average but Spurs are among the European leaders, with teams like Barcelona, Dortmund, Sevilla, Bayern and...Crystal Palace? The only team forcing more long balls from deep in opponents half than Tottenham in the PL are Crystal Palace.   While City do try to take away options high, they aren't selling out to do so. In the early days of the Pep Revolution, where they are building their wall is right around midfield as teams move into their territory. Because without a wall on the border of your territory, are you really a team at all, folks? City's wall is luxurious, nearly impenetrable and getting 10 feet higher every week: no team across the top 3 leagues has forced opponents into a lower completion % moving from zone 5 to 4 than City at 63%. snip20161021_13 Nothing about where opponents pass the ball really jumps off the page for Liverpool so far, but how one category of how successful opponents are really leaps out at you. When opponents are trying long balls from the middle of the field against Liverpool (216 such passes so far), they are getting absolutely nowhere. snip20161021_14 The difference between Liverpool and #2 Athletic Bilbao in this category is equivalent to the difference between #2 and #11. Tottenham are close to the top here, but not quite at the extremes Liverpool are at. 
Why are they so good at stopping these types of passes? I'm not quite sure, but a strong center of the defense looks a decent candidate. Only 1 team (Las Palmas) has allowed fewer centrally completed long balls where center backs generally roam and no one has allowed a lower completion%. They are the anti-Swansea, who have horrifying volume and percentages against on these types of long passes, particularly on those up the gut.

snip20161021_16

So there you go. In the early going, while these teams all press, the types of passing that opponents resort to in the face of this pressing have had some interesting differences.

Standout Teams One Way Or the Other

Las Palmas

Their focus on short passing and spacing is beautiful to watch and results in them being a statistical standout. They refuse to make long passes toward goal at all and wind up around some big names (and Middlesbrough) when you look at teams that rarely try long balls toward goal.

snip20161014_19

Hull

Essentially any look into almost any stat will immediately show Hull are wildly in over their heads and essentially not a Premier League quality team. Their opponents are able to methodically move the ball zone by zone until they reach zone 3 where they can pass the ball around uncontested as they look for a way forward into the danger zone. In every single case, not only do Hull opponents go a zone at a time, they also have the highest completion%. In most cases, the gap is even larger for the completion %.

We can see Hull don't even have the ability to do any sort of Pulis-ball: look at the Zone 4 to Zone 3 chart. West Brom allow the lowest share of passes moving from 4 to 3 because they "stick" opponents in Zone 4 the most. No team in any of the top 3 leagues sees opponents pass within Zone 4 more often than West Brom's 55.1%. Hull can't even slow opponents there, not putting up resistance at any point until basically it's too late.

Leipzig

Sticking with the 3rd category above, one team who really stands out as great at forcing opponents wide is Leipzig.

snip20161018_9

Tom Payne's analysis about the Leipzig-Wolfsburg match inspired this chart and emphasizes how well Hasenhüttl's plan has been carried out this season. I thought he was coach of the year, Europe-wide, at Ingolstadt last year and his early work at Leipzig suggests it wasn't a one year wonder.

Standout Players

I looked at players who have the most pass entries to zones 2, 3, and 4.

snip20161020_6 snip20161020_5 snip20161020_7

Santi Cazorla shows up twice, which isn't too surprising if you saw my most common passing combo map a few weeks ago. He plays central passes in advanced areas and a ton of them. The Levels-Groß connection should have odes written about it for their sheer persistence in making it work over and over and over. You can really see the difference between the leagues in the leaders passing into zone 2. Groß is up there but as we saw earlier, the Bundesliga as a whole doesn't pass into this zone often and the individual leaders bear that out: only 3 players are among the top 50 (Groß, Kampl, Levels). Liverpool's territorial dominance is evident here with 3 players entering zone 2 and Henderson (despite his low %'s) right behind Cazorla in zone 3.

Further Questions

  1. How do these passing numbers change depending on game state? I suspect this type of piece would be one of the most revealing and informative pieces in recent years.
  2. When we add in carries and dribbles how much does all this change?
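Most of the team numbers in this piece are one of two things: the share of passes from a zone that go to each destination zone, or the completion % for a given zone-to-zone move. A minimal sketch of both is below. The `PassAttempt` record and its fields are hypothetical, and it assumes an incomplete pass is logged with the zone where it ended up.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PassAttempt:
    start_zone: int
    end_zone: int
    completed: bool


def destination_shares(passes: list[PassAttempt], start_zone: int) -> dict[int, float]:
    """Share of attempts from start_zone that were played to each end zone."""
    destinations = Counter(p.end_zone for p in passes if p.start_zone == start_zone)
    total = sum(destinations.values())
    return {zone: n / total for zone, n in destinations.items()} if total else {}


def completion_rate(passes: list[PassAttempt], start_zone: int, end_zone: int) -> Optional[float]:
    """Completion % for one zone-to-zone move, or None if it was never attempted."""
    attempts = [p for p in passes if p.start_zone == start_zone and p.end_zone == end_zone]
    if not attempts:
        return None
    return sum(p.completed for p in attempts) / len(attempts)


# Made-up passes an opponent attempted against one team.
against = [
    PassAttempt(4, 4, True), PassAttempt(4, 4, True),
    PassAttempt(4, 3, True), PassAttempt(4, 2, False),
    PassAttempt(5, 4, True), PassAttempt(5, 4, False),
]
print(destination_shares(against, start_zone=4))
print(completion_rate(against, start_zone=5, end_zone=4))
```

Filtering the input by game state, or appending carries and dribbles as extra records, would be one way to start on the two open questions without changing either function.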


I caused a bit of a kerfuffle on Twitter yesterday with my comments about football commentary, so much so that I thought it worth the time to explain it in longer format here. For those who are blissfully unaware, this is what I said: comms_1 comms_2 comms_3 comms_4 That last part was the real impetus for the complaint. My son watches Match of the Day religiously. Every Sunday morning, he sits captive in his pew, listening to the wisdom of the commentator pastors preach about football. Sitting there together Sunday morning, watching the Arsenal highlights, we heard this from Phil Neville: “Walcott should have scored 4, or at least 3…” MotD then walk through the first two goals, the first made via graft from Theo, the second mostly by being in the right place at the right time (though it’s a beautiful knockdown and turn). Then we get highlights of the next chance… “He’s GOT to score. It’s a simple chance.” walcott_chance_50res The shot is 7-8 yards out, off a headed pass. Walcott hits it cleanly, on the volley… WITH HIS LEFT FOOT. Now granted, it went straight at the keeper, but the problem here is this is anything but a simple chance. Let’s break it down by layers. heads_vs_feet_final First, it’s a shot with feet, 7 yards out. That’s between a 40-60% chance of scoring. However, it’s actually taken out of the air on the volley/half-volley from the side, which makes it closer to a cross. Now we’re in the 20-40% probability range, and probably at the lower bound because it’s a volley. And finally, it’s with Walcott’s weak foot. I’m an Arsenal fan and have been for about two decades now. There was definitely a period of time where I wasn’t sure Walcott HAD a left foot. Now this is a guess because the sample size is too small to actually calculate it, but let's say Walcott is probably half as likely to score from there with his left foot than his right. 
We have now gone from Neville’s “simple chance” (which was never actually simple) to at most a 20% likelihood of it being a goal. And this is my problem with commentary like this: It’s wrong. Not just off-by-a-little-bit wrong, or yeah, mostly-in-the-same-range-but-imprecise wrong. This is completely-misanalysed-what-happened-on-that-chance wrong. It would be more fair to say Walcott did well to keep the ball down and get it on target with some power than it is to say he missed an easy chance. The bigger problem here is that this happens all the time. We have commentary on football that doesn’t understand how the game actually works. I have heard commentators say goals should be scored off headers 12 yards away from goal. They must watch most of their football on some other planet, because that's not how football here on Earth actually works. Neville told the audience a very difficult chance was simple and that Walcott should have scored, neither of which is actually true. This is an opinion that is then easily refuted by data. Lest you think I am cherry picking, let’s fast forward to the analysis of the WBA v. Spurs game in the same show. Talking about Dele Alli, Shearer says, “I love the way he gets into the positions. He’s not afraid to miss them” (referring to shots). *nodding along* Cool. Me too. Dele is such a clever midfielder in and around the box and I really love his sense of spa… Shearer: “He’s got to work on his finishing.” Wait, what? Why? Why would you say that? Aside from the fact that Alli is young and all football players presumably work on their finishing in some way, why would Shearer explicitly say this about Alli? Unfortunately, we never find out. The show then cuts to show Alli’s goal, which just happens to feature an insanely good finish. alli_goal_lowerred Nine yards away, outside of the boot, across his body, in the corner and away from both the keeper and a defender on the line. The level of difficulty on this is incredibly high. 
Funnily enough, if you look at both Alli’s and Walcott’s goal scoring stats, you see that both actually score more goals than expected goal models predict. This is a pretty reasonable indication that both of these guys are at least above average when it comes to finishing chances. But the Match of the Day analysts are telling us Theo missed a simple chance and Dele Alli needs to work on his finishing.

We Can Do Better

I don’t think Match of the Day or general football commentary needs to become a bastion of precise statistical analysis. It's possible you could post the expected goal value of a shot as a pop-up graphic on the screen, but it's probably not necessary, and I certainly don't want Alan Shearer saying a shot from a certain location is normally a 43% chance of being a goal. However, I do think our commentators need to be less wrong. Telling kids that the Walcott chance was simple and “should” have been finished is incorrect. This happens constantly, and I think the reason it happens is that so many of our ex-footballers who are now commentators were poorly trained in what should and should not be a goal. The only way to overcome this is through education, and the best way is teaching via example. Get the data from their own finishing, or from that of their teammates, or current favourite players, or whatever, and walk them through the actual probabilities. I have done this myself with both footballers and coaches, and both groups have found the concepts very easy to grasp.

Additionally, telling the audience that Alli needs to work on his finishing might be fine, but you need to justify it with actual logic and example, and not just use it as a throwaway line. The great part about having guys like Shearer on Match of the Day is that he was an amazing goalscorer. Leverage his knowledge to talk about what he thinks players could do better in front of goal, but in specifics, not clichés.
Why do you think Alli’s finishing needs to be better beyond the fact that every shot he takes does not turn into a goal? How does Alli make his finishing better? What part of a technique does someone need to improve to allow them to put away these allegedly simple chances? Or did the goalkeeper simply close off all possibility of a chance becoming a goal by good anticipation and positioning? So many times there are coaching badges on the couches that go completely unused as part of the commentary, but these little elements are the actual expertise the analysts bring to the show. Segue into past video analysis of Alli's poor technique and suggest how Pochettino could improve that via drills going forward. And while on the topic, maybe producers need to stop assuming that these ex-footballers actually know everything there is to know about football. It’s now clear to the general audience that they don’t, but somehow that never stops them from being both overcritical and constantly speaking in clichés that lack insight.

Story Time

The models for expected goals aren’t perfect. They never will be. That is an up-front admission that doesn’t remotely limit the impact of using them. Right now the models are so much better than the pundit estimations it’s farcical. It doesn’t have to be that way, and with some experts it's definitely not. Last summer I sat in a room with Bob Bradley (you may have heard of him) and I talked to him about how our models work in evaluating chances. His reply to me was that there are plenty of instances where a chance is either better or worse than the model thinks it is, because someone is blocking a player’s shot, or because "two of my guys have a body on a player trying to make a header" close to goal, or any number of other reasons. And he was totally, completely correct.
Without tracking data, we lacked the appropriate info to evaluate the different factors he throws into evaluating his own team, and even with tracking data, we couldn’t see everything with his expert eyes. On the other hand, Bob’s eyes can’t evaluate every touch in every game across 27 different leagues. And no one in the world could sell us tracking data to go along with it. We could neither clone/scale Bob’s expertise, nor buy data that would let us make the dramatic improvements he wanted, so we simply did the best with what we had. Which, at that time, was probably better than 95% of other clubs in the world in using and evaluating data. Expert eyes plus the data is far better than the data by itself. But right now, data is dramatically better than the "experts", or at least the ones we see regularly on TV. At some time in the future, it will be commonplace for experts to leverage data models and tell us why something was a better or worse chance than we might think based on the specific game situation. Unfortunately right now, using data at all is still a scary thing.

No One Will Watch/Read That

Back when I started writing about football stats in 2013, penalties were lumped into all goalscoring stats, and no one in media ever used rate stats like per90. Nowadays we see some level of sophistication in both of these topics, even in mainstream outlets. Basically, we are three years down the road, and a small proportion of the media talks about football in the way we started to back in 2013. Should we be satisfied with that? Hardly. The advances made in the last three years in public football research and knowledge about the game dwarf anything we knew back then. The two things that haven’t changed at all are the older generation of newspaper writers and the entirety of television punditry. Columnists like Martin Samuel and most of his generation are a lost cause, which is fine because I’m not sure having them as champions for new things is a comfortable fit anyway.
Television though… Here's the thing – the audience for football on TV is changing rapidly. Consumers have more football knowledge at their fingertips than ever before, but somehow almost none of that makes it through to broadcasts. Meanwhile, nearly every other sport on the planet is rapidly pushing forward with ways to add insight to their product. The combination of these two elements makes football look like the kid wearing a dunce cap in the corner of the classroom. It’s really not hard to talk more intelligently about the game. A lot of the work done in analytics is easily explainable to footballers, which means it’s both useful and digestible for a television audience. Arsene Wenger mentions expected goals in a press conference. Do we think that’s by accident, or because he uses that as a tool to determine how well his team and individual players are executing on the pitch? And IF the concept is completely embedded at Arsenal and other teams in the Premier League, maybe it deserves to be broadcast on television? Instead we get “distance run” stats, which to my knowledge have never been proven as relevant to anything in football other than telling us who happened to run further in a match. Which must be interesting for reasons that are completely unknown to me, because I have always been the type of person who wanted to figure out how to get the best results from the least amount of work, not the other way around. I had a throwaway comment yesterday that we should take the Sunday Supplement panel and privately ask them how likely they thought a number of different shots were to turn into goals, then discuss the results. This sparked a lot of ire from certain members of the journalistic community. To me it’s an interesting experiment because again, I don’t think the actual difficulty of chances is very well understood, even among people who cover the game on a daily basis. 
To some journalists, I had clearly profaned their own Sunday morning church of aspirations with the prospect of making their elders potentially look silly. My point was not to make people look bad out of ignorance. My point was that I want our press to speak more intelligently about the game, at least in this respect, and the only way you can do that is via education. I am a huge, avid consumer of sports media. I subscribe to both Sky and BTSport. I am a Guardian member. I have been paying for ESPNInsider nearly constantly since 2002. I have also written for the Daily Mirror and appeared (briefly) on Sky, TalkSport, and BBCRadio 5Live. When I say we can and should do better, my criticism is both as a paying customer and as a producer of media content. Back when Monday Night Football first started, there was a clear common belief that the average football fan probably wouldn’t be interested in smarter television about tactics and analysis. MNF blew that idea out of the water, but strangely we haven’t seen a flotilla of clones try to copy the formula. I’ve heard the argument that people won’t watch or read more smart statistical analysis with their football either. They might be right, but it’s not as if anyone has really tried. Meanwhile, Fantasy Football is one of the fastest growing ways to interact with the Premier League. There are literally tens of millions of unique players who take part in fantasy football leagues every week - a game that is entirely driven by statistical output of players. This is to say nothing of the legions of Football Manager and FIFA Ultimate Team fanatics who also deep dive into games that are also entirely driven by player stats. 
Hmm…

Meanwhile, in an age of media cutbacks, the world’s largest sports media company ESPN just launched a stats and analytics vertical because they thought there was a rabid audience out there who want to read smarter analysis of all sports, so it made sense to collect that information into one easily accessible place.

Hmmmm…

Oh my god do people gamble every week on the Premier League. With odds. That are probabilities.

Hmmmmmm…

OptaJoe has 852K followers on their English account alone. WhoScored and Squawka nearly 600K apiece. ¯\_(ツ)_/¯

There is an audience out there. Leveraging their interest while making a workable media property is the tricky part, but some combination of fantasy sports, gambling, statistical analysis (especially visualizations) and tactical insight probably makes sense.

Conclusion

I want smarter football commentary. I want it so that my children, who are rapidly becoming football consumers themselves, can understand more about the game they are learning to love without me having to fact check every step of the way. And I want it so that I don’t want to simply turn off the sound or fast forward the analysis section every single time I watch your program. The bar here is pretty low. I have personal experience teaching a lot of these concepts inside of football, and they are honestly not that hard to grasp. I also want to listen to experts who clearly have something unique to contribute, and see them show their expertise in a way that teaches me new things. And I want to hear far fewer clichés from guys who are paid significant sums of money to appear on my TV and contribute knowledge and insight while doing anything but. I will take insightful, lesser-known commentators and analysts on my football every time versus well-known, cliché-spewing sound machines. The vast majority of people I talk to will too. There is an audience here. If you build it, they will come.

StatsBomb Podcast: October 2016

Listen on SoundCloud: https://api.soundcloud.com/tracks/286218325

Ted Knutson and James Yorke take a spin around the key numbers in the Premier League and beyond... Downloadable on the SoundCloud link and also available on iTunes. If you like it, we'd love it if you shared it. Thanks!

Statistically Scouting the Bundesliga's Buildup

One of the best attributes of passing data is that the samples pile up quickly a few games into the season. While teams still have just a handful of total shots, you can break down passes almost any way you like and still wind up with hundreds in each bucket. One thing I've been doing with this data is looking at how teams progress the ball up the pitch. It's been illuminating to me and I'd like to share some first impressions with you today. Build up across any league varies wildly. We all know this, especially when we get to see Sander's maps illustrating the variation every game. I thought it would be interesting to peek behind the curtain on the differences in how teams build up their attacks via the pass. Hopefully it’s interesting to see the nuts and bolts. It's something I can easily imagine drilling down into when preparing for an opponent or when trying to self-scout your strengths and weaknesses. It's also something that doesn't often get discussed when we talk about numbers or analytics in soccer and is probably under-covered statistically. This is a bit of a journey across the league to show different ways this data can help, so there is no one theme here; it's more of a buffet of analysis. Yes, this section droning on about Mainz might look as unattractive as the lukewarm green beans to you, but stick around: beautiful fried okra might be coming up next with a killer nugget about Benjamin Henrichs.

Definitions

To sort out my thinking, I divided the pitch into zones. Teams attack from left to right, so zone 7 is furthest from goal and zone 1 is right in front of goal. You can quibble with these and I'm open to change, but it's what we are using today so keep that in mind.

[image: snip20160930_1]

We start nearly at the back in zone 6. These numbers are for passes originating in zone 6. Zone 4+ means passes to zone 4 or anything closer to goal than that (3, 2, or 1).
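To make the bucketing concrete, here is a minimal Python sketch of how passes from one zone could be split by how far they advance, with a share and a completion rate per bucket. It is only an illustration under stated assumptions: the pass records, the field names (`from`, `to`, `completed`) and the bucket labels are hypothetical, the sample passes are made up, and it assumes the intended target zone is known even for incomplete passes.

```python
from collections import defaultdict


def classify(origin_zone, target_zone):
    """Bucket a pass by zones advanced. Zones run from 7 (furthest from goal) to 1."""
    advance = origin_zone - target_zone
    if advance < 0:
        return "backwards"
    if advance == 0:
        return "within zone"
    if advance == 1:
        return "forward 1 zone"
    return "forward 2+ zones"  # e.g. "zone 4+" for passes starting in zone 6


def summarise(passes, origin_zone):
    """Return share of attempts and completion rate per bucket for one origin zone."""
    attempts = defaultdict(int)
    completed = defaultdict(int)
    for p in passes:
        if p["from"] != origin_zone:
            continue
        bucket = classify(p["from"], p["to"])
        attempts[bucket] += 1
        completed[bucket] += int(p["completed"])
    total = sum(attempts.values())
    return {
        bucket: {
            "share": attempts[bucket] / total,
            "completion": completed[bucket] / attempts[bucket],
        }
        for bucket in attempts
    }


# Made-up sample passes, purely for illustration.
sample_passes = [
    {"from": 6, "to": 5, "completed": True},
    {"from": 6, "to": 5, "completed": False},
    {"from": 6, "to": 4, "completed": True},
    {"from": 6, "to": 3, "completed": False},
    {"from": 6, "to": 6, "completed": True},
    {"from": 6, "to": 7, "completed": True},
    {"from": 4, "to": 3, "completed": True},  # ignored: not from zone 6
]

zone6_summary = summarise(sample_passes, origin_zone=6)
for bucket, stats in zone6_summary.items():
    print(f"{bucket}: {stats['share']:.0%} of passes, {stats['completion']:.0%} completed")
```

Running the same summary per team, or per player within a team, would give the kind of comparisons discussed in the rest of this piece.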
[images: snip20161004_22, snip20161004_21]

So to clarify: on average the ball is advanced at least 1 zone ~54% of the time (2nd chart shows this with the first two columns combined) and trying to jump 2 zones is well below a 50-50 proposition (first chart shows this at 38.7%). This makes RB Leipzig's numbers fascinating. Only Darmstadt jumps 2 zones more often than Leipzig's 31%, and Leipzig combines that with a league-leading 47% completion rate on these long balls. Bayern at 46.8% are right behind them. How are they combining hitting it long constantly with being the most efficient? Really it's goalie Péter Gulácsi being very successful at finding Yussuf Poulsen with long balls.

[image: snip20160930_13]

Look at the long balls from Gulácsi to Poulsen in the Gladbach game alone:

[image: snip20161004_23]

I'm not exactly sure how to parcel out the credit between Gulácsi, Poulsen or some sort of scheme to get your forward more open than normal on long balls, but it's clear Leipzig have an edge so far this season at using the long ball. They know it and are using it a lot.

Mainz struggle massively in this zone. The distribution of where their passes go is close to the league average (they pass backwards a bit more and long forwards a bit less) but their completion rates when trying to move forward 1 zone or within the zone are easily worst in the league.

[image: snip20161004_26]

When we drill down further we find a scapegoat of sorts in left-back Gaëtan Bussmann, who has gone just 24/43 passing for an atrocious 56% from this zone without being a long-ball specialist. Opponents should see this and be ready to pounce.

What is the platonic ideal of a perfectly positioned team? Probably Bayern. 47% of the time they are advancing from Zone 6 to Zone 5, well ahead of the team in 2nd. They also try the long ball the least of any team, though when they do try it they near Leipzig in completion %.
They are also progressing the ball in the exact order you'd expect, with midfielders Thiago/Xabi Alonso/Vidal/Sanches making up the majority of those on the receiving end of these passes. This can be compared with a team like Gladbach, who have both defenders like Christensen and forward players like Stindl receiving passes in zone 5 a lot.

Two teams who "build up at the back" but don't "play out of the back" are Hoffenheim and Gladbach. They have 2 of the top 5 completion percentages in this area overall, but play the lowest share of passes forward out of zone 6. So while their defenders' completion percentages might look nice, the ball isn't getting anywhere often.

Jumping To Zone 4

Check back up top to re-orient yourself where we are on the pitch now. These are passes played from the zone 4 area.

[images: snip20161004_29, snip20161004_28]

The first thing that jumps off the stat sheet looking at this area of the pitch is that there are 3 teams who look extremely similar, with elite completion percentages. Wolfsburg are hard to separate from Bayern and Dortmund just looking at the rates:

[image: snip20161004_30]

Wolfsburg even have a much better rate on the long balls into dangerous areas (though they have only played 49 total), so what's going on here? One obvious point is that Wolfsburg get to zone 4 much less than the other 2 (~70% as often), but another is that they aren't as well positioned when they get here. So even though they are completing passes at equivalent rates, they are having to use Julian Draxler a lot to pass, while Dortmund's 3 most common passers from Zone 4 are Weigl, Schmelzer, and Sokratis and Bayern's are Thiago, Lahm, and Alonso. When Draxler gets involved, it's a sign you don't have much left in front when you need them to actually create the chance. Sure enough, Wolfsburg's passing rates fall well back to the pack when you move into zone 3.

This is also a distinctive area on the pitch for Roger Schmidt's Leverkusen.
While they often pass backwards from Zone 6 (only Hertha do it more in fact), once they move into zones 5 and 4 it's full steam ahead. No team passes back less often than Leverkusen in these areas of the pitch. An interesting comparison is with Dortmund, a team with almost identical deep completion numbers but a wildly different strategy for what to do with the ball in zone 4. Just look at the list of the two teams combined, sorted by how often a player moves the ball forward out of zone 4.

[image: snip20161004_35]

6 of the top 7 are Leverkusen players; only Raphael Guerreiro is among the red storm. The Kampl/Aranguiz rates compared to Weigl/Rode/Castro are enormous. And then check out the differences in two of the fullbacks passing from nearly identical positions:

[image: snip20161004_38]

Obviously this is an extreme case, but it drives the point home: these teams are going different places when the ball reaches this area on the pitch. Dortmund and Bayern are the most patient teams in this area of the pitch, as they are all across the pitch, playing intra-zone passes 52 and 53% of the time when the average is just 45%. They play the fewest multi-zone advancement passes of any teams as well, well under half of the league average rate of 6.3%.

Let's take a look at the most common way each team progresses from zone 4 to zone 3:

[image: snip20161005_53]

A few takeaways

-One, Pascal Groß's volume (and Tobias Levels' volume in passing to him) continues to simply astound. I've written about it before but the amount of time he handles the ball is just mind-numbing. This year Ingolstadt actually sit 5th in territory dominance, well clear of the rest of the league. They seem to have good fundamentals, but man, nothing is going right luck-wise: they've already allowed 6 goals from outside the box on their way to 1 measly point.
-Wolfsburg having to use Gomez to pass to Draxler to get the ball something like 40 yards from goal explains a lot about what's wrong with their attack.
-Verhaegh has long been one of the Bundesliga's nice lesser known players with how well he's played at fullback for Augsburg. His legs might be slipping a bit, but he's still a key part of their attack.

Zone 3

[images: snip20161004_41, snip20161004_42]

Again, check back at the top if you are losing track of where we are on the pitch. This is close to the edge of the box, near the danger zone, where most crosses come from and not yet in shooting areas.

Here the two counter-attacking, red-shirt wearing, mid-to-upper-table teams really stand out. Köln and Mainz throw the ball forward into the danger zones 1 and 2 from Zone 3 a league-leading 46 and 44% of the time. When they counter and get this close, the plan is clearly not to circle back around and let the defense get set. The difference is really clear when you look at a more conservative team like Dortmund. Mainz play almost 5x as many passes into forward zones from here as they do backwards, Dortmund just 1.3x as many forward as backwards. Mainz are completing at a basically league-average rate, Köln are way below. Risse's crossing is beautiful when it works, but there are a ton of lost possessions that come along with it.

Dortmund and Bayern are again the least likely to play forward, but Bayern's success rates when they do are what separates them from the rest.

[image: snip20161004_43]

One interesting difference between Bayern and Dortmund is how often Bayern find their main goal threats on these types of passes (Lewandowski 31 and Müller 30 times) while Dortmund spread the wealth much more (Aubameyang's 11 leads the team, but 11 different players have received between 6 and 11 passes).

On the self-scouting side for Bayern, it might be time to have a word with David Alaba. He is crossing the ball and trying to complete it in deep areas too often for how ineffective he is.
He has the worst completion % on forward passes from zone 3 by nearly 20 points and also tries them more than anyone short of Robben, who has just 43 total passes in his minimal time on the pitch. Better pass selection please, David.

[image: snip20161005_44]

Conclusion

Adding in carry and dribble data would be nice and would add a bit more context around the edges, but there is plenty here to work with, including helpful nuggets when preparing for a match. Opponents should know Mainz try to be basically a normal team building up from the back, but are just awful at it, specifically their left-back Bussmann. Bayern coaches should get to Alaba to help him out with his final third decision making. Leipzig opponents should know the goalie is good at finding Poulsen, etc. There are all kinds of insights to be gleaned from drilling down into numbers like these. I hope you enjoyed me scratching the surface.

Charts 'n Totals

[images: snip20161005_48, snip20161005_47, snip20161005_50, snip20161005_49]