The beginner’s guide to reading, writing and pitching about football analytics

Do you find yourself with time on your hands these days? Suddenly staying in on a Saturday night for the good of humanity? And just to top it all off, you have to seclude yourself with no sports to watch. Separately, have you noticed an explosion of numbers in football? A sudden rash of xGs springing up all over the place? Suddenly everybody seems to be spouting off about stats and you’ve got only the vaguest notion of what they’re on about?

Well, you’re in luck. StatsBomb copy-editor and general woman about town Kirsten Schlewitz is just like you! While she’s an expert at correcting the incredibly sloppy copy you’ve all come to know and love from me, she also came in as a relative novice at this whole stat thing. So, I roped her into asking me every question she could think of that she might have otherwise been afraid to ask. So, let’s get started.

K: First of all, let’s get the elephant out of the room. What is xG?

M: xG is short for expected goals. It’s a statistic that attempts to measure how likely any given shot is to become a goal. It’s really good at predicting the future. That is, xG is better at telling you which teams will score and concede goals going forward than any other statistic we have. That’s the most basic barebones definition I can think of.
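For readers who like to see the arithmetic, here is a minimal sketch of how per-shot probabilities roll up into a team total. The shot values are invented for illustration, and treating shots as independent of one another is a simplifying assumption, not a description of how any particular xG model is built.

```python
# Hypothetical per-shot xG values for one team in one match (invented numbers).
shot_xg = [0.05, 0.30, 0.08, 0.02, 0.45]


def total_xg(shots):
    """Add up per-shot goal probabilities to get a team's xG total."""
    return sum(shots)


def chance_of_blank(shots):
    """Probability that none of the shots goes in, assuming independent shots."""
    p = 1.0
    for xg in shots:
        p *= 1.0 - xg
    return p


print(f"Total xG: {total_xg(shot_xg):.2f}")
print(f"Chance of scoring zero: {chance_of_blank(shot_xg):.2f}")
```

The point of the second function is that an xG total is an average, not a promise: a side can rack up close to a goal's worth of chances and still have a real chance of drawing a blank.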

K: If you already have xG to predict who will score and who won’t, why are so many other numbers needed? I see a great deal of figures and maps when I edit pieces, and sometimes I don’t understand what their purpose is. For example, when comparing two players, you can’t rely on xG for a team. So what numbers would be used there?

M: The answer to your first question is that knowing who is more likely to score goals going forward isn’t a particularly interesting thing to know (unless all you care about is betting, which, fair enough). The interesting questions are the hows and the whys. A single number like xG doesn’t help you very much with that. I like to think of xG as being a statistic that makes sure the conversation starts in the right place, as opposed to one that tells us anything remotely close to what we need to know.

So after the conversation gets started that’s why we need all the other stuff, to examine how teams play, what individual players are doing, basically what’s going on on the pitch that leads to the xG number at the end. And when those numbers come to particular players things can get very complicated very quickly. That’s because while xG works fine for players (specifically it can tell us when a player is on a hot or cold streak that’s unlikely to continue), shots are only a relatively small part of what’s happening. And, quite frankly, the further we get away from the actual shot, the less definitive our numbers become about what’s good and what’s bad, and the more we rely on them to try and accurately describe the game, as opposed to predict outcomes.

K: So you’re saying there are more or less two sets of numbers that a StatsBomb article could use: ones to predict which team will play better going forward, and ones that tell us what happened in a previous game (games?) in a way that dives deeper than simple match reports. If someone wants to write an article about, say, how they think the Champions League would have panned out this year, would they only use the prediction numbers, or would they also examine numbers that show what happened in previous matches?

Or am I way off base here and all the StatsBomb stats are used in conjunction with one another, rather than existing as two separate sets that focus on past and future?

M: So this is exactly right conceptually. The problem is that the numbers often overlap in ways which make the divide not particularly clear cut. For example, xG is an excellent stat for predicting the future, but it’s also a pretty ok one for explaining what happened. We know more about a match if we say that Arsenal had 1.5 xG than if we said Arsenal had 15 shots. Using the xG from a single game is kind of a quick and dirty way to describe what happened, albeit one with plenty of faults.

The best use of numbers though will always combine prediction and explanation. If I wanted to look at upcoming, now cancelled, Champions League matches, I would use general xG numbers as a starting place and say, “Here’s what I think will happen based on these numbers” and then use everything else to say, “And here’s why.” Now that also doesn’t mean xG is perfect. Doing good work in stats means trying to understand the limitations of the numbers as well so that we can understand when they might be missing something. So, in theory, it might be possible to analyze all the whys and hows and decide beforehand that even though a team like Liverpool might seem much better based on xG, they would struggle against Atléti (that’s not a conclusion I would have come to, but it’s not like completely beyond the pale to suggest).

K: We keep talking about “the rest of these numbers.” For someone who’s completely intimidated by stats, to the point they’re afraid to even click on a StatsBomb link, much less pitch you an idea, what other types of numbers would you anticipate they’d need to understand?

M: From a writing perspective, understanding the numbers is somewhat less important than understanding the game. If a writer is making accurate assertions about the game then those claims are going to be reflected in the numbers and in the editing process we can work together so that your friendly neighborhood StatsBomb editor (me) can help give you the appropriate statistical support you need.

So, if a writer wanted to write about how a team relied on a midfielder for a lot of their buildup play, they wouldn’t need to know the ins and outs of StatsBomb’s numbers. But I’d be able to call upon stats of ours like “deep progressions” to look at how frequently they move the ball up the field, or at passing percentages when they’re pressured and not pressured to explain how they’re cool in the face of a defense, or information and graphics on pass length, etc. etc. etc.

Now, if the numbers don’t match a writer’s argument that makes for an interesting challenge. The question of why a writer perceives the game a certain way while the numbers don’t capture it is generally a really exciting place to do analysis. Figuring out why there’s a disconnect between what the numbers capture and what the eye might see is usually an interesting endeavor for everybody involved.

K: I’m here editing and writing articles, and I fully admit I don’t comprehend exactly what half of these numbers mean. But if I wanted to submit an article that showed I do understand a few of the statistics, which do you think would be most important to understand?

M: You do need to understand the basic mechanics of xG and why it works so well. It’s important to understand that a player having more goals than xG expects he “should” is likely to start scoring less. Beyond that I’m looking less for knowledge of a specific stat than for a way of thinking about questions. Questions like, “Do you have a statistic that measures XYZ” are good, questions like, “How do you go about measuring ABC” are even better.
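A quick sketch of the "more goals than xG" idea, using two made-up strikers. The names and numbers are hypothetical, and the hot/cold labels are just shorthand for the sign of the difference, not a formal test.

```python
def finishing_delta(goals, xg):
    """Goals scored minus expected goals; positive means outscoring xG."""
    return goals - xg


# Hypothetical season-to-date figures, invented for illustration.
players = {"Striker A": (12, 7.4), "Striker B": (5, 8.1)}

for name, (goals, xg) in players.items():
    delta = finishing_delta(goals, xg)
    label = "running hot" if delta > 0 else "running cold"
    print(f"{name}: {delta:+.1f} goals vs xG ({label})")
```

A large positive gap like Striker A's is the kind of thing that tends to shrink over time, which is why xG is the better guide to what comes next.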

K: From xG and its variations (non-penalty xG, open play xG etc), it’s relatively easy to assess the offensive strength, or lack thereof, of a side, even if you’re new to stats — and I can attest to this, believing I had no ability to comprehend sports statistics before I took this job. But what still tends to confuse me is the defensive measurements . . . I see the maps and figures, but even those don’t help me quite get it.

M: Yeah. Defense is hard. We can look at xG conceded, or shots conceded, or any number of other things, but those are still fundamentally measurements about what the other team’s attack is doing. And that makes sense, because on some level all defense is is preventing the other side from attacking. But it’s also unsatisfying because defenders are obviously doing SOMETHING and it would be nice to describe what those things are.

The traditional measures are things like tackles, interceptions and blocks, and while those are useful numbers, they have some major problems. The biggest is that you can’t commit those defensive actions while you have the ball, so players on bad teams tend to have more defensive actions than players on good ones that keep the ball all the time. One thing we do is adjust all of those numbers for possession, to try and give a better picture of what’s going on.
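To make the possession problem concrete, here is one very simple way an adjustment could work: scale a player's defensive actions to what they would be if the opponent had the ball half the time. This is an assumption made purely for illustration; it is not a description of the adjustment StatsBomb actually uses, and the function name and inputs are hypothetical.

```python
def possession_adjusted(actions_per_90, team_possession):
    """Scale defensive actions to a 50% opponent-possession baseline.

    team_possession is the share of the ball the player's own team has (0 to 1).
    Simple proportional scaling is an illustrative assumption only.
    """
    opponent_share = 1.0 - team_possession
    if opponent_share <= 0:
        raise ValueError("opponent must have some of the ball")
    return actions_per_90 * 0.5 / opponent_share


# Hypothetical players: one on a team with 35% of the ball, one with 65%.
busy_defender = possession_adjusted(6.5, team_possession=0.35)
quiet_defender = possession_adjusted(3.5, team_possession=0.65)
print(round(busy_defender, 2), round(quiet_defender, 2))
```

Under this toy scaling the two players come out level, even though the raw counts make the first one look nearly twice as active.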

On top of that we track pressures. That is, we track every time a defender is close to an attacker with the ball and impacting him in some way. This gives us a lot of information: adding pressures into the mix demonstrates where on the field a team is making defensive actions. That gives us the ability to look at a heatmap of a team’s activity and really get a picture of where on the pitch they like to defend (the redder the square, the further above average the number of defensive actions is in that zone; the bluer the square, the further below). Manchester City defend basically in their opponents’ penalty area, for example.
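The red and blue logic of that kind of heatmap can be sketched in a few lines. The zone names and counts below are invented, and splitting the pitch into four zones is a simplification for illustration rather than the grid an actual heatmap uses.

```python
# Hypothetical defensive-action counts per pitch zone (invented numbers).
league_average = {"own box": 30, "own half": 45, "opp half": 25, "opp box": 5}
team_counts = {"own box": 12, "own half": 40, "opp half": 41, "opp box": 11}


def zone_deviation(team, average):
    """Difference between a team's count and the league average in each zone."""
    return {zone: team[zone] - average[zone] for zone in average}


def shade(deviation):
    """Red for above-average activity, blue for below, neutral for level."""
    if deviation > 0:
        return "red"
    if deviation < 0:
        return "blue"
    return "neutral"


for zone, dev in zone_deviation(team_counts, league_average).items():
    print(f"{zone}: {dev:+d} ({shade(dev)})")
```

A team like this hypothetical one, red high up the pitch and blue near its own goal, is doing most of its defending in the opponent's half.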

All of that’s a long winded way of saying that it’s really really hard to evaluate defenses!

K: So we know how offense is evaluated, and we know how defense is judged — somewhat, anyway. With these two necessary halves of the game described, I have one final question: What would you like to see a writer be able to demonstrate with the numbers, keeping in mind that the StatsBomb blog is there to both educate readers and show potential purchasers what they can do with the data?

M: The major thing I want to see isn’t a specific proficiency with data, but rather a framework for thinking about issues. Think about a question you want to answer, and how can you use data to answer that question. That’s what we’re all trying to do, whether it’s determining if a potential signing will be worth it, or why a player is having a career year, or if a keeper’s yips will pass, everybody is fundamentally doing the same thing. Whether it’s analysts with teams, or fans in the stands, or writers for StatsBomb, they’re looking at the game, developing a question and then trying to answer it.

StatsBomb Conference Videos: Valuing Player Influence and Evaluating Defensive Risk in Expected Threat Models

The videos, they just keep coming.

First up today we have Ryan Beal (@RyanBeal95) with his talk, “Valuing Player Influence Within Teams”.

 

 

Next up it’s Robert Hickman (@robert_squared) presenting his paper, “Considering Defensive Risk in Expected Threat Models”.

 

 

StatsBomb Conference Videos: Estafania Vidal and Thom Lawrence

Happy Friday! We’ve got two more conference videos for your viewing pleasure.

First up, our very own Thom Lawrence (@lemonwatcher) would like you to know, Some Things Aren’t Shots: Comparative Approaches to Valuing Football.

And next, we’ve got Estafania Vidal (@tefavidal) and her talk, Understanding Entry Zones in Football.

 

Aston Villa have given midfielder Marvelous Nakamba a thankless job, and he’s doing it admirably

Aston Villa have made a surprisingly strong start to the season. It’s not so much their position in the league table: they’re 15th, only one point above the relegation zone. Rather, the eye-opening number is the fact that eight games into a season where the newly promoted team was expected to struggle, they’ve actually scored one more goal than they’ve conceded. Sadly for Villa, the xG numbers don’t exactly back up their goal scoring record.

 

Specifically on the defensive side of the ball Villa have struggled with a case of the Arsenals. They give up a lot of shots:

And while the quality of those shots is generally low, it isn’t on average low enough to make up for the extremely high volume.

Nowhere is it as clear just how overwhelmed this Villa defense is than in the numbers of Marvelous Nakamba. The poor guy really is doing the most defensive work he can.

 

Nowhere will you find a clearer statistical example of a midfielder shielding his defenders than what he does. He just sits in the middle of the field and breaks up play after play after play in front of his own penalty area.

 

The problem for Villa is that it sure seems like their defensive midfielder is their first line of defense. The side simply offers zero opposition until the ball is sitting in Nakamba’s lap.

 

Villa’s surprisingly lively attack has received a lot of praise, especially after hanging five goals on Norwich last weekend. And it should. It’s no small thing for a newly promoted team to be a Premier League average attack after eight matches. But it’s coming at a cost. And the brunt of that cost is being borne by Nakamba.

That’s not to say that Villa’s defensive midfielder is perfect, or maybe even above average. His numbers in possession are truly terrible. He does very little to move the ball up the field, and despite his relatively unambitious passing, his pass completion percentage is only 84.5%.

 

Nakamba is as pure a defensive midfielder as you’ll find in the Premier League. His job is to shield the back line and win the ball. On Aston Villa, that’s a near impossible task. Villa is a classic case of a team that’s putting its defensive personnel in a position to fail. Nakamba is performing admirably despite that. If Villa continue to play this way, and manage to remain in the Premier League it’s going to look ugly defensively all year long. If it works, it’ll just barely work, but if it weren’t for Nakamba it would probably have no shot of working at all.

Can any of these Champions League underdogs shock the big boys?

As we roll into Champions League match day two, the biggest teams from across Europe will inevitably get the attention. But, as the group stage progresses, a number of smaller teams will look to stake their claim to be this year’s Cinderella story. Here’s a quick look at some of the less heralded teams in this year’s Champions League and what they do, or don’t do, well.

Slavia Praha

Things could not have gone better on match day one for the team from Prague. Not only did they travel to Italy and come back with a point from Inter Milan but Barcelona and Borussia Dortmund also drew, leaving the group wide open. Getting stuck in such a difficult group meant that their odds of even making the Europa League were always going to be long, but every Cinderella Story starts with a single invitation to the ball.

If their match against Inter taught us anything, it’s that this team plans to remain true to itself, no matter how challenging that may be. Manager Jindřich Trpišovský has his team press the opposition early, often, and unrelentingly. Through nine games of their domestic season, this is what their defensive heat map looks like.

In attack, their approach is generally to shoot early, shoot often, and then use the defense to get the ball and do it again. With such an active defense, shot quality comes second to simply burying opponents under an unending deluge of attempts on goal, even if many of those individual attempts are somewhere between speculative and downright deluded.

 

Of course, executing that approach when you have the most talent in your domestic league is a lot easier than it is when you go up against some of the biggest teams in Europe, but as we can see from their first match against Inter, they’re sure going to try.

Advancing from this group remains an extremely unlikely proposition, but if it were to ever happen, this would be the way. First the draw against Inter, and now they get to host Borussia Dortmund at a time when the team seems to be ever so slightly off the top of their game. The German side is coming off back-to-back disappointing 2-2 draws against Eintracht Frankfurt and Werder Bremen, dropping them to eighth in the Bundesliga. Slavia Praha probably won’t shock the world, but if they are going to, it helps to catch their group stage opponents at their lowest moments.

Crvena Zvezda

It seems silly to suggest that a team that has started their season with eight wins and a loss is struggling, but sometimes it fits the bill. Sure, Red Star sit tied on points at the top of the table and even have a game in hand on Bačka Topola, the unlikely co-owners of the league’s best record, and yes, the side sits one point ahead of bitter crosstown rival Partizan, but the stats suggest that, at least early in the season, all was not quite right with this perennial powerhouse.

It’s hard for Europe’s smaller teams to compete on the biggest stage if they aren’t absolutely dominant at home, and Red Star simply haven’t been. It’s been Partizan that is the standout performer (the following stats don’t include the last two match days, which, it is important to note, were both comfortable 3-1 wins for Red Star). On both sides of the ball Partizan have the stronger xG numbers and thus a significantly higher xG differential so far this season.

This difference clearly showed itself when the two sides squared off earlier this season. It’s one of the most hotly contested rivalries in the world, and while Partizan left it until late to score their two goals, they were clearly the better team over the course of 90 minutes.

Even though their domestic form suggests Red Star might not have enough fire power to threaten any of the favorites in their group, Tottenham Hotspur’s opening week draw away to Olympiakos has left the door ever so slightly ajar. A good result this week at home against the Greek side combined with a Spurs loss to Bayern would actually vault the Serbian team into second place in the group and at least give them a fighting chance in the weeks to come.

Galatasaray

The Turkish giants have been a mainstay of the Champions League for years. They haven’t exactly threatened to do anything historic in the knockout stages, but they’ve always been a hard out for Europe’s biggest clubs, a side good for an upset or two under the right circumstances. This year, groupmate Real Madrid getting trounced in the opener by PSG’s backups at least seems to open the door for Galatasaray to pull off a shocking upset.

This year’s version of the Turkish side is going to have a real hard time driving through that door. Despite being filled with players you know and are at least mildly fond of, the team simply hasn’t been very good this season. Despite the likes of Jean Michaël Seri, Sofiane Feghouli, Younès Belhanda, Steven Nzonzi and Radamel Falcao leaving viewers in a constant state of “oh yeah, I remember that guy,” the team hasn’t gelled at all.

Right now, Galatasaray sit in just seventh place domestically. Despite all those names, they simply struggle to score the ball.

They neither create a lot of shots, 12.87 is the ninth most in the league, nor create particularly good shots, only six teams have a worse average xG per shot than their 0.07.
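Putting volume and quality together shows why those two numbers add up to a weak attack. The sketch below multiplies the figures quoted above; reading 12.87 as shots per game is an assumption, since the unit isn't spelled out.

```python
shots_per_game = 12.87  # figure from the text; "per game" is an assumption
xg_per_shot = 0.07      # figure from the text


def expected_goals_per_game(shots, quality):
    """Shot volume multiplied by average shot quality."""
    return shots * quality


print(f"{expected_goals_per_game(shots_per_game, xg_per_shot):.2f} xG per game")
```

On those assumptions the attack is generating a little under one expected goal per game.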

And the sad reality for fans of Turkish football is that if you’re not good enough to dominate at home, you’re probably not good enough to shock the world and outlast Real Madrid or PSG, even in a year when Madrid are struggling. Not all underdogs are created equal, and this year, Galatasaray have simply shown no signs of being able to do the things that they’d need to in order to upend the expected Champions League order.

Tanguy NDombele marks a step forward for Tottenham Hotspur, their first superstar purchase

Tanguy NDombele is officially a member of Tottenham Hotspur. It’s a momentous moment for the North London club. The acquisition marks a change in not only the club’s approach, but their stature in European football. For the first time Spurs have successfully purchased a top line young star, one whose acquisition would not be out of place at any of the top clubs in the world.

Spurs have had superstars before of course. They developed Gareth Bale. They set (or at least tied) a club record to purchase Luka Modrić from Dinamo Zagreb. More recently they’ve developed Harry Kane, and won the transfer lottery with Dele Alli. Both Christian Eriksen and Son Heung-min are talented players bought for reasonable prices who have blossomed along with Spurs.

NDombele, however, is something different. Spurs’ successes, up until now, have been achieved by doing the things that a team has to do when the biggest players in the world are unavailable to them. Whether it’s developing their own talent, or successfully taking risks on players that other teams decide not to, Spurs, by shrewd management or dumb luck, managed to build a team of stars that other top clubs decided they didn’t want. Now, for the first time, they’ve acquired the kind of young midfield star that every team in the world targets.

NDombele is the kind of midfielder who can do everything. He’s equally adept at patrolling the midfield defensively and bringing the ball forward in attack. Lyon’s success hinged on NDombele doing both. NDombele led the team with 9.08 deep progressions per 90 minutes.


Argentina, Canada and the art of playing ugly at the World Cup

Underwhelming matchups are a staple of international play. A favorite doesn’t quite dominate, an underdog hangs tough, and the result is a scrappy, disjointed affair. Sometimes the bigger team sneaks by, sometimes the smaller one holds on, but either way they can be difficult to love (unless you happen to be invested in said underdog).

Not all annoying defensive matches are created equal. Sometimes a defensive underdog plays out of their minds, other times a favorite coasts, and most of the time the balance is somewhere in the middle. On Monday, as the World Cup entered its first full week, Argentina squared off against Japan and Cameroon faced Canada. Both games were difficult, defensive affairs. In the first, however, it was Argentina’s committed defensive performance that drove the match; in the second it was the favorite Canada’s conservative focus on doing just enough to win.

Argentina did a masterful job of foiling Japan’s approach. The unheralded, underfunded, South American side committed to clogging up the midfield, and when they succeeded, Japan had no fallback plan. Japan played a traditional 4-4-2 and were simply never able to work the ball through the midfield to the strikers supported by wingers. Over and over again Argentina waited for Japan to try and move the ball through midfield and then blew up the play. Japan’s passing network is just a mess of sideways and backwards connections around the periphery. And when the ball went into the middle, Hina Sugita and Narumi Miura, if they kept it, were reliably forced to play backwards and sideways.

 

In large part because Japan only had two midfielders, and they never committed to having anybody else help in the center of the park, they weren’t able to use possession to force Argentina to defend deeper in their own half. Argentina, for their part, were more than willing to commit their attackers to help blow up midfield. Striker Soledad Jaimes often times was drawn into her own defensive half, leaving winger Estefanía Banini as the only attacking option when Argentina regained control. It was conservative but effective, as Argentina’s defensive pressure map shows, they managed to defend well above their own box, and were not regularly forced into the kind of defensive shell that more talented teams can tee off against.

There are ways Japan could have chosen to combat Argentina’s approach. Instead of trying to go through midfield, the team could have simply played around it. They could have attempted to punish Argentina for contesting the middle so heavily by playing over the top, or tried to push their fullbacks high up the pitch early in possession to stretch the game laterally, making it harder for the Argentinian swarm to do its thing. But they didn’t. It wasn’t until Jun Endo came on in the 73rd minute that the pattern of the game changed. Japan got the ball forward more quickly and more effectively in the game’s last quarter, but by then it was too late.

That’s how surprising results happen. The underdog has a plan, the favorite fails to react until it’s too late and before you know it Argentina walks out of the match only conceding eight shots and 0.24 expected goals while getting five of their own for 0.11 total. Argentina didn’t attack, but their defense, with a little help from Japan’s stubbornness, kept the thing close enough to fully warrant the side’s historic first World Cup point.

That’s a far cry from what happened when Canada defeated Cameroon 1-0. In that match, Canada controlled the game from whistle to whistle, and the favorite, as opposed to the underdog, was directly responsible for the conservative nature of the match. Canada played fairly conservatively throughout the first half, took the lead right before the whistle, and then they really decided to play unambitious keepball for the final 45 minutes.

As you can see from the difference in pass maps, the fullbacks stopped getting forward at all, the midfielders dropped deeper, and Jessie Fleming dropped into midfield from her striker position to collect the ball and knit things together. In the first half it was Sophie Schmidt stepping forward from midfield to do that. Canada had the lead, so what was the point in keeping their foot even moderately grazing the gas pedal?

 

That’s reflected in the shots they created as well. While they took eight shots in both halves, all shots are not created equal. In the first half, seven of Canada’s eight shots were from within the penalty area. They averaged a relatively unimpressive 0.060 xG per shot (0.058 from open play). That’s not exactly cutting a side open, but it’s still well ahead of their second half numbers.

The team’s eight shots after the break averaged an anemic 0.031 xG per shot (0.019 from open play). And that includes an 87th minute 0.11 xG chance from legend Christine Sinclair, which alone accounted for close to half of the team’s second-half xG. For most of the time Canada was content to move the ball and make sure Cameroon had no space to counterattack into, confident that the African side could not build their own attacks from the back. Given that Cameroon managed only four shots and 0.09 xG, it seems like a reasonable plan, even if it was a boring one to watch in action.
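The gap between Canada's two halves is easy to check from the per-shot averages quoted above. This sketch uses only those rounded figures, so the totals are approximate, and the variable names are just labels chosen for illustration.

```python
def half_xg(shots, xg_per_shot):
    """Total xG for a half from shot count and average xG per shot."""
    return shots * xg_per_shot


first_half = half_xg(8, 0.060)   # rounded averages from the text
second_half = half_xg(8, 0.031)
sinclair_chance = 0.11           # the 87th minute chance

# What the other seven second-half shots averaged once that chance is removed.
rest_avg = (second_half - sinclair_chance) / 7
# Share of the second-half total that came from the one chance.
sinclair_share = sinclair_chance / second_half

print(f"First half: {first_half:.2f} xG, second half: {second_half:.2f} xG")
print(f"Other seven shots: {rest_avg:.3f} xG each")
print(f"Sinclair chance share: {sinclair_share:.0%}")
```

Take away the one late chance and the remaining second-half shots were worth roughly a third of what Canada's first-half shots averaged.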

The moral of the story is that not all testy defensive matches are created equal. Some are driven by a successful underdog like Argentina stymieing a stronger attacking team that can’t figure out how to turn on overdrive and get the game out of the mud. It’s impossible to fault the underdog for doing everything in their power to claw their way to a point. Other dour matches, though, are brought to you by a favorite that’s decided to do just enough to win. Canada got their goal and then didn’t take a single risk while strangling the life out of Cameroon. The plan worked, and Canada, one of the stronger teams in the tournament, are quite good at executing it. Still, it’s hard not to wonder what the team might be with a bit more ambition. When underdogs win ugly it’s because they have to; when the favorites do, they’re making a choice.

Header image courtesy of the Press Association

Bayern Munich Begin Reloading

This summer’s transfer season is already off and running. Bayern Munich dropped a cool 80 million euros to bring in Lucas Hernandez from Atlético Madrid. The left footed French center (and left) back is the second young defender Bayern have acquired, joining with Stuttgart’s Benjamin Pavard as a pair of young talented defensive reinforcements for Germany’s top team.

This is likely just the beginning for Bayern. Their squad is old. Up and down the pitch this is a team in need of good young players to take the reins from the current generation. There is, in fact, such need in other areas of the pitch that it raises the question of whether acquiring Hernandez is the best use of resources given that, in addition to Pavard, Bayern also have Niklas Süle. Arguably the one place on the field Bayern were set was at center back; now they have a true embarrassment of riches.

But, given that Bayern have approximately infinity money, let’s assume that acquiring Hernandez won’t get in the way of doing a whole lot more work. Because this squad is going to need a lot of work.

 

This is the constant challenge for the best teams in the world. Competing at the highest levels means constantly finding the best players in the world at the height of their powers and then moving on from them quickly as they begin to decline. That means a constant churn as yesterday’s fresh faced 23-year-old babies become tomorrow’s 28-year-old grizzled veterans. Or, in the case of Bayern’s wingers, Arjen Robben is 35 and Franck Ribery is 36. The fact that those two stars are, at least in sporting terms, not just old, but rotting corpse old, has overshadowed that the rest of the first choice attacking unit is also past its prime. Robert Lewandowski is 30 and Thomas Müller is 29. Both of those guys are at the point where most players start getting worse, sometimes fast.

Bayern need to freshen up that unit in exactly the same way that acquiring Pavard and Hernandez freshened up the defence. Currently the heirs apparent are Kingsley Coman and Serge Gnabry. Coman is only 22 and has shown flashes of brilliance but optimism around him has to be tempered by a serious injury history and a consistent inability to make it to the field. Even if he has the ability to perform at an elite level on the field, relying on him week in and week out is a recipe for injury filled disaster.

Gnabry is a more difficult call. He’s 24 and just entering his prime. He does a lot of things well in and around the penalty area. He creates a respectable amount of average shots for himself, and does a wonderful job of creating great shots for his teammates. He’s also an able and willing defender. What he isn’t particularly proficient at is doing the work of moving the ball up the field. He wants to get the ball in advanced positions, not move it there himself. This has been challenging at times for Bayern who often look to suck teams into the middle before spraying it wide to advance the ball up the field and unsettle defences.

 

If Bayern believe in Gnabry, and expect him to start for them on a regular basis over the next three years they’ll need to make sure that he is part of a team that has lots of other players tasked with moving the ball forward and allowing Gnabry to create magic in the box. That player is unlikely to be a forward. As Lewandowski hits the wrong side of 30 his most likely replacement is rumoured to be Timo Werner. Werner is a lot of things. He’s great in space on the counterattack and perfectly proficient, although not as good as Lewandowski, at finding space in the penalty area (then again who is) but what he isn’t is a facilitator.

 

 

Bottom line for the attack is that it will require at minimum not just the purchase of Werner (or a similarly high profile, young, prolific striker) but additionally one very high quality winger to play the majority of minutes across from Gnabry.

One possible solution for a team looking to rely on its attackers to stay high and around the penalty area is to get a bunch of creativity from midfield. And Bayern is, of course, stacked with midfielders. But there too, they aren’t exactly young. James Rodriguez is in his prime at 27, and Thiago at 28 isn’t old, but won’t be around forever and is often hurt. While neither of them is an immediate concern this offseason (especially given the pressing need on the wings), they are both nearing the point where a succession plan needs to be considered.

And Bayern do have a lot of young midfielders floating around. The best of the bunch is Leon Goretzka who looks well up to the task of stepping into this midfield and starring for years to come combining unspectacular but necessary passing in possession with an ability to get forward into the box. Past him, Corentin Tolisso is entering his prime but also lost this entire season to injury. And in the defensive midfield spot, Javi Martinez might be a rock, but he’s a 30 year old rock.

This midfield unit wouldn’t be a concern if everything else on the team was settled. An attack brimming with creativity would balance out a midfield focused on getting and retaining the ball. A young and vibrant midfield could carry an attack focused on poaching in the box. The concern for Bayern is that both things could collapse at once. Fail to find a star on the left wing and a creator in midfield to lighten the load on Thiago, and the team could end up with a strike force that needs the ball delivered to them and a midfield incapable of providing it.

None of this is to say that Bayern doesn’t have a lot of young prospects. They do. From Renato Sanches to Alphonso Davies to their rumoured interest in Chelsea’s Callum Hudson-Odoi, Bayern are clearly interested in young talent. The challenge, though, is that when you’re Bayern Munich, prospects don’t cut it. They need players who are young and great at the same time. Davies is only 18 and maybe by the time he’s 22 he’ll be good enough to be a regular, but Bayern can’t afford to twiddle their thumbs while they wait. Talent development is important, but it has to come alongside fielding a team that’s one of the best in the world right now.

These are all champagne problems of course. Wondering whether the clearly excellent Gnabry is good enough to shoulder a large load for Bayern, or wondering whether they have enough creativity should Thiago get hurt and Tolisso not recover is basically wondering whether Bayern is a top tier Champions League contender or a fringe Champions League contender. And they’re going to be favorites to win the Bundesliga either way. But that’s the challenge of being Bayern. When you have the gigantic resource advantage they do, you don’t get to rebuild, you have to reload, and you have to do it fast.

Bayern have a lot of work to do. In Hernandez they added further strength to the strongest part of their squad. It’s how they strengthen the weaker parts that will determine whether they’ll merely be great, or continue to contend at the very top of the European pyramid.

 

Header image courtesy of the Press Association

Do Statistics Explain Kevin-Prince Boateng’s Barcelona Transfer?

Kevin-Prince Boateng? That Kevin-Prince Boateng? Are you sure? For real? Why?

It’s been a long and winding road for Boateng. While his younger brother settled down in Bayern Munich’s defense, Kevin-Prince has spent the better part of his career bouncing around Europe. The former Hertha, Tottenham Hotspur, Borussia Dortmund, Portsmouth, Milan, Schalke, Milan again, Las Palmas, Eintracht Frankfurt, and now Sassuolo player has finally made it though. At almost 32 years old he’s been loaned to the big time. He’s going (somehow) to Barcelona.

This isn’t Boateng’s first time playing for an elite team. He was, if not integral, at least involved with Milan the last time they were a true world power. He started 18 matches during the team’s 2010-11 Serie A title-winning season, and then 15 matches over each of the next two years as they finished second and third. But that was a soccer-playing lifetime ago. Since then he’s remade himself, shifting from an all-action midfielder to an unconventional striker. He has played the bulk of his minutes for Sassuolo in that role.

His baseline stats don’t really suggest there’s much to write home about though.

 

 

He doesn’t take very many shots; 2.64 per game is extremely mediocre for a forward. And the shots that he does take are terrible. His 0.07 xG per shot is literally off the charts bad for our radars. There’s no worse combination of outcomes for a striker than only being able to manage taking a small number of really unlikely shots.
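To see how little those two numbers add up to, here is a minimal Python sketch that multiplies shot volume by average shot quality. It uses only the two figures quoted above; the function name is made up for illustration and is not part of any StatsBomb tooling.

```python
def xg_per_game(shots_per_game: float, xg_per_shot: float) -> float:
    """Expected goals per game = shot volume x average shot quality."""
    return shots_per_game * xg_per_shot


# Figures quoted in the text for Boateng at Sassuolo.
boateng_xg = xg_per_game(shots_per_game=2.64, xg_per_shot=0.07)
print(f"{boateng_xg:.2f} xG per game")  # prints "0.18 xG per game"
```

In other words, on these numbers you would expect roughly one goal from his shots every five or six games.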

 

 

Perhaps there’s something else to his contributions though? Given that he is a former midfielder and an unconventional striker at best, maybe there are some distributional aspects to his play that the numbers fail to account for. Perhaps he’s dropping deeper and creating chances for runners in behind him, or he’s an integral part of an aggressive pressing team, where he defends from the front? If that’s the case it also doesn’t show up in the numbers. Here’s how he appears on the attacking midfield radar.

 

 

It’s a slightly larger blob, but it’s not actually more impressive. His average number of touches in the box for a striker shows up impressively on the attacking midfielder template, and his pass completion percentage is pretty high. But that’s about it. There’s nothing here that suggests he’s creating a lot of opportunity for others. His expected goals assisted from open play per 90 is an exceedingly low 0.09. If he’s doing something creative with the ball, it’s not showing up in the shots he’s creating for his teammates.

In Boateng’s defense, Sassuolo play the game at a very slow pace. They’re actually slightly more invested in possession than you might expect from Serie A’s 12th place club. They play 533 passes a game, tied with Roma for the sixth most in the league, and allow only 442, the sixth fewest. They’re happy to have the ball and not do a ton with it, as long as their opponent doesn’t have a chance to get the ball and attack them. It’s a necessary strategy because when they do give up the ball, they’re completely unable to stop opponents from attacking them.

Despite giving up only 13.30 shots per match, the ninth most in the league and a respectable total, the expected goals they’ve allowed from those shots is a mind-blowing 1.43 per match, tied for the fourth worst in the league. A brief look at their defensive activity map might serve to explain why.

 

 

Against that backdrop, Boateng’s defensive contributions from the top look pretty decent. He’s committed to harrying the ball around the halfway line even if the team behind him consists entirely of, well, not much of anything.

 

 

Squint and you can almost see the stylistic appeal for Barcelona. Boateng plays up front for a midtable team that plays slowly and methodically. They insist on working the ball out of the back and are extremely patient in possession. That’s sort of Barcelona-ish. And while they turn all that possession into a mediocre number of terrible shots (a process which Boateng is an integral and negative part of) presumably when surrounded by superstar teammates that won’t be nearly as large a problem.

And defensively you can definitely see a role for Boateng as a closer. If Barcelona have a lead, bringing in Boateng for somebody like Ousmane Dembele or even Luis Suarez to shift that emphasis from attacking to defending makes some sense. Boateng can do that. The trade-off will be that he’ll add much much (much much much) less on the attacking end, so much less that the trade-off may not be worth it.

The question remains though. Why? It’s true that if you torture the numbers and the scouting just right you can gin up a narrowly defined role that Boateng makes sense in for Barcelona, but he’s not the only player that would make sense in that role. It’s not hard to find players that will willingly defend from the front in a substitute role. It’s especially not hard to find ones that are under 30. And while getting Boateng assures that you’re finding an unconventional attacker who is used to playing in a possession-oriented system, it also assures that you’re getting an attacker who doesn’t give you much output in that system.

The benefit of being Barcelona is that there’s a lot of wiggle room to make mistakes at the margins. Going and getting Boateng on loan, whatever the reason, won’t hurt this squad overall. They’ve acquired better players for more money than Boateng who have flopped at providing frontline depth while not slowing the super team down (whither art thou Arda Turan). The main cost though is the opportunity cost. Bringing in Boateng to fill this role means not bringing in somebody younger and potentially better to fill that role.

Slice the numbers just right and it’s possible to make an argument that explains what role Boateng will fill at Barcelona. That’s fine, as far as it goes. But no matter how long you look you’ll never find a good reason for Barcelona going and getting a mediocre 32 year old to be the one to fill that role. That decision will remain a mystery.


Despite Manchester United’s Collapse, David De Gea Remains Strong in Goal

Statistics have told a clear story about Manchester United over the last couple of seasons. Their second-place finish last season was largely a mirage, a function of David De Gea playing an astounding season of football. They might have finished with 81 points and only conceded 28 goals, but it was simply unlikely to continue. That seems to have come to pass. Jose Mourinho’s team is currently sitting in eighth place and they’ve already conceded 21 goals through just 12 games. It’s the exact reversal that analytics predicted would come to pass. And yet, the story is more complicated.

Three things are all true at the same time. First, Manchester United’s numbers are significantly worse this year than they were last season. Second, Manchester United are no longer posting better results than their numbers indicate they should. Third, David De Gea is, despite that, still playing out of his gosh darn mind. Let’s take them one at a time.

United’s baseline defensive performance is deteriorating. Last season, United’s opponents notched 1.01 expected goals per match. That wasn’t very good by the usual standards of a team near the top of the table, but it wasn’t bad per se. It was fifth in the Premier League behind Manchester City, Liverpool, Tottenham and Chelsea, in that order. The problem was that United needed to be great defensively because their attack was also only fifth best in the league at 1.49 expected goals per match. They were solidly a fifth place team, but because they conceded so many fewer goals than expected, they finished second.

This season, they’re worse on both sides of the ball. Their attack has dropped down to 1.18 expected goals per match, tenth in the league, and their defense has dropped to allowing a scary 1.34 expected goals per match. There are twelve teams who have better defensive underlying numbers than United. Twelve! Last year United played like a fifth place team and finished second. This season, with their negative 0.17 expected goal differential per match they’re playing more like a 12th place team.

United’s numbers have gotten worse, but they’ve also stopped being able to outrun them. Last season United blew their expected goals numbers out of the water on both sides of the ball. They scored 67 non-penalty goals from 56.78 expected goals and conceded only 27 from 38.24. This year, the team is more or less in line with their numbers. They’ve conceded one more goal than they’ve scored, and their expected goal difference is a little over negative two.
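For a sense of just how far last season’s results ran ahead of the numbers, the gap between actual and expected goals can be worked out directly. This is a rough sketch using only the four figures quoted above; the helper name is hypothetical.

```python
def goals_minus_xg(goals: int, xg: float) -> float:
    """Actual goals minus expected goals (positive = more goals than expected)."""
    return goals - xg


# Last season's non-penalty figures as quoted in the text.
attack = goals_minus_xg(goals=67, xg=56.78)   # goals scored vs. xG for
defense = goals_minus_xg(goals=27, xg=38.24)  # goals conceded vs. xG against

print(f"attack: {attack:+.2f}, defense: {defense:+.2f}")
# prints "attack: +10.22, defense: -11.24"
```

That is roughly ten extra goals scored and eleven fewer conceded than the shots suggested, a swing of more than 21 goals in United’s favour over one season.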

How they’ve done it is interesting though. On the attacking side of the ball United, just like last season are on pace to better their numbers. They’ve scored more than their expected goals.

They’ve also conceded more than expected goals thinks they “should” have.

This looks straightforwardly like a team playing an open brand of football, and executing it at a not particularly effective level. That’s startling for Mourinho the dull, but it’s less surprising given the talent he has at his disposal. United have crummy defenders and good attackers, so a team basically playing at the level expected goals expects, while also being skewed towards attack and away from defense makes sense.

This also looks like a team without any noticeable standout goalkeeping. And that’s where things get weird. Because the numbers also show David De Gea once again having an amazing season. Last year it was easy to see De Gea’s contributions. The team faced 38.24 expected goals, and the shots that made it on target to De Gea’s net had a roughly similar value. Their post-shot expected goal value was 38.76. In other words, De Gea’s dominance last year was easily recognizable. The 12.76 goals above average he saved were directly reflected in United’s defensive performance against expected goals.

This year it’s way trickier. While United’s opponents have 14.10 expected goals overall, the set of shots that have reached De Gea have been considerably tougher. Post-shot expected goals shows De Gea as having faced shots worth 24.09 expected goals.

Ok, so what exactly is going on here? In effect what this shot chart is saying is that while on average the shots United are conceding will lead to a little over 16 goals, in practice opponents have caught the ball quite a bit better than average, leaving De Gea to deal with shots on target that will be scored a little over 24 times. Opponents are hitting the ball way above average, and De Gea is standing on his head just to keep United close to where expected goals thinks they should be.

If we look at this in terms of expected save percentage and actual save percentage it becomes clear that De Gea is basically maintaining last year’s level. Last season, our post-shot expected goal model gave De Gea an expected save percentage of 73.2% and he clocked in at 82.4%, giving him a goals saved above average percentage (GSAA%) of 9.2%. This season, post-shot xG has him with an expected save percentage of only 64.5% and an actual save percentage of 71.4% for a GSAA% of 6.9%. Despite a severe drop in save percentage, De Gea is still having an excellent, if slightly less superb, season than last year. It’s just masked by the fact he’s facing shots that are a lot more difficult to save than a regular xG model predicts they would be on average.
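The arithmetic behind those GSAA% figures is just actual save percentage minus expected save percentage. The sketch below assumes that definition, which is inferred from the numbers quoted above rather than taken from a published formula; the function and variable names are illustrative only.

```python
def gsaa_pct(actual_save_pct: float, expected_save_pct: float) -> float:
    """GSAA% as the percentage-point gap between actual and expected save percentage.

    Assumption: this simple difference matches the figures quoted in the text.
    """
    return actual_save_pct - expected_save_pct


# (actual save %, expected save %) for De Gea, as quoted in the text.
seasons = {
    "last season": (82.4, 73.2),
    "this season": (71.4, 64.5),
}

for label, (actual, expected) in seasons.items():
    print(f"{label}: GSAA% = {gsaa_pct(actual, expected):.1f}")
# prints:
# last season: GSAA% = 9.2
# this season: GSAA% = 6.9
```

His raw save percentage has fallen by 11 points, but the gap to what an average keeper would be expected to save has only shrunk by about 2.3.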

It’s important to be cautious when drawing conclusions about exactly why the xG discrepancy exists. It’s possible that it’s just noise, and that through 12 games United’s opponents have been catching the ball extra pure. Maybe United have been getting really unlucky, and De Gea’s magic has stemmed the bleeding so that it seems like they’re only getting a little unlucky. Football has a lot of moving parts, and the reason expected goals works so well is that even one part being very out of whack tends to be mitigated by everything else. So, De Gea playing great while United get unlucky so that from a distance things look mostly normal, is a reasonable possibility.

It’s also possible this is a further reflection of United’s poor defense. Maybe teams are teeing off on De Gea because United’s shoddy backline, and complete lack of a midfield, is letting them. That too makes some degree of sense. Expected goals against is a fairly accurate predictor of a team’s defensive performance so it wouldn’t be unreasonable to think that what it’s reflecting right now is a unit that consists of poor defenders and a great keeper awkwardly averaging each other out.

It’s also possible that this is a tactical effect. If United are playing more openly this season, and it seems like they are, then maybe that openness is giving opponents a chance to get better than average shots off. It would make sense that above average sets of shots, offsetting each other on the attacking and defending end would lead to a team ending up where they should be according to expected goals while also having exaggerated post-shot expected goals.

The reality is that we simply don’t know enough yet to advance any of these theories with any degree of certitude. What we do know is that our post-shot expected goal numbers show a goalkeeper in De Gea who is still playing his tail off, even as his team underperforms opponents’ expected goals. That in turn allows us to say pretty concretely that United’s disastrous plummet back to earth from their surprisingly lofty finish last year isn’t driven by De Gea’s return to earth. It’s everything else that’s collapsing.


Arsenal in Trouble: Results are Covering Serious Flaws for the Gunners

On the surface Arsenal waters are calm. While the team left it until late, they ultimately managed to put a couple in the net and defeat Watford 2-0. Usually beating Watford would not be a notable achievement, but given that Tottenham weren’t able to pull it off, and that the Hornets started the day above Arsenal in the table, the accomplishment becomes a little more impressive. The win was also Unai Emery’s fifth straight, after two opening defeats. Things seem to be looking up. Look a little deeper, however, and there’s deep cause for concern for the Gunners.

Let’s start by looking under the hood of that comfortable 2-0 win.

Well, that’s not great. The teams were mostly even until Watford took control of the match in the 67th minute with a flurry of activity. Arsenal managed to weather the storm, and then go ahead thanks to an own goal. But, what this clearly shows is that rather than Arsenal dominating the game and then finally pushing one past Watford, this match was a case of Arsenal hanging on, and then getting a fortunate bounce at the right time.

Other than the result, there wasn’t a lot for Arsenal to walk away from this match hanging their hat on. It wasn’t just the expected goal total of 1.95 to 1.34, but more prosaic measures as well. They were outshot 14-9. Arsenal managed just two shots on target to Watford’s five. Arsenal, as expected, dominated possession, completing 468 passes out of 591 attempted to Watford’s 232 out of 345. But those possession numbers are perfectly within acceptable ranges for Watford’s press and counter approach. And given the shot numbers, it’s clear the game was played stylistically in the manner Watford preferred. Even if Arsenal ended up sneaking out a win.
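If you want to turn those raw pass counts into the completion percentages you usually see quoted, the calculation is simple. Here is a small sketch using the four counts above; the function name is invented for the example.

```python
def completion_pct(completed: int, attempted: int) -> float:
    """Pass completion as a percentage of passes attempted."""
    if attempted <= 0:
        raise ValueError("attempted passes must be positive")
    return 100 * completed / attempted


# Pass counts quoted in the text for the Arsenal v Watford match.
arsenal = completion_pct(completed=468, attempted=591)
watford = completion_pct(completed=232, attempted=345)

print(f"Arsenal: {arsenal:.1f}%, Watford: {watford:.1f}%")
# prints "Arsenal: 79.2%, Watford: 67.2%"
```

Arsenal attempted far more passes and completed a higher share of them, which is exactly the kind of possession edge that did not translate into shots.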

Now, individual games are not dispositive and there are always mitigating factors. Alexandre Lacazette was denied a penalty early on that could have changed the course of the game. If Arsenal was playing from ahead then maybe everything looks different. And it’s only one game, and one at the end of a winning streak at that. Maybe a broader look would mitigate the concerns raised by Watford.

Except that the broader look turns up lots of concerns for this Arsenal side. Arsenal are conceding an eye-popping 1.50 expected goals per match. The only five teams with worse defensive numbers than Emery’s team are Brighton, Burnley, Fulham, Huddersfield, and West Ham. That’s bad for any Premier League side, let alone one that expects to challenge for the top four places. Their shot numbers are marginally better. They give up 14.14 per game, which is eighth worst in the league. That puts them at 0.11 xG per shot conceded, the fourth worst total in the league. A defense that doesn’t suppress shots, and also allows good shots, is not a good defense.
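The xG per shot figure falls straight out of the two per-match numbers: divide expected goals conceded by shots conceded. A minimal sketch, using only the figures above and a hypothetical helper name:

```python
def xg_per_shot(xg_per_match: float, shots_per_match: float) -> float:
    """Average quality of a shot: expected goals divided by number of shots."""
    if shots_per_match <= 0:
        raise ValueError("shots_per_match must be positive")
    return xg_per_match / shots_per_match


# Arsenal's defensive numbers as quoted in the text.
arsenal_conceded = xg_per_shot(xg_per_match=1.50, shots_per_match=14.14)
print(f"{arsenal_conceded:.2f} xG per shot conceded")  # prints "0.11 xG per shot conceded"
```

Put another way, the average shot Arsenal allow is expected to go in a little more often than one time in ten.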

These problems aren’t new, of course. They’re the same problems that Arsenal had coming into the season, and ones that new manager Emery was tasked with fixing. He certainly hasn’t been able to yet. The team seems set on trying to play a high line, squeezing up the field and taking away opponents’ space.

 

The problem is, just like last season, this defensive approach simply isn’t working. Arsenal’s defensive and midfield personnel aren’t good enough to make the system work. The left side in particular is a major problem. Granit Xhaka is not nearly rangy enough in midfield, and his poor defensive instincts amplify his lack of mobility. Nacho Monreal has aged past the point where he can comfortably defend in space. The motley crew of Arsenal center backs don’t do enough to pick up the slack when faced with that much swiss cheese in front of them. And while Lucas Torreira may yet become a good defensive midfield linchpin, there’s only so much any mortal not named N’Golo Kante can do.

Faced with those defensive deficiencies, Arsenal would need to be an intergalactic level attacking juggernaut for the team to compete at a high level. They aren’t. They’re good, but not good enough given the handicap they’ve been saddled with. They’re averaging 1.41 expected goals per match, which is sixth in the league, just a shade behind Bournemouth for fifth. They don’t actually take a ton of shots, only 13 per match, but they do take efficient ones.

The good news for Arsenal is that despite the problems, they’ve won five in a row and 15 points puts them just about where they need to be, even on points with Tottenham for fourth place. Arsenal’s results have kept them in the mix, but if something doesn’t change, then those results won’t keep coming.

How can Arsenal fix themselves? The obvious answer is by upgrading their personnel at the back, but that can’t happen until January at the earliest, and more realistically next summer. Until then, the team needs to figure out how to change their game to give the back half of the field some help. One obvious possibility is simply dumping Granit Xhaka in favor of a more defensive option. But, it’s unclear if Arsenal have such an option available. Matteo Guendouzi is the obvious choice but at 19, his defending is very active, but not positionally sound. It’s not immediately clear he’d make the defense better.

Another option is playing more conservatively in attack. This is probably the path that can pay the most immediate dividends. When Arsenal attack they are extremely aggressive. Not only are they functionally playing with two strikers now that Pierre-Emerick Aubameyang and Lacazette are both starting, but they are also using the very aggressive movement of Aaron Ramsey and choosing to get both fullbacks involved to provide width. All of that thrust is then conducted by Mesut Ozil.

A slightly more conservative approach, making sure Monreal doesn’t overextend his ability to recover, and playing an actual winger on the left to protect him might help stabilize things. Or, rather than play Ramsey at the 10, play a truer stay at home midfielder, to be part of a central three, adding an extra body to clog the field, and trusting the top notch attacking talent to get it done anyway (and since Ramsey’s contract is running down, preparing for life without him makes even more sense).

Right now, Arsenal’s possession is predicated on scoring. They want to have the ball to use it to score goals. Then, if and when they’re ahead, they are happy to try and defend without the ball. It doesn’t go well. Instead, this team should be looking to be more conservative in possession. Keep the ball, maintain a more conservative shape in possession, use all those passers to kill off games by running teams ragged. Be boring.

The last lingering problem there is that a high dose of defensive possession means having a goalkeeper who can play the ball with his feet. Keeping the ball means when it goes back to the keeper it comes back out again as a pass, not as a 50/50 proposition. This was obviously not Petr Cech’s game. But Cech went off injured against Watford. It’s now Bernd Leno’s time. He’s supposed to be more proficient with the ball. He better be. Arsenal’s season may depend on it.

Right now, despite the results they’ve managed to pull off, Arsenal simply aren’t working. They need to change what they’re doing if they’re going to make a real run at qualifying for the Champions League this season. They’ve still got time left to make that happen. But each passing week with the same old not good enough defence makes it less likely that the change will ever come. If it doesn’t happen soon, it probably won’t happen at all.
