The Dual Life of Expected Goals (Part 1)
“Let me explain… No, there is too much. Let me sum up” –Inigo Montoya
The great thing about running a football stats website is that you get to do things like devote thousands of words entirely to a single statistic, and there’s nobody to tell you not to. So, let’s get into expected goals: what it is, where it came from, and most importantly where it’s going from here. Lots of football fans have only experienced the good ol’ xG as a single game number, either included on the bottom of a TV scroll, next to shots, fouls and assorted other stats, or on Twitter as a pretty little shot map. That wasn’t what it was designed for, though. Single game xG is a useful tool (and one we here at International StatsBomb Headquarters are committed to making more useful) but it was originally developed for something entirely different.
Goals: The Only Stat that Matters
In the beginning there were goals. Just goals. That was the only thing that was counted. Whoever had the most goals won the most games. You play to win the games. Therefore, the only thing that mattered was counting goals. There were some exceptions of course. Charles Reep notably counted passes by hand well before the rest of the world decided to do the same. But, for the most part, people watched football and counted goals, and the years went by. Eventually, somebody decided to count the passes leading to goals as well. And voila, there were assists.
At that moment, at the dawn of statistical time, a schism was born. On a team level, the statistic of goals gives you more information than if you didn’t have it. Not only is it the way in which we keep score, but knowing a team’s goal difference also helps observers determine how good they are with more accuracy than if they only knew whether they had won or lost. On the other hand, knowing about a team’s assists doesn’t give an outsider any more information about how good the team is. There’s a reason that goal differential is a thing and assist differential isn’t.
Statistics, at their heart, serve two purposes. The first is predictive. What does knowing these numbers tell us about the future? Knowing how many goals a team scored and conceded makes people better able to predict how likely a team is to win future games. The second is descriptive. What do these numbers tell observers about how things happened? Assists are a descriptive statistic, and a useful one, but they aren’t especially predictive. If assists were zapped out of existence overnight there’d be very little impact on the world’s ability to predict the outcome of football matches.
That’s a tension that has always existed, and it’s one that remains at the heart of how the football world is increasingly using xG.
Shoot Your Shots
Before getting to modern statistical times, there’s one more stop to make. One of the first things that statisticians began regularly counting was the number of shots teams were taking. It’s an obvious statistic to look at, and it turns out that it’s pretty important. You cannot score (for the most part) if you do not shoot. This is not rocket science. It’s not even bottle rocket science.
As Ted and James talked about on the last StatsBomb podcast, the groundwork for looking at shots in football was laid in hockey. In hockey, shots served a clear descriptive purpose and also provided predictive utility. Shots in hockey are a pretty good way of describing who has possession. Descriptively, by saying teams have a lot of shots, you can also say that teams have a lot of the puck. Predictively they also have a lot of value. In hockey the best teams reliably take a lot more shots than their opponents, but it’s very hard to control how often the shots a team takes are scored. By measuring how many more shots a hockey team takes than their opponents, it gets easier to predict which hockey teams will do well in the future.
Those findings were applicable to football, but only in a limited way. The first major problem is that descriptively comparing shots is not a particularly good way to measure possession. The relationship between possession and shooting is a lot looser in football than hockey. (This will surprise nobody who has watched either sport for more than ten minutes. It’s mostly down to one sport being played with feet on grass and the other one being played with sticks on ice. Small things like that.) Using shots as a proxy for possession doesn’t really work. Broadly speaking, football uses passes played to measure possession, which is better, but not perfect.
Despite that, measuring shots is still pretty good as a predictive tool. Knowing how many shots a team has taken and conceded makes you even more able to predict how they’ll do in the future than if you only knew about their goals scored and conceded. That’s great. It’s also frustrating. The gap between shots’ predictive power and descriptive power makes it impossible to turn the information we get from shot differentials into anything resembling insight.
The information those stats contain does a pretty good job of explaining what will probably happen next, and a terrible job of explaining why. If a team is scoring a particularly high percentage of their shots, or is on a particularly cold run, looking at shot numbers doesn’t offer any answers as to why it’s happening. All that they have to offer is an assurance that it probably won’t continue.
One thing that’s important to note here is that just because these stats can’t provide a reason for the divergence between shooting and scoring doesn’t mean there isn’t one (or many). It just means that those reasons are incidental to predicting what comes next. That’s an answer that’s useful to only a very small group of people (mostly the ones looking to put a bet down). It doesn’t help people interested in understanding what’s going on, people like, say, coaches who have to make the hundreds of daily decisions which go into running a team.
And now, finally, we get to the good stuff.
What to Expect When You’re Expecting Goals
Using past shots to predict how well teams will do in the future is good. Further modifying that to factor in what type of shots teams are taking is even better. That’s, in effect, what xG does. Notably, what xG was not developed to do is accurately describe a single shot or a single game. Rather, it was designed to take lots of information, thousands and thousands of shots, synthesize it, and use that information to represent how many goals a team might reasonably be expected to score or concede given the types of shots they’ve taken and given up.
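To make that last step concrete, here is a minimal sketch of the aggregation, assuming each shot already comes with a model-assigned chance of being scored. The shot values and the function name are invented for illustration; they are not StatsBomb data or StatsBomb code.

```python
# Hypothetical per-shot scoring probabilities (invented values, not real data).
shots_for = [0.05, 0.12, 0.33, 0.08, 0.76]
shots_against = [0.04, 0.09, 0.21]


def expected_goals(shot_probabilities):
    """Add up per-shot scoring probabilities to get an xG total."""
    return sum(shot_probabilities)


xg_for = expected_goals(shots_for)
xg_against = expected_goals(shots_against)
xg_difference = xg_for - xg_against

print(f"xG for: {xg_for:.2f}")
print(f"xG against: {xg_against:.2f}")
print(f"xG difference: {xg_difference:+.2f}")
```

The same sum works over a game, a month, or a season. The longer the list of shots, the closer the total gets to the job xG was built for.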
This is good and useful information. There are ample studies showing how this process is better at predicting how a team will do in the future than pretty much anything else out there. It takes the old information, based purely on the volume of shots and improves it. It turns out that sometimes when a team is shooting better or worse than average it’s because on average they’re taking better or worse shots.
There are two problems with xG as currently constituted. The first is that, just like with a basic shot-based metric, teams frequently spend stretches of time doing better or worse than where the metric thinks they’ll end up. And, just like with shots, xG doesn’t offer many answers other than the (quite good) prediction that eventually that will stop. It explains part of what shots miss, but there’s still plenty of room left blank.
The problem of what xG might be missing in the short term is encapsulated by how it’s used for single games. It’s important to start off by saying that xG maps contain more information than pretty much any other form of quick glance game recap. But it’s not what xG was designed for. The total goals a team scores will often differ wildly from what xG predicts. Frequently this is by design. If a player misses a sitter, xG and actual goals should differ. That’s the point. The model is crediting the team for creating the chance, understanding that in the future creating those chances will lead to goals.
So, there’s a way in which single game xG totals differing from the result is a direct sign that the model is working. But there’s another reason they can differ as well. The value that an xG model assigns to any given specific shot is based on an average of past similar shots. So, it takes into account things like location, whether or not it’s a header, the kind of pass that led to the shot, and so on, mixes them all together and spits out a value.
The problem with averages is that they’re averages. Any single chance can differ significantly from that average. Because we know that xG works, and is quite predictive, we know that over the long run the ways those individual shots differ from the average more or less cancel each other out. But, during a single game, that definitely doesn’t happen. A team with a high xG total but no goals might have missed a bunch of good chances, or the chances they had might have been harder than the model predicts. Single game xG totals don’t differentiate between the two.
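A stripped-down sketch of the "average of past similar shots" idea might look like the following. It is a deliberate simplification: it groups shots by just two made-up features (a zone label and whether the shot was a header) and takes the share that were scored, whereas a real xG model would use far more information and a fitted statistical model rather than a lookup table. All of the data and names here are invented for illustration.

```python
from collections import defaultdict

# Hypothetical historical shots: (zone, is_header, scored). Invented data.
past_shots = [
    ("six_yard_box", False, True),
    ("six_yard_box", False, False),
    ("six_yard_box", True, False),
    ("six_yard_box", True, True),
    ("six_yard_box", True, False),
    ("six_yard_box", True, False),
    ("outside_box", False, False),
    ("outside_box", False, False),
    ("outside_box", False, False),
    ("outside_box", False, True),
]


def build_xg_table(shots):
    """Map each (zone, is_header) group to the share of its shots that scored."""
    totals = defaultdict(lambda: [0, 0])  # key -> [goals, shots]
    for zone, is_header, scored in shots:
        totals[(zone, is_header)][0] += int(scored)
        totals[(zone, is_header)][1] += 1
    return {key: goals / count for key, (goals, count) in totals.items()}


xg_table = build_xg_table(past_shots)

# A new header from the six-yard box gets the average of past similar shots.
new_shot_value = xg_table[("six_yard_box", True)]
print(f"xG for a six-yard-box header: {new_shot_value:.2f}")
```

Every new shot in a group gets the same value, no matter what actually made that particular chance easier or harder than its historical cousins.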
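A quick back-of-the-envelope sketch shows why the single game total can’t separate those two stories. Assume, purely for illustration, a team takes five shots that the model values at 0.3 each, and that shots are independent of one another (a simplifying assumption). In one version of the game the chances really were that good; in the other they were only half as good as the model thinks. All numbers are invented.

```python
import math


def prob_no_goals(shot_probabilities):
    """Chance that every shot is missed, treating shots as independent."""
    return math.prod(1 - p for p in shot_probabilities)


# What the model assigns to each of five shots (hypothetical values).
model_values = [0.3] * 5

# Two possible realities hiding behind the same model values.
truly_average = [0.3] * 5   # chances were as good as the model says
truly_harder = [0.15] * 5   # chances were harder than past similar shots

single_game_xg = sum(model_values)
p_blank_average = prob_no_goals(truly_average)
p_blank_harder = prob_no_goals(truly_harder)

print(f"Single game xG either way: {single_game_xg:.2f}")
print(f"Chance of no goals if chances were average: {p_blank_average:.3f}")
print(f"Chance of no goals if chances were harder: {p_blank_harder:.3f}")
```

Both games show up with the same xG total, but a scoreless afternoon is far more likely in the second one. The total alone can’t say which game was actually played.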
Luckily StatsBomb can help with that problem. To find out how, stay tuned for part two.