There’s been a lot of really good work done on shot positions recently, much of which is both clever and tremendously useful. The short recap is that shots taken close to the goal in the center of the penalty area are good. Shots taken close and wide are pretty bad (due to the angle of goal space available to shoot into), and shots taken from outside the penalty area are also pretty bad and get worse as you move further out. Obviously the location of the shot itself only tells you part of the story. Like most of the analysis we can do right now, shot location is an abstraction. It’s a valuable one, but it only accounts for half of the offense vs. defense equation. Consider for a moment, which of the following two shots is more likely to result in a goal? Figure 1 takes place in the 20-25% conversion range. It’s central and inside the penalty area, which is good. If all of a team’s shots were taken from this position, they would manage to do very well at converting shots to goals during the course of the season. Obviously this is complicated by the fact that there are two players in immediate blocking distance that limit the available angle the shooter has to put the ball in the goal. Figure 2 is a shot from the exact same area, except the player has somehow beaten the defensive line. Most estimates say this shot is converted at closer to a 40% rate, which is about as good as it gets. The only limitation on the angle the shooter has is where the keeper is positioned at the time. That means the shot in figure 2, despite being taken in the exact same area as the shot in figure 1, is about twice as valuable as the first one. You run into this same problem with headers in the area. Very few headers are actually “free” where a defensive player isn’t doing something to at least put a body on the shooter and throw off their aim. But headers that are free are usually in the center of the penalty area and within 12 yards of the goal. Free headers are enormously valuable and have a high probability of scoring goals, if you can get them. That’s the issue with data abstraction. As a “shot,” they all get lumped in to the same areas and they look the same, despite the fact that one shot will be twice as valuable as the other, depending on where the defenders are located. That’s also one reason why I say positioning is everything. One Moment In Time Right now, a good baseball model can look at a number of inputs and give an extremely accurate probability of run scoring. If you take the current pitcher’s stats, the current batter’s stats, whether and where there are men on base, and the park factor, you come up with a figure that tells you the likely amount of runs scored for this at bat. Add in the rest of the batting order, and you can calculate that on a batter by batter basis throughout the course of a game. What does this have to do with football? Football is in constant motion. Baseball has static stops and starts (each pitch). They are not remotely the same game. But what if you sliced football into individual moments in time? Take any moment in time where the ball is in the attacking third, look at the current ball position and the positions of the defenders. Then analyse the probability of a goal being scored by a shot in that location. Do that with enough shot and position data, and you construct a model that tells you how likely any shot is at creating a goal based not only on shot location, but also on defender location. 25 shots per game… 380 games per league per year… it would take about two years to reach 100K shots from the big five leagues and have an enormous database of shot success vs defender positions to analyse. Why would you want to do this? Because it teaches you what situations actually yield the best chances. We don’t have to guess anymore, we can know what is most likely to happen. If a good shooter has 2 yards of space to take a shot at 20 yards, is that a good shot? Is it better than the shot in figure 1 above? What about scenarios where a player cuts to the byline and then crosses back to the center? Is it better to have a near post runner or a far? How about a near-post and a penalty spot filler? With enough game and positional data, you can answer every one of those questions with an actual goal likelihood. The value of this information to managers is immense. Every single thing you learn here is a coaching point, not just at the top level, but right down to the grass roots. Baby Steps into Really Big Steps If you feel like a lot of what we are learning in the stats community is exceptionally basic, that’s because it is. Not because the people analysing the data are dumb, or because they don’t understand how to do things more complex – there are a scary amount of post-graduate degrees donking around with football stats during their free time. The reason a lot of analysis seems pretty basic compared to what is happening in other sports, is that up until recently, there has been no data available to the public. Now that we have some data available, we’re taking baby steps, expanding the knowledge literally every day with new, useful information. But this type of modelling and analysis… that’s the end game. Some of the data companies have positional data on everything that happens on the football pitch going back for years. So the cool part is, it’s doable right now for teams or Football Associations that have the money, the talent, and the willpower to make it happen And positioning – plus analysis of how shot location position vs. defensive position affects goal probabilities – is everything.