Offense is always easiest to figure out. In Moneyball (the book, not the ever-vaguer Idea) the A’s essentially ignored defense to take advantage of an easy-to-measure offensive stat that was undervalued. Baseball didn’t really even have reliable defensive stats until the past few years and the public ones still come with much larger error bars than offensive ones.
The NBA is probably moving the quickest toward accounting for defense, but it’s still an area where we know far less than we do about scoring the ball. The analytics community in soccer has made great progress looking at strikers and team shooting as a whole, but the opposite side hasn’t seen similar progress. This is mainly because offenses dictate the game in a way defenses cannot, and simple shot totals get you a lot of the way there on offense (an R² of .55 between shots and goals) while on defense the gap remains large (an R² of .33). A tweet from StatsBomb founder Ted Knutson about trying to find what good defensive teams actually do sparked this dive into trying to find that out.
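For anyone who wants to run the same kind of check, here is a minimal sketch of how an R² between shot totals and goals can be computed. It assumes a simple one-variable linear fit, where R² is just the squared Pearson correlation. The numbers are made up for six hypothetical teams and are not the data behind the .55 and .33 figures.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, i.e. R^2 of a one-variable linear fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)

# Made-up season totals for six hypothetical teams (NOT real data).
shots_for = [420, 510, 380, 600, 450, 530]
goals_for = [40, 55, 38, 70, 52, 50]

print(round(r_squared(shots_for, goals_for), 2))
```

Swapping in shots allowed and goals allowed gives the defensive version of the same number.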
SPOILER: I haven't solved defense, but there is incremental progress and interactive stuff below so don't leave, please. If you want to just see leaderboards and then check your team on the defensive dashboard, you can jump ahead to the end.
The Holy Grail of a Single Number Is In the Future
The main result of trying to find the single thing that explains defense is that you quickly see there is no secret sauce to judge a team's defense on right now. I thought teams that force opponents into a high ratio of shots per deep pass would allow a low goal/shot rate because it indicated opponents had few options, but that didn't really work out. I then thought the % of deep passes completed would be a clear indicator that teams couldn't cope, but it's messy as well. Shots from inside 10 yards? The ability to stifle a midfield? All explain bits and pieces, but there are generally exceptions to everything and these categories are subject to wilder swings than offensive numbers. For now, a wider, more descriptive view of a team's defense is better than trying to find the single number to describe a team.
Takeaway #1: Team defense can take many forms; to get a feel for how a team is playing, multiple metrics should be involved
They Can't Score Without the Ball
Dividing the two sides of the ball is complicated and possibly counterproductive in a sport without clear-cut possession changes. In football, baseball, and basketball the other team can possess the ball much more cleanly and evenly and your defense generally has to do the same amount of work as your offense. In soccer this clearly isn’t the case which is why if you want to avoid conceding goals, your #1 priority should be possessing the ball.
The more passes and completions your team makes, the lower the chances of conceding. That is a pretty obvious statement, but one that is almost impossible to get around statistically. If you want to make a good model for goals allowed using any kind of metric you want, passes for is going to be hard to displace as one of the most significant variables. Interestingly, passes for have a stronger relationship than passes allowed. I’d guess that the more passes you make, the more you push the opponent back out of position for the attack and tire out their legs. It's common to disregard possession nowadays as kind of a useless stat, which it may be, but it's still one of the best ways to judge a team's defense.
This will be the only offensive factor I look at in this article, but I think the interaction between the two is a rich area for research. Something like Deep xG's look at attacking width and length can easily be expanded to see where teams' offensive possessions tend to end and how a defense can be constructed to play off that. A team like Ingolstadt is ending possessions further upfield than most Bundesliga teams. This isn't "defense" per se, but forcing the other team to cover extra yardage can only help.
Takeaway #2: The more you have the ball, the fewer goals you allow.
Splitting the game up?
I found one of the best ways to see how a team is working is to split up defense into two categories: open field and goalmouth. Things that happen out in the wide expanses of the midfield generally rack up huge sample sizes and stabilize quickly. It's easier to see a team's philosophy when looking at open play stats like how intense their high press is, how much they have the ball, how strong their midfield D is, the length of passes they force their opponents to play, and what proportion of passes are played into the red zone (20 yard radius of goal).
As always enjoy Sampdoria's amazing logo, best in Europe.
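To make the red zone definition concrete, here is a rough sketch of how the share of passes played into it could be computed. The coordinate system (yards, with the center of the goal at the origin) and the function names are my own assumptions for illustration, and the pass end points are made up.

```python
import math

RED_ZONE_RADIUS = 20.0  # yards from goal, per the definition in the text

def in_red_zone(x, y, goal=(0.0, 0.0)):
    """True if the point (x, y) lies within 20 yards of the goal center."""
    return math.hypot(x - goal[0], y - goal[1]) <= RED_ZONE_RADIUS

def red_zone_share(pass_end_points):
    """Proportion of passes whose end point falls inside the red zone."""
    inside = sum(in_red_zone(x, y) for x, y in pass_end_points)
    return inside / len(pass_end_points)

# Made-up pass end points (NOT real data), in the assumed coordinates:
# x = yards out from the goal line, y = yards left/right of the goal center.
passes = [(45.0, 10.0), (12.0, -5.0), (18.0, 15.0), (30.0, 0.0)]

print(red_zone_share(passes))  # 0.25, only (12, -5) is within 20 yards
```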
At the goalmouth we can look at more familiar metrics like shot distance, % of shots inside 10 yards, % of dangerous passes turned into chances, SV%, SOT%, BLK%, and a few like shots allowed per deep completion and total passes per shot that either stabilize very quickly or are very descriptive. I kind of view the open field category as what teams set out to do (stop their opponents before it gets too dangerous) and the goalmouth as how they deal with the dangerous situations.
Arsenal's rate of chances allowed stands out as unsustainable. I expect a higher rate of chances conceded going forward if they face the same amount of passes.
Takeaway #3: Looking at how teams defend in different segments of the field can be revealing
Technical notes: which of these things are controllable?
Let's start with maybe the most interesting pair: shot distance and % of shots inside 10 yards. Shot distance correlates year on year with an R² of .48, showing that teams generally exert a solid amount of control over shot quality. However, % of shots inside 10 yards correlates at an R² of just .22. Why is this important? Shots inside 10 yards account for 40% of all goals despite being just 15% of all shots. The lower R² could indicate that teams who allow more than their share of these shots are a little unlucky. Taking Leicester as an example: this year 28% of their shots allowed have come inside 10 yards, which is more than 3 standard deviations above the average of 14.6%. Their average shot allowed doesn't come from super close in, which makes me suspect they have been the recipient of some bad luck. A chunk of those chances that were something like .35 xG could have easily been .15 or so.
Shot distance is obviously related to % of shots inside 10 yards and it's equally related to shots per deep completion. Shots per deep completion is simply a measure of how many shots a team concedes for each pass completed inside the 20 yard radius of the goalmouth. It is a very consistent ratio from half to half of a season (an R² of .7). Teams with a higher shot/deep completion ratio (last year's top 3 in the EPL: Chelsea, City, Arsenal) tend to allow lower quality shots. It's a number that quickly tells you what kind of shots a team faces: a low number means there are lots of passes played around the box for each shot and a high number means teams are firing on sight.
Passes per shot is a way of showing how hard teams have to work to get a shot off. Man City opponents need an average of 56 passes to get off a shot while Champions League rivals Sevilla allow a shot every 23 passes. This is another stat that can quickly conjure up an image of a defense. Sevilla are basically handing opponents a clear lane to the goal while City make you really earn each shot.
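The arithmetic behind those two claims can be sketched quickly. The 40%, 15%, 28% and 14.6% figures come from the paragraph above. The league standard deviation is not given, so the 4.0 percentage points used below is purely an assumed value for illustration (anything below about 4.5 is consistent with "more than 3 standard deviations").

```python
def relative_conversion(goal_share, shot_share):
    """How many times more often a group of shots scores than all other shots."""
    inside = goal_share / shot_share                # goals per shot, relative scale
    outside = (1 - goal_share) / (1 - shot_share)   # same for every other shot
    return inside / outside

def z_score(value, mean, sd):
    """Number of standard deviations a value sits above the mean."""
    return (value - mean) / sd

# 40% of goals from 15% of shots (figures from the text).
print(round(relative_conversion(0.40, 0.15), 2))  # 3.78

# Leicester: 28% vs. a 14.6% average. ASSUMPTION: sd = 4.0 is hypothetical.
print(round(z_score(28.0, 14.6, 4.0), 2))
```

In other words, a shot from inside 10 yards goes in nearly four times as often as a shot from anywhere else, which is why a noisy stat like this still matters.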
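Both ratios above are simple divisions, and a small sketch shows how far apart the two teams end up. The 56 and 23 passes-per-shot figures are from the text. The 500 opponent passes is a made-up round number used only to put both teams on the same scale, and the function names are my own.

```python
def passes_per_shot(opp_passes, shots_allowed):
    """Opponent passes needed, on average, to produce one shot."""
    return opp_passes / shots_allowed

def shots_per_deep_completion(shots_allowed, deep_completions):
    """Shots conceded per pass completed within 20 yards of goal."""
    return shots_allowed / deep_completions

def expected_shots(opp_passes, passes_per_shot_rate):
    """Shots you would expect to face given a passes-per-shot rate."""
    return opp_passes / passes_per_shot_rate

# ASSUMPTION: 500 opponent passes is a hypothetical workload, not real data.
print(round(expected_shots(500, 56), 1))  # 8.9  (Man City's rate)
print(round(expected_shots(500, 23), 1))  # 21.7 (Sevilla's rate)
```

Facing the same number of passes, Sevilla's rate works out to well over twice as many shots as City's.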
Takeaway #4: I might be wrong, heck I am probably wrong
I did something similar to this last summer and it was a very useful exercise. It identified Sarri while he was at Empoli as a coach to watch simply through their style of play and gave a much clearer picture as to which teams did things well. I said then that it wasn't any sort of final classification and it wasn't: the metrics used to evaluate a team's style are better now. I've tested them more, split them up, removed some, and added more. This is a more meaningful exercise than that was, but there is plenty of room to improve further. Next week I'd probably do things a little differently than now, but I feel comfortable that this way of looking at a defense gives me more info at a glance than anything else.
Defensive dashboard info
Click here for the dashboard. See all your team's metrics and how they stack up with others around the Top 4 leagues.
You need to make a copy so you can edit without messing with others.
You can choose your team from the dropdown menu in the top left. That will bring up the z-score for each team in a number of the metrics described above. These are separated into goalmouth and open field. There will be a list of the 5 most similar defenses in each of the two categories. This similarity score is similar to what Baseball Prospectus does for their player projections and 538 does for their NBA player projections, just applied to teams. To get these I used agglomerative clustering to find the Euclidean distance between each pair of teams. The distance with a legend is on the sheet as well. Green is generally "good" for a metric, though there can be reasons why something is red and of course some metrics are more important than others. All the info is there though so you can get a fuller picture.
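For a feel of how the z-scores and similar-team lists fit together, here is a rough sketch. It is not the dashboard's actual method: it skips the clustering step and simply ranks teams by plain Euclidean distance between their z-score vectors. The team names, metrics and values are all made up.

```python
import math
from statistics import mean, pstdev

def z_scores(table):
    """Convert {team: [metric values]} into {team: [z-scores]}, metric by metric."""
    teams = list(table)
    n_metrics = len(table[teams[0]])
    stats = []
    for i in range(n_metrics):
        column = [table[t][i] for t in teams]
        stats.append((mean(column), pstdev(column)))  # pstdev must be non-zero
    return {t: [(table[t][i] - m) / s for i, (m, s) in enumerate(stats)]
            for t in teams}

def most_similar(team, z_table, k=5):
    """The k teams whose z-score vectors are closest to the given team's."""
    dists = [(math.dist(z_table[team], vec), other)
             for other, vec in z_table.items() if other != team]
    return [name for _, name in sorted(dists)[:k]]

# Made-up metrics per team (NOT real data):
# [avg shot distance allowed, passes per shot, share of shots inside 10 yards]
raw = {
    "Team A": [10.0, 50.0, 0.30],
    "Team B": [11.0, 52.0, 0.31],
    "Team C": [20.0, 30.0, 0.45],
    "Team D": [15.0, 40.0, 0.38],
}

z = z_scores(raw)
print(most_similar("Team A", z, k=2))  # ['Team B', 'Team D']
```

Converting to z-scores first matters because the metrics live on very different scales. Without it, whichever metric has the biggest raw numbers would dominate the distance.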
A few of my favorites.
Chelsea look fine in the open field, keeping pass attempts into the red zone well below average, pressing well and holding the ball. This is reflected in their similar teams of Roma, Liverpool, and Arsenal. In the goalmouth metrics we see a lot of red. Those pass attempts are turning into chances and those chances are going on target at crazy rates.
Augsburg, Sunderland, and Newcastle are not what a Champions League team should see in their similarity scores. Sevilla are not contesting anything in midfield, opponents are playing their dangerous passes from extremely close in, and we can see from the goalmouth metrics that they are allowing an incredibly high number of shots per pass. Their problems seem much harder to fix and more widespread than Chelsea's.
Sometimes similarity scores don't tell you much. If you can see the legend, you can see that Bayern has essentially no comparable teams. And if you see RZ Passes Length you can see they are a staggering 4.4 standard deviations above average.
You can imagine the benefit that comes from opponents passing from Bayern's origin (highlighted line) vs the majority of the Bundesliga, which is significantly closer.
Who is the Sarri of this year?
By this year's Sarri I mean a manager who is taking a small side and having them play like a team with a much larger budget. The best candidate in the early going is Eddie Howe at Bournemouth. James wrote about them some here, showing their outrageously low SV% through 12 games. That probably involves some bad luck, but Howe has his team playing like a top team out in the open field.
So how do we measure defense? Uh, a lot of different ways. Keeping the ball, making opponents work hard for shots and dangerous passes, forcing shots from further average distances, and keeping opponents from completing lots of passes around the box for each shot are all pretty reliable ways to stop other teams. Blocking shots, saving shots, keeping shots off target, allowing few sub-10 yard shots, and allowing a low chance% on passes into the Red Zone are not as reliable but reliable enough that we need to look at them. So, that's all cleared up, right? Who's up for rating individual center backs?