This weekend, @7AMKickoff published a piece attacking the concept of adjusting defensive stats for possession. The piece was a bit dickish, but there were elements of it that deserve a reply, so I’ll do that today. For reference, here is my original look at possession adjusted defensive stats.
“I’ve done a fair number of regression analyses and I would probably never publish a .40 much less make some of the sweeping statements that Ted makes.”
For starters, I think the piece reads as a fairly cautious look at new research, not something that makes sweeping statements, but I guess mileage may vary. To address the regression bit, obviously I'm fairly well versed in statistics myself, so why would I publish something with just a .4?
There are two primary reasons. The first deals with statistical relevance in complex systems and the second has to do with the relevance of base defensive stats themselves.
Let’s quickly deal with statistical relevance. While it’s generally true that you would prefer to explain everything with just one figure, in most real world examples that’s impossible. After I linked to the 7AM piece this weekend, a number of social scientists spoke up saying they would often be happy to explain .10 to .20 of variation, while .40 is actually fairly useful. There are certain metrics that can have up to .80 r-squared in explaining total goal difference (depending on what adjusted shots model you use), but explaining smaller pieces of the puzzle often gets hard, fast.
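For anyone who wants to check what a figure like .40 means, here is a minimal sketch of how r-squared can be computed for one stat against one outcome. It assumes a simple one-variable linear fit, where r-squared is the square of the Pearson correlation. The function name `r_squared` and its inputs are illustrative only, not the code behind the numbers quoted in this piece.

```python
def r_squared(xs, ys):
    """Share of variation in ys explained by a straight-line fit on xs.

    Assumption: a single-predictor linear fit, where r-squared equals
    the squared Pearson correlation between xs and ys.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of squared deviations and the cross-product of deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no variation to explain")

    return (cov * cov) / (var_x * var_y)
```

A value of 0 means the stat tells you nothing about the outcome in a linear sense, and 1 means it explains all of the variation. Calling it on a team-level stat and shots conceded would give the kind of figure discussed here.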
Football is an extremely complex game, and defense in particular is a very complex system with multiple potential fail points, covering defenders, presses, low blocks, etc. (Unless of course, your manager is ‘Arry Redknapp, where your players just go fackin’ run about a bit.) When faced with complex systems, especially when just getting started, any additional relevant explanation is useful.
To put it another way, we know shots matter.
How does one prevent shots? That's a surprisingly tricky question to answer. Or at least it has been for me, and I've been frustrated about this for a while.
Now for the second point… Did you know that by themselves, defensive stats like tackles and interceptions show zero correlation to anything useful? It's true. They are just numbers on a page. They don't correlate to shots against, goals against, goal difference, points, nada. This is despite them being an intrinsic part of the game, and the method by which most teams get the ball back.
But if you adjust for possession? Now you get the 40% explanation of variation. Going from absolutely no relevance to explaining 40% of important things seemed useful enough to write about. Keeping track of adjustments that have SOME explanatory power while continuing to search for better ones seems worthwhile.
That's the phase that football analytics is in right now. It's annoying to know that a lot of things you write about right now will be obsolete in a week, or a month, or in a year, but that's part of the progression. If I talk about p-adjusted stuff now, maybe someone else will go down that line of research and create adjustments that account for 60 or 80% of the variation in shots conceded.
"Possession is a measure of offensive dominance."
It's really not, and I never suggested it was. For the most part, I ignore possession stats completely, as it seems like a relic of the tiki-taka Barcelona era and little more. It is, however, a useful measure for evaluating who had the ball more and made more passes, which in turn is tied to the opportunity to make defensive actions.
Break it out: The opportunity to make interceptions is tied to your opponent making passes. The opportunity to make tackles is directly tied to your opponent having possession of the ball.
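To make the opportunity argument concrete, here is a minimal sketch of one way to adjust a defensive count for possession. It assumes a plain linear scaling to an even 50/50 split of the ball, which is a simplification chosen for illustration and not necessarily the adjustment used for the numbers in this piece. The name `possession_adjust` and the `baseline` parameter are hypothetical.

```python
def possession_adjust(actions, own_possession, baseline=0.5):
    """Scale a defensive count to what it would be at a baseline opponent share.

    Assumption: opportunities to tackle or intercept grow linearly with the
    opponent's share of possession. own_possession is a fraction in [0, 1).
    """
    if not 0 <= own_possession < 1:
        raise ValueError("own_possession must be a fraction in [0, 1)")

    # The opponent's share of the ball is the opportunity to defend.
    opponent_share = 1.0 - own_possession

    # Re-express the raw count as if the opponent had `baseline` of the ball.
    return actions * baseline / opponent_share
```

Under this assumption, a team with 60% of the ball that makes 20 tackles gets credited with about 25, because it had fewer chances to make them, while a team with 40% of the ball making the same 20 is scaled down to under 17.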
This seems to be what 7AM disagrees with most, but to me it’s fairly clear. Just because every team doesn’t actually try to tackle or intercept the ball in all areas of the pitch doesn’t change the fact that these actions are tied together.
Is possession adjustment imperfect? Absolutely.
I never suggested it to be otherwise. However, it lends statistical meaning to stats where none existed before.
For me, that was enough to force me to change fairly significant amounts of existing code to include them when looking at player stats on the defensive side of the ball.
Is possession adjustment wrong (and um wrong)?
That seems like a value judgment, so I guess it’s for you to decide. I can tell you that p-adj stats are already on the fullback radars and will be used in the other templates soon, so they will be heavily incorporated in my work and the radar player charts I produce. What other people choose to do with them is out of my hands.
At the end of the day, I’m all for other people trying to adjust defensive stats to provide better explanations of metrics we care about. The only issue here is that they a) have to pass the theory barrier, and b) have to add statistical relevance. Do both and surpass what my initial attempt has done, and I’m sure the world will quickly adopt the new approach as more correct. In the meantime, I’ll keep using imperfect stats as opposed to irrelevant ones.