Splitting Possession into Offense/Defense

Does Possession % Matter?

Any analytically inclined soccer fan (a.k.a. you) is probably well aware of the limits of possession % as a meaningful metric. In fact, its faults are so numerous and well documented that the ubiquitous ironic mentions of "but what about possession?" every time Barcelona loses have (mostly) stopped. I understand the collective derision, but if we look at the metric more deeply, can we glean some interesting information? I think so.

One thing that does need stating is that there is a relationship between possession % and points (at least in the EPL; see graph below).

[Figure: EPL possession % vs. points]

The causes of this relationship are complex and difficult to disentangle, but probably the best way to think of possession % is as a symptom of playing winning football rather than the cause, though of course sometimes it is the cause! Confusing! A must-read on this subject is Devin Pleuler's interesting take on possession as a defensive weapon.

How is Possession % Calculated?

Based on some good work a couple of years back by Graham Macaree, we know that the possession % most media outlets use is really just a pass ratio. The pass ratio approach is pretty simple: team possession % = team’s total passes / both teams’ total passes. This methodology was confirmed to me by an Opta employee. We can debate the merits of this approach until we are blue in the face, but for many sensible reasons I think it is probably the best proxy.
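As a rough sketch, the pass ratio can be written in a few lines of Python. The function name is my own, and the text does not settle whether the counts are attempted or completed passes, so the function simply takes whatever counts you hand it.

```python
def possession_pct(team_passes: int, opponent_passes: int) -> float:
    """Pass-ratio possession: a team's share of all passes in the match, in percent."""
    total = team_passes + opponent_passes
    if total == 0:
        # No passes recorded for either side, so the ratio is undefined.
        raise ValueError("no passes recorded for either team")
    return 100.0 * team_passes / total


# A team that makes 600 of the 1,000 passes in a match has 60% possession.
print(possession_pct(600, 400))  # 60.0
```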

Splitting Possession % into Offense/Defense

Not all pass ratios/possession % are created equal. For example, let us assume that an average EPL match sees 900 passes between the two teams (450 for each team). On this particular match day Arsenal outpasses Swansea 600-400 (60%-40%). Across town, West Ham outpasses Crystal Palace 300-200 (60%-40%). Both Arsenal and West Ham have the same possession % (60%), but they have achieved it in vastly different ways. By comparing their passing numbers to the league average, we can essentially allocate Arsenal and West Ham’s 20-percentage-point possession advantage (60%-40%) to an offensive and a defensive component, as demonstrated below. You start by comparing how many passes each team attempted and allowed to the league average. Arsenal, in this example, were 150 passes above an average offense (600-450) and allowed 50 fewer passes than an average defense (450-400). West Ham, by contrast, were 150 passes below an average offense (300-450). But West Ham makes up this difference by allowing 250 fewer passes than an average defense (450-200).

[Figure: possession differential example]
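The arithmetic in the hypothetical can be sketched as below. This works in raw passes only; the step that converts the two components into percentage points of possession was shown in the original graphic and is not reproduced here. The function and constant names are illustrative assumptions.

```python
LEAGUE_AVG_PASSES = 450  # per team per match; the hypothetical figure from the example


def split_pass_differential(passes_for, passes_against, league_avg=LEAGUE_AVG_PASSES):
    """Split a team's pass differential into offensive and defensive components.

    offense: passes made above (+) or below (-) the league average
    defense: passes allowed below (+) or above (-) the league average
    The two always sum to passes_for - passes_against.
    """
    offense = passes_for - league_avg
    defense = league_avg - passes_against
    return offense, defense


for team, made, allowed in [("Arsenal", 600, 400), ("West Ham", 300, 200)]:
    off, dfn = split_pass_differential(made, allowed)
    print(f"{team}: offense {off:+d}, defense {dfn:+d}, total {off + dfn:+d}")
# Arsenal: offense +150, defense +50, total +200
# West Ham: offense -150, defense +250, total +100
```

Note that both teams sit at 60% possession even though their raw pass differentials differ, which is exactly why the split is informative.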

That was a hypothetical, but what does this approach look like for this year’s EPL? (stats are two weeks old)

[Figure: EPL possession differential]

Talk about a tale of possession haves and have-nots. The gap between the #1 possession team (Swansea) and the #10 team (Chelsea) is smaller than the gap between Chelsea and the #11 team (Newcastle)! Another thing that jumps out is the comparison between Southampton and Arsenal: both have similar possession numbers, but they get there in very different fashions, Arsenal with offense and Southampton with defense.

You might also notice the larger variance in the offensive component compared to the defensive component. This makes sense: a team faces a variety of passing styles over the course of the year, but its own offensive style is more persistent. Running some regressions (based on the past five years of EPL data, 100 team-seasons) backs this up, as the offensive component has a much stronger correlation with total possession differential than the defensive component does. Interestingly, while you might expect a strong relationship (R^2 > 0.7) between the offensive and defensive components, the R^2 was only 0.49, which I think demonstrates that this exercise of decoupling possession into offense/defense has some merit.

[Figures: offense/defense R-squared; offense vs. defense]
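For anyone who wants to run this kind of check on their own data, here is a minimal sketch of the R^2 calculation. For a regression with a single predictor, R^2 equals the squared Pearson correlation, which is what the code computes. The numbers in the usage lines are made-up toy values, not the EPL data behind the charts.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """R^2 of a one-predictor linear regression (squared Pearson correlation)."""
    return pearson_r(xs, ys) ** 2


# Toy offensive/defensive components in passes per match (illustrative only).
offense = [150, -150, 40, -60, 10, 90]
defense = [50, 250, -20, -30, 5, 60]
print(round(r_squared(offense, defense), 2))
```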

Shots and Key Passes are Better than Goals and Assists

By Alex Olshansky (@tempofreesoccer)

Overview

The two most ubiquitous stats for attackers are goals and assists. And why shouldn’t they be? After all, goal differential explains ~85% of the variance in a league table. Creating goals is a very valuable skill. But how repeatable a skill is it? After scoring 11 goals in his first 19 EPL games for Newcastle United in 2010, Andy Carroll was sold to Liverpool for a staggering 35 million pounds. Carroll would score only six times in his next 44 EPL appearances before he was loaned out and subsequently sold to West Ham. I do not bring up Carroll because I think he was a poor acquisition for Liverpool; I bring him up because he exemplifies the variable nature of goalscoring. When it comes to goals and assists, what is the signal and what is the noise?

Key Passes are better than Assists

The original intent of this piece was to test the persistence, or repeatability, of key passes. For the uninitiated, key passes are passes that directly lead to an attempt on goal. There has been some legitimate criticism that key passes don’t take proper account of the quality of the chances being created, but for now it’s the metric we have. I looked at every player in the EPL who averaged over 0.7 key passes per 90 minutes in any season from 2009 to 2013. I then looked at the year-over-year relationship for key passes: how well do year 1 key passes predict year 2 key passes (n=184)? Quite well, it turns out. While not overwhelming, the relationship is evidence that key passes are a somewhat repeatable statistic.

[Figure: year 1 key passes vs. year 2 key passes]

Next, I took the same sample and looked at how well year 1 assists predicted year 2 assists.  There really isn’t a relationship.  Assists are basically random from year to year.

[Figure: year 1 assists vs. year 2 assists]

On a hunch, I looked at how well year 1 key passes predicted year 2 assists.

[Figure: year 1 key passes vs. year 2 assists]

Granted, this is not a great relationship either, but it is significant that key passes actually predict assists better than assists do. And, unlike assists, key passes have some degree of repeatability.
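The year-over-year setup behind these comparisons can be sketched roughly as follows. The record layout and function names are hypothetical, and I am assuming the 0.7 key passes per 90 cutoff is applied to the year 1 season, which the description above does not spell out. The resulting pairs can be fed to any correlation or regression routine.

```python
def per_90(total, minutes):
    """Scale a season total to a per-90-minutes rate."""
    return 90.0 * total / minutes


def year_over_year_pairs(seasons, year1_stat, year2_stat, min_kp90=0.7):
    """Build (year 1 rate, year 2 rate) pairs for players with consecutive seasons.

    seasons: list of dicts with keys "player", "year", "minutes",
             "key_passes", "assists" (a hypothetical layout).
    Assumption: the key-pass cutoff is applied to the year 1 season only.
    """
    by_key = {(s["player"], s["year"]): s for s in seasons}
    pairs = []
    for (player, year), s1 in by_key.items():
        s2 = by_key.get((player, year + 1))
        if s2 is None or s1["minutes"] <= 0 or s2["minutes"] <= 0:
            continue  # no following season, or no minutes to scale by
        if per_90(s1["key_passes"], s1["minutes"]) <= min_kp90:
            continue  # below the key-pass cutoff in year 1
        pairs.append((per_90(s1[year1_stat], s1["minutes"]),
                      per_90(s2[year2_stat], s2["minutes"])))
    return pairs


# Made-up records for two hypothetical players, purely to show the shape of the data.
seasons = [
    {"player": "A", "year": 2010, "minutes": 900, "key_passes": 20, "assists": 3},
    {"player": "A", "year": 2011, "minutes": 1800, "key_passes": 30, "assists": 4},
    {"player": "B", "year": 2010, "minutes": 900, "key_passes": 5, "assists": 1},
    {"player": "B", "year": 2011, "minutes": 900, "key_passes": 10, "assists": 2},
]
kp_to_kp = year_over_year_pairs(seasons, "key_passes", "key_passes")
kp_to_assists = year_over_year_pairs(seasons, "key_passes", "assists")
print(kp_to_kp)  # [(2.0, 1.5)]
```

Swapping the two stat names gives the other comparisons (assists to assists, key passes to assists) without changing the pairing logic.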

Shots are better than Goals

Earlier this year Ben Pugsley undertook a similar study, but he primarily looked at shooting statistics.  The statistic with the best predictive relationship?  Shots per 90.

[Figure: year 1 shots per 90 vs. year 2 shots per 90]

Ben also found no year-over-year relationship for assists (although his r^2 differed slightly from mine) or goals (below).

Ben was kind enough to run a regression comparing year 1 shots to year 2 goals, as I had done with key passes and assists.

[Figure: year 1 shots vs. year 2 goals]

As with key passes and assists, shots predict goals better than goals predict goals. Of course, an r^2 of 0.12 is hardly predictive, but at this point in soccer analytics, knowing what does not work is just as important as finding out what does.

Expected Goals Created Model

So if goals and assists don’t work, what might? Key passes and shots, taken at face value, are not nearly sophisticated enough. Luckily, much work has been done on shot location/type and expected goals (here and here and many other places). As far as I know, adjusting for shot location/type hasn’t been attempted yet for shots resulting from key passes, but that is a logical next step. Theoretically, an expected goal and expected assist model would be the best predictor.
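To make the idea concrete, here is a toy sketch of what such a model might compute. Everything in it is an assumption for illustration: the zone names, the conversion rates (placeholders, not fitted to any data), and my reading of "expected goals created" as the expected goals from a player's own shots plus the expected goals from shots set up by his key passes.

```python
# Placeholder conversion rates by shot zone. Illustrative values only, NOT fitted to data.
ZONE_XG = {"six_yard_box": 0.30, "penalty_area": 0.12, "outside_box": 0.03}


def expected_goals(shot_zones, zone_xg=ZONE_XG):
    """Sum the assumed conversion probability of each shot, given its zone."""
    return sum(zone_xg[zone] for zone in shot_zones)


def expected_goals_created(shots_taken, shots_assisted, zone_xg=ZONE_XG):
    """Expected goals from a player's own shots plus those from his key passes."""
    xg = expected_goals(shots_taken, zone_xg)      # the player's own attempts
    xa = expected_goals(shots_assisted, zone_xg)   # attempts set up by his key passes
    return xg + xa


# A hypothetical player: two shots of his own, three shots created for teammates.
own_shots = ["penalty_area", "outside_box"]
created_shots = ["six_yard_box", "penalty_area", "outside_box"]
print(round(expected_goals_created(own_shots, created_shots), 2))
```

A real model would estimate the conversion rates from shot data and would likely account for shot type as well as location.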

Goals and assists are the unpredictable results of a more repeatable underlying process.  By understanding and quantifying this process, we can move towards the signal and away from the noise.