Football Analytics has a learning curve. That’s great, because learning is a fun, though occasionally painful, process. This summer I did a review of my past work, and there’s some cool stuff in there from the early days along with some really boneheaded mistakes. It doesn’t matter how smart you are – your work is not going to be perfect when it comes to something new. The trick is simply to get over it and do better next time.

Today, I wanted to talk a little more about what I learned regarding player evaluation while going from zero knowledge in 2013 to running worldwide recruitment for two clubs in 2015. As part of that, I’ll introduce the new attacker radars in print for the first time, and I’ll talk about three of the most famous players in the world: Neymar, Eden Hazard, and… Andros Townsend?!?

Learning Curves
One of the first things you do when looking at a new data set is immediately boil it down to the important stuff and focus on that:

What is correlated with [important stuff]?

What causes [important stuff] to happen?

In football, we care about goals. In fact, for some pundits, that’s all they care about. The only number that matters is the score.

Imagine a classroom of ten-year-olds talking through the data.

Alright children, today we are going to talk about football. Match of the Day and legendary England striker Alan Shearer said we care about goals more than anything else.

So the first thing we have to ask is, what causes goals?

“Shots, shots cause goals!”

Excellent, Timmy. You’re too young to remember, but Alan scored an awful lot of goals back in the day.

Now if we take a step back and say we care about “scoring”, which is actually a superset of goals, what else might we care about?

“Assists! Assists are passes that created a goal. They should count too.”

Great. Now we have goals and assists. And let’s find one more element to look at here – what do exciting players do a lot of when they attack?

“They uh… elbow people in the head?”

I know you like Diego Costa, David, but that wasn’t quite what I was going for.

“They dribble?”

Outstanding, Samantha. So let’s see if shots, assists, and dribbling are a great start to finding players who score more goals.

End Scene

It’s a bit forced, but this is literally what most people do when they start analysing football, which is great, because it’s an excellent, logical process. There’s one missing step in here going from assists to key passes, which is the functional equivalent of going from goals to shots, but that’s it.

Want to find interesting attackers? Look at shots, key passes, and successful dribbles. Do this and good players start to magically show up at your doorstep.
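As a rough illustration of that first-pass filter, here is a minimal Python sketch. The field names, the sample rows and every threshold are invented for the example; none of them come from the data behind this article, and a real cut-off would depend on league and position.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    # Hypothetical schema: raw season totals for one player.
    name: str
    minutes: int
    shots: int
    key_passes: int
    successful_dribbles: int


def per90(count: int, minutes: int) -> float:
    """Scale a raw season total to a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return 90.0 * count / minutes


def interesting_attackers(players, min_minutes=900, min_shots=2.5,
                          min_key_passes=1.5, min_dribbles=2.0):
    """Return names of players who clear every (hypothetical) per-90 threshold."""
    picks = []
    for p in players:
        if p.minutes < min_minutes:
            continue  # too small a sample to trust the rates
        if (per90(p.shots, p.minutes) >= min_shots
                and per90(p.key_passes, p.minutes) >= min_key_passes
                and per90(p.successful_dribbles, p.minutes) >= min_dribbles):
            picks.append(p.name)
    return picks


# Invented sample data, for illustration only.
sample = [
    PlayerSeason("Player A", 1800, 70, 40, 60),
    PlayerSeason("Player B", 1800, 20, 10, 15),
    PlayerSeason("Player C", 450, 30, 20, 20),
]
print(interesting_attackers(sample))
```

The minutes floor is there because per-90 rates from a handful of appearances are noisy enough to put the wrong names on the list.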

For instance, take the numbers for these two guys…

andros_v_neymar_crop

We’ve isolated what we care about in attackers, and these two young guys stick out like sore thumbs. They are similar ages, and even play for bigger clubs in good leagues, so there are no worries about league translation or anything like that. Indicators are that Andros might actually be a slightly better player than Neymar, but they are both very good for their age.

Plot them side by side on the original forward radars and you get this.

andy_and_neyney

Given our earlier conclusions about certain stats driving scoring outcomes, this raises the question…

townsend_frustration

Looking at this objectively, there might be a flaw in our process. These two players have a lot of similarities in driver stats, but the thing we actually care about – scoring – is massively different. Were either of the players lucky/unlucky in their output? Is it a teammate problem? A coach problem? You can think of a million different possible reasons why scoring might be different, but guessing is unacceptable.

So we now go back to the drawing board to find more clarity. There are lots of ways to do this, but one of the simplest, most effective ways of going about it stems from one of the most important lessons you learn as a data scientist.

Always plot your data.

Here we take locational data for shots and add it to the MK Shot Map format… and you get this.

andy_neymar_shotmaps

(Made with Opta data)

Oh.

Oh my. That’s…

I mean…

It’s as if someone put a force field around the danger zone shooting ring for Townsend, and he’s not allowed to have the ball in that area. Meanwhile, almost every shot Neymar takes is from prime real estate.

The reason for the potential problem we flagged earlier immediately becomes clear.

So using numbers and visualizations, we have gone through a three-step advancement in the player evaluation process.

Step 1: These are numbers we care about. Let’s look at those and see what happens.

Step 2: Visualizing them on the radar charts while normalizing them for the population shows that we might have a hole in our basic process. Was Townsend unlucky not to score from all those shots? How do we get more clarity on this?

Step 3: Visualizing the data on shot maps makes the problem crystal clear. Neymar takes great shots. Andros takes terrible shots. In fact, Neymar’s expectation of scoring on an average shot is more than five times greater than Townsend’s. This in turn has an absolutely massive impact on their probability of scoring a goal from any particular shot.
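To make the Step 3 comparison concrete, here is a small Python sketch of "chance of scoring on an average shot". It assumes each shot already carries an expected-goals (xG) value from some shot model; the two lists of values are invented for illustration and are not Neymar's or Townsend's real numbers. The second function additionally assumes the shots are independent, which is a simplification.

```python
import math


def xg_per_shot(shot_xgs):
    """Average chance of scoring per shot (mean of the per-shot xG values)."""
    if not shot_xgs:
        raise ValueError("need at least one shot")
    return sum(shot_xgs) / len(shot_xgs)


def prob_at_least_one_goal(shot_xgs):
    """Chance of scoring at least once, assuming the shots are independent."""
    return 1.0 - math.prod(1.0 - p for p in shot_xgs)


# Invented per-shot xG values, for illustration only.
box_shooter = [0.25, 0.15, 0.30, 0.10]
long_range_shooter = [0.03, 0.04, 0.05, 0.04]

ratio = xg_per_shot(box_shooter) / xg_per_shot(long_range_shooter)
print(f"xG per shot ratio: {ratio:.1f}")
print(f"P(at least one goal), box shooter: {prob_at_least_one_goal(box_shooter):.2f}")
print(f"P(at least one goal), long-range shooter: {prob_at_least_one_goal(long_range_shooter):.2f}")
```

Two players with identical shot counts can look the same on a shots-per-90 axis and still be far apart on either of these numbers, which is the gap the shot maps expose.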

Other Holes in the Process – The Eden Hazard Problem
Obviously with attackers we care about scoring, but what about players we know from watching have a huge impact on the game, but for whatever reason don’t show up very well in traditional scoring stats?

To put it another way, how do you find players like Eden Hazard? Hazard might have been the best attacker in the Premier League in 14-15, but his scoring stats weren’t close to overwhelming.

eden-hazard-2014-15

What can we do to tease out more data and find elite players who don’t always directly contribute to goals or assists?

For me, the answer was to take another step back in the process. We look at key passes and shots and they matter, but what about the ability to generate successful touches inside the box? And since football is fundamentally a passing game, what about players who are able to make successful passes into the penalty box, which might be one of the rarest skills in the game?

So I created two new metrics:

  • PINTO = Successful passes in TO the box
  • TINDA = Successful touches inside ‘DA box.
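
For readers who want to try counting these themselves, here is one possible Python sketch. The event schema (`type`, `successful`, `starts_in_box`, `ends_in_box`, `in_box`) is hypothetical, not a real data feed format, and treating a pass "into" the box as one that starts outside it and ends inside it is my assumption for the example. Rates are scaled per 90 minutes.

```python
def pinto_tinda_per90(events, minutes):
    """Return (PINTO per 90, TINDA per 90) for one player's event list."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    # PINTO: successful passes that start outside the box and end inside it.
    pinto = sum(
        1 for e in events
        if e["type"] == "pass" and e["successful"]
        and not e["starts_in_box"] and e["ends_in_box"]
    )
    # TINDA: successful touches that happen inside the box.
    tinda = sum(
        1 for e in events
        if e["type"] == "touch" and e["successful"] and e["in_box"]
    )
    scale = 90.0 / minutes
    return pinto * scale, tinda * scale


# Invented events for a 45-minute appearance, for illustration only.
events = [
    {"type": "pass", "successful": True, "starts_in_box": False, "ends_in_box": True},
    {"type": "pass", "successful": False, "starts_in_box": False, "ends_in_box": True},
    {"type": "pass", "successful": True, "starts_in_box": True, "ends_in_box": True},
    {"type": "touch", "successful": True, "in_box": True},
    {"type": "touch", "successful": True, "in_box": True},
    {"type": "touch", "successful": True, "in_box": False},
]
print(pinto_tinda_per90(events, minutes=45))
```

Using boolean in-box flags keeps the sketch independent of any particular pitch coordinate system; with raw coordinates you would derive those flags first.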

It turns out when you start to isolate players by this particular combination of skills, you get a useful additional perspective on players who contribute to scoring, both directly and indirectly.

pin_tin_scatter

Thus a new format of attacker radars was born.

eden-hazard-2014-15_predictive

I called the new template “predictive” because at this point in my head, I was thinking of the old template as “narrative.” The new template took a step back from narrative stats about what happened (goals, assists, goal conversion, etc), and started to use a few of the advanced, more predictive measures we’d developed since I created the early versions.

The new format more clearly illustrates what a monstrously talented creative player Eden Hazard was that season compared to the population of attackers.

(Note: OP stands for ‘Open Play’, which I get asked about constantly on Twitter.)

Finally, circling back to our initial comparison, this is what those Townsend and Neymar seasons look like on the new template.

andy_ney_ney_predcitive_radars

Conclusion
Learning how to use football data better is a process, but it’s a worthwhile and rewarding one. The new radar format came about from continually asking questions on how to analyse the data better. Can we iterate and improve on old metrics?

The old format was good as a starting point, but the new format shows player value much more clearly. It also contains years of work and improved understanding about how both the data and the game operate.

It’s also worth noting that even this “new” tech is 18 months old. If you are a club and interested in seeing some of the new stuff we’ve developed in the intervening months, drop me an email at mixedknuts@gmail.com.

The latest tech is both cool and extremely useful in helping your club make better decisions, both on the pitch and off.

–Ted Knutson
@mixedknuts

  • Thijs Leufkens

    More of these stories on the process and train of thought please!

  • borf

    I was always skeptical about how analysis could quantify less tangible player qualities, and to a degree I still am. However, data like this does push towards the idea that it will be possible in the future.

  • kidmugsy

    I’m looking forward to seeing how de Bruyne shapes up in these charts once he’s played enough games this season.

    Here’s another comparison you could consider: the same player under two different managers, e.g. Fellaini/van Gaal vs Fellaini/Mourinho, or de Bruyne/Pellegrini vs de Bruyne/Pep. Of course there have been more changes than just the manager, but still…

  • Ron IsNotMyRealName

    The problem is convincing teams that shots outside the box are only occasionally worth taking (as a change up, set piece, or if you are Luis Suarez and it doesn’t matter where you shoot from).

    Also, defense is still a big black hole for analytics, but this comes closer to getting a full picture of attack. Is it better to have players that fill the radar, or a team of players that specialize? Like Ozil may not fill the radar, but I can’t imagine there’s anyone better at creating chances in the box from a pass outside the box. And Sadio Mane is the opposite — very few non-strikers better at getting shots in the box.

    • kidmugsy

      One reason that a shot from outside the box may be worthwhile is that it can persuade defenders to charge out to try to charge down shots from distance. That will leave gaps next time. It’s only likely to be persuasive, though, if the shot is on, or very near, target. A shot into Row Z would encourage the defenders to hang back next time.

      • Tuiuan Veloso

        Another is that, analysing the situation of the attack, there may not be a better alternative in risk/return terms. A bad pass might generate a counter-attack or be very unlikely to succeed. If you are Özil and you have the option to try a throughball or shoot from distance, it’s better to try to pass; but if you are someone who can’t get a throughball to save your life, you may just move it or shoot anyway.

      • Ron IsNotMyRealName

        Good point, and I’ve wondered if this is even the right way to think about it.

        I think of this: how many games have you seen where the difference in the game was a shot that analytics people might say should not have been taken?

        I am reminded of the debate over sacrifice bunting in baseball. Analytics people that use expected value models for position of runners say the base is almost never worth giving up the out for. But when you look at what happens when managers actually call for the bunt, it turns out they’re actually not bad at deciding when to do that. The models weren’t situationally-specific enough.

        I wonder if the frequency with which an outside the box goal decides a result justifies their frequency of use.

        Also, really the box should probably be less wide (maybe about half a goal width beyond each post?) and extend out to about 25 yards. That’s the real shooting danger area.

        • lionkiller

          That area you define in your last sentence is pretty much exactly what Arsenal shot location maps look like! Woe betide any man who shoots from further/wider out (maybe Granit Xhaka will get fined for his screamer against Hull! Wenger is sure to coach it out of him soon enough though).

          FWIW I think that even if it could be proven beyond all reasonable doubt that these efforts are, in the long run, not worthwhile as measured by xG (and I imagine this is the conclusion Wenger has drawn from the work done by Stats DNA), one should think about the meta game.

          I totally agree with kidmugsy’s comment that it’s worth shooting from distance once in a while to get defenders to charge out, thus creating space. If they know that you NEVER shoot from a particular area they can take up their positions safe in the knowledge they are occupying the right areas.

          It’s dead easy playing poker against a player who never bluffs. He needs to bluff once in a while just to keep you “honest”.

          So it’s worth sacrificing a little short term negative xG to get a long term positive from those times a defender tries to charge down a “shot” and you can feint, play a little one-two, (or in Arsenal’s case a one-two-three-four-five) and gooooooooal !

    • fzr

      Chad Murphy actually wrote about this; it turns out that it might be a good idea to take shots outside the box, because it will give you better chances of getting good quality shots from inside the box. If you only focus on taking high quality shots in the box, the opponents will focus their effort on defending against this, which leads to you being unable to get those high quality shots you want (see Arsenal). When you mix shots in the box with shots outside it, the defending opponents will have a more difficult time making the correct decisions, thus giving more opportunities for getting the high quality shots.

      PS. The need for shots outside the box is a good example of game theory in football.

      • Ron IsNotMyRealName

        But what’s the right mix? (Depends on the players, obviously) How do you know what the mix is for your players? How much is too much, regardless of what players you have?

        Just saying “you should mix it up” is not an insight at all.

      • Victor Rodrigues

        I don’t get why people insist on using Arsenal as an example of the need for more outside shooting. The team was second best for shots on target last season, and last I saw (couldn’t find exact numbers for the full season) they were near the top on big chances created. Their approach seems to be perfectly fine, if only they had the finishing to take full advantage.

  • NinJa

    Interesting stuff. But surely, part of the reason why Townsend’s shot map is much worse than Neymar’s is the quality of their team-mates. If Townsend played for Barca, he would have many more opportunities for taking shots in the box.

    • Napsg

      Townsend just runs forward and shoots with his head down. His linkup play was nonexistent at Tottenham, which is why he never broke through

      • NinJa

        By Ted’s own metrics (above), Townsend has about 2 Key Passes per 90, same as Neymar.

        But I am not trying to play up Townsend here. The point is that the shot maps and other measures for individual players are saying as much about the team and style of play as they are about the individual. That is, the shot map is really for Townsend playing for Tottenham in 2013-14 under the coaching of Villas-Boas/Sherwood.

        • Tuiuan Veloso

          Context plays a role too, and that’s why it’s important not to immerse yourself just in the isolated player analysis. But Townsend was an anomaly even when you take that into account. A good way to see it is to look at his teammates’ shot locations, especially those who played in the same position.

        • Ron IsNotMyRealName

          Not all key passes are created equal either.

  • Tuiuan Veloso

    I like the radar with the xMetrics, and the two new stats, because they try to capture other ways a player participates in the process of generating good shots, and those two are very important. It’s kinda like a defender that doesn’t make a tackle or an interception but just closes the space or is in the right position to force an attacking player to look at other options instead of the zone/man he’s covering. A good pass that unlocks a good attacking situation, leading to an assist or key pass, may be just as vital as the assist or key pass itself.

    However, I’m Brazilian and I’m still thinking about how I could present PintoBox90 to my friends without them just laughing.

  • Om Arvind

    Very interesting article. Passes and touches in the box are great new metrics, but you have to wonder if they will favor possession-based teams.