CBs 2

The centre-back market is one of the hottest around this transfer window. Chelsea have been linked with every big name going, City are apparently quietly interested in similar players, and Liverpool are trying their damnedest not to mess up again after Klopp sent some dodgy Snapchats to Virgil van Dijk, or something. The question which I’m sure is on your lips is ‘how do you scout central defenders statistically?’

Well, you could ask Ted Knutson – but his methods are probably [redacted]. So I’ll take you through a method of my own. These are ideas that I have worked on and considered over time, so they’re not just plucked from the sky.

For centre-backs, the best thing to do is to whittle down the list by looking for stylistically similar players, and go from there. You can look at quality further down the line, probably with conventional scouting, as it’s very hard (perhaps close to impossible at present) to assess pure defensive quality in single players through the stats.

I have four basic defending things that I look at.

  • Activity: Literally how much they do per 90 minutes
  • Depth: How high up the pitch they do stuff, calculated by taking the percentage of their actions above a certain point and subtracting the percentage of their actions that they do below a certain other point. The resulting numbers don’t mean much as numbers, but the information they give is meaningful
  • Front foot: How ‘front foot’/proactive, versus how ‘back foot’/reactive a defender is. Tackles and interceptions versus ball recoveries and blocks.
  • Fail %: Missed tackles and fouls. Semantically, I prefer looking at the number of times a player fails than calling all tackles and interceptions ‘successes’. For further reading on how stats don’t quite tally with our common parlance, see here.
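As a rough sketch of how those four measures could be computed from season counts: the article does not give exact formulas, so everything below is an assumption. The field names, the choice of denominator for Fail %, and expressing Front Foot as a share of all actions are illustrative choices, and the numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class DefenderCounts:
    """Raw season counts for one centre-back (all field names are hypothetical)."""
    minutes: float
    tackles: int
    interceptions: int
    recoveries: int
    blocks: int
    missed_tackles: int
    fouls: int
    actions_high: int  # defensive actions above the chosen upper line
    actions_low: int   # defensive actions below the chosen lower line


def profile(c: DefenderCounts) -> dict:
    """Turn raw counts into the four stylistic measures (assumed formulas)."""
    front = c.tackles + c.interceptions   # proactive actions
    back = c.recoveries + c.blocks        # reactive actions
    total = front + back
    fails = c.missed_tackles + c.fouls
    return {
        # Activity: defensive actions per 90 minutes.
        "activity": total * 90 / c.minutes,
        # Depth: % of actions above one line minus % below another.
        "depth": 100 * (c.actions_high - c.actions_low) / total,
        # Front foot: share of actions that are tackles or interceptions.
        "front_foot": 100 * front / total,
        # Fail %: failures as a share of attempted front-foot actions (assumed).
        "fail_pct": 100 * fails / (front + fails),
    }


# Made-up counts, not real player data.
example = profile(DefenderCounts(
    minutes=2700, tackles=45, interceptions=75, recoveries=150, blocks=30,
    missed_tackles=15, fouls=25, actions_high=90, actions_low=60,
))
print(example)
```

As the Depth bullet says, the raw number means little on its own; it only becomes useful once it is compared across players.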

 

I’m sure you’re all desperate to get on with the scouting. For the purposes of this we’ll work on the assumption that Liverpool are looking around for alternative signings to Virgil van Dijk. Based on Dejan Lovren and Joel Matip’s stats for this season, Liverpool would ideally want someone who can play in a pretty high line, although with a fairly even balance between front foot and back foot. They do a fair amount of sweeping up, but still need to know how to squeeze the space between midfield and defensive lines.

As a starting point, it makes sense to look for similar players against the Activity, Depth, and Front Foot categories. Fail % is a messy area as it’s partly dependent on other stylistic matters, but too high a percentage can be used as a red flag.

Matip and Lovren profiled marginally differently in 2016/17, but Lovren – with some worse passing and higher Fail % (partly a result of style, partly an inherent trait of his) – is probably the one with a little more scope to be improved upon.

Depth and Front Foot numbers are probably marginally more meaningful for Liverpool than Activity, as whoever they buy will be left with the scraps that the midfield fails to mop up, whereas the other two are more directly related to how the defenders will play. Therefore, these two categories, and any particularly high Fail percentages, can be used to start whittling down names. A couple of untouchables like David Alaba (whose stats will be affected by his time at other positions anyway) and Samuel Umtiti can be struck off too.
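One way this kind of similarity search could be set up is sketched below. The article does not say which algorithm was used, so this is an assumption: each metric is standardised across the pool, Depth and Front Foot are weighted above Activity, players are ranked by weighted Euclidean distance to a reference profile, and anyone over a Fail % threshold is dropped. The weights, threshold and player values are invented for illustration.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardise a list of numbers so different metrics are comparable."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def most_similar(players, target, weights, max_fail_pct=None):
    """Rank players by weighted distance to `target` on standardised metrics.

    players: name -> {metric: value}; `target` must be one of the names.
    weights: metric -> weight; only these metrics are compared.
    max_fail_pct: optional red-flag threshold on the raw "fail_pct" value.
    """
    names = list(players)
    std = {n: {} for n in names}
    for metric in weights:
        column = zscores([players[n][metric] for n in names])
        for n, z in zip(names, column):
            std[n][metric] = z

    ranked = []
    for n in names:
        if n == target:
            continue
        if max_fail_pct is not None and players[n]["fail_pct"] > max_fail_pct:
            continue  # red flag: too many missed tackles and fouls
        dist = sum(
            w * (std[n][m] - std[target][m]) ** 2 for m, w in weights.items()
        ) ** 0.5
        ranked.append((n, dist))
    return sorted(ranked, key=lambda pair: pair[1])


# Made-up profiles, not real player data.
players = {
    "Reference": {"activity": 10.0, "depth": 12.0, "front_foot": 45.0, "fail_pct": 20.0},
    "A": {"activity": 9.0, "depth": 11.0, "front_foot": 46.0, "fail_pct": 18.0},
    "B": {"activity": 10.0, "depth": -5.0, "front_foot": 30.0, "fail_pct": 15.0},
    "C": {"activity": 10.5, "depth": 12.0, "front_foot": 45.0, "fail_pct": 40.0},
}
weights = {"activity": 0.5, "depth": 1.0, "front_foot": 1.0}  # assumed weighting
shortlist = most_similar(players, "Reference", weights, max_fail_pct=30.0)
for name, dist in shortlist:
    print(f"{name}: {dist:.2f}")
```

In this toy pool, player C matches the reference closely on style but is removed by the Fail % filter, which is the "red flag" use described above rather than a similarity measure in its own right.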

However, Liverpool’s stylistic requirements are so specific that – when scouting by the data – there aren’t many similar options. Players in the Big 4 European leagues who don’t have any strong stylistic flags against them, or who have one which is balanced out by a particularly strong similarity, are below.

[Image: liverpool data scouting]

There are several names on there who should maybe have joined the ‘untouchables’ list earlier (Godin, Azpilicueta, Rugani, Romagnoli, Sokratis). At a push there’s a slight chance that Liverpool might be able to tempt them over, but it’s pretty unlikely. Maybe. Marcos Rojo you can cross off for ‘Manchester United’ based reasons. Aythami Artiles and Andrea Masiello, while some of the more interesting options, are both 31, so can be disposed of.

Let’s alter the similarity parameters slightly to get a few more names. Passing, both in overall percentages and in terms of how much players advance the ball towards goal, is important for Liverpool and can come in. How Front Foot a prospective signing plays is probably more likely to translate to Liverpool than their Depth – a deep-lying player can be moved up the pitch, but an overly back foot player is harder to change – so Depth can go out.
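To illustrate what swapping the parameters does, here is a minimal sketch that ranks the same small pool twice: once with Depth included, and once with Depth out and two passing measures in. It assumes the values are already standardised, uses a plain unweighted Euclidean distance, and relies on hypothetical metric names and made-up numbers. It is not the method actually used for the lists in this article.

```python
def rank(pool, target, metrics):
    """Order the pool by Euclidean distance to `target` over the chosen metrics.

    Values are assumed to be z-scores already, so metrics are comparable.
    """
    ref = pool[target]

    def dist(name):
        return sum((pool[name][m] - ref[m]) ** 2 for m in metrics) ** 0.5

    return sorted((n for n in pool if n != target), key=dist)


# Made-up standardised profiles, not real player data.
pool = {
    "Reference": {"activity": 0.0, "depth": 1.0, "front_foot": 0.2,
                  "pass_pct": 0.8, "pass_progress": 0.6},
    "High-line": {"activity": 0.1, "depth": 0.9, "front_foot": 0.3,
                  "pass_pct": -0.5, "pass_progress": -0.4},
    "Deep-passer": {"activity": 0.2, "depth": -1.0, "front_foot": 0.1,
                    "pass_pct": 0.9, "pass_progress": 0.7},
}

# First pass: Activity, Depth and Front Foot.
first_pass = rank(pool, "Reference", ["activity", "depth", "front_foot"])
# Second pass: Depth goes out, passing comes in.
second_pass = rank(pool, "Reference",
                   ["activity", "front_foot", "pass_pct", "pass_progress"])

print(first_pass)
print(second_pass)
```

The deep-lying but front-footed passer sits last on the first pass and first on the second, which is the kind of reshuffle you would expect from dropping Depth.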

And whaddya know, a certain tall, saintly, South Coast-based Dutchman appears on this list, although the most similar players are still ones at larger clubs. This is slightly less fun for an article on data scouting, but makes sense stylistically – smaller clubs tend not to play in a similar way to Liverpool. Taking the old guys and the likely untouchables off the previous list and replacing them with a small handful of more likely candidates, we get the below:

[Image: liverpool data scouting 2]

David Lopez probably deserves an asterisk by his name given that he’s apparently a midfielder who played a bunch of centre-back minutes last season, and the midfield minutes may have skewed his stats a little to make him look more aggressive, so that’s something to bear in mind.

*scurries away to the film room*

Right, as this isn’t a full-scale scouting operation I can only watch snippets, and provide snippety feedback in return. So:

Ginter: *unconvinced, but not hateful, noise* (has, between my writing and editing this, joined Borussia Mönchengladbach)

Nacho: Stats probably skewed by his time at full-back, and the snippets I watched for this didn’t much match Liverpool’s style.

Elvedi: Can play in a back 4 or 3, anywhere across it, and is only 20. Definitely worth proper scouting.

Martinez: Profiled very similarly to Lovren, although in the clips he looked to prefer backing off instead of holding the line, which may not entirely fit at Liverpool. Looked decent on the ball. Did not look forceful in the air.

Dragovic: *noncommittal, shrugging noise* Scout further, be prepared to cross him off.

Vestergaard: Admitted to the list once Depth was disregarded. Generally plays a fair bit deeper than Lovren, and he looked out of his comfort zone when he was put in some one-on-one situations.

De Vrij: No big red flags from the clips, looks to understand space relatively well.

Lopez: As a sometimes midfielder, looked like he understood when to pinch the space in front of him, but looked to have some foundational defensive issues.

Murillo: Looks pretty quick, which is useful. Looks a bit of a mixed bag but probably just about young enough and good enough to be able to learn to be properly good with a good teacher.

Van Dijk: Have written enough about him elsewhere. See here, for example.

So I’d give thumbs up for further scouting to Elvedi, Ginter, Martinez, De Vrij, Murillo, Van Dijk, and Dragovic. Some of those thumbs are more enthusiastic than others.

From here you could look at these seven players in more depth statistically and physically, if you wanted to. You could even change the parameters with which you look for similar players, although the principle would remain the same.

The important part, the takeaway, is that through stats alone you’ve whittled every centre-back in the biggest four leagues in Europe down to around 60 or 70, then down again to the 20 or so names you’ve seen in the two lists above, down to 10 names who were more realistic prospects, and down to 7 after a cursory check of some video. And – once data is formatted and organised in the way you want it – that all takes less time than you’d spend watching one full match for a couple of separate players.

 

  • srinath iyer

    What these stats have done is pick centerbacks from the teams that play a similar style to Liverpool. That explains why you have defenders from Dortmund, Atletico Madrid etc., i.e. teams who play a high line. The analysis should aim to pick the defenders who would be able to transition to this style, not defenders already playing this style, since defenders playing in a high line are expected to face a lot of one-on-ones, to handle clearances from the opposition, to dribble past a pressing striker, and to intercept through balls during quick counters.

    • Mark Thompson

      I included this in an earlier draft but the explanation clogged up the flow of the article. You’re right, basically (although not all defenders on teams which play in a similar way will themselves play the centre-back role in a similar way), which is why I changed the parameters to take out Depth as a similarity factor and focus more on Front Foot and passing – which is how Van Dijk got added in for example. 

    • Daniel Staley

      It is a much more complex and nuanced approach to find that though. One that ultimately, I suggest, is made by a scout’s intuition in reading individuals’ decisions in the games observed. This analysis does highlight players who do not require a developmental transition, so it’s much less risky for a club like Liverpool who can’t afford to blood too many players.

      • Jeff Loftin

        Agreed. Finding data that suggests someone fits somewhere or into a scheme they have not played in is very difficult and would prove frustrating. This is where the marriage of data analysis and scouting thrives.

        • srinath iyer

          The role of data would be to create the shortlist. And on top of it, other intangible attributes like personality can only be judged by scouts. So yeah. As you say, it would be up to the management to decide based on data/scout report. I think a lot of other things like nationality, his looks!, his marketability are also taken into account.

  • BundesPremierLeague

    kinda surprised Willy Orbán didn’t crack this list. Superb stuff as always.

  • Lucas Góes

    Congratulations on the publication. Almost all the mentioned names have a positioning versatility in the first defensive line and also in the dominant foot. Some also played in the second defensive line, as a defensive midfielder. However, when analyzing the statistics and the specialties of each, there are enough disparities between them. Dejan Lovren, used as a reference, performs these high-line principles also on the Croatian national team. However, for example. In the World Cup of Brazil, he for many moments of the game, reacted to a game of positioning and interception. He lowered the depth and chose along with his teammates for an interception and interconnection game within his area. Andrea Masiello is 32 years old. I would like an analysis of Piquet (Barcelona). I see football collectively, respecting the article and desires of Liverpool, I believe that football is based on a more collective approach and linked to the principles proposed by the team. Barcelona has always had a very strong defense transition and Piquet has always played in high rows. I do not see him as fast and powerful but with great play and interception reading. Do you think he’s a big name for Liverpool? Pablo, defender of Corinthians-SP, who belongs to Bordeaux, managed after passing through France, here in Brazil to develop content that places him as the best defender in Brazil today. However, it alternates line height according to the position of your team and the position of the ball and the opponent. I am Scout and I develop my analysis in this way. I think that the Defensor’s cognitive abilities are going to make choices that are varied rather than automatic. And about defensive principles and physical abilities of the players (Defenders) we can spend days, months and years arguing about. Best regards.

  • NunyaBusiness

    Mark, did you use a similarity score algo for this, or something more like a series of pipe operators (like magrittr, maybe)?

    I’ve tried to do something similar to this, got somewhat similar results but with some pretty significant differences. One very big difference is you didn’t have Mats Hummels come up at all, while in my analysis he was literally the most and, by any reasonable standard, only similar player to Van Dijk. Just wondering what we may have done similarly, and differently.