(All data for the models is powered by Opta.)
As many of you know, I’ve been working on a statistical scouting model. The theory behind it is that by using a mixture of key performance indicators, you can unearth players that teams should be interested in watching and potentially signing, while making scouting more efficient, saving time and money in the process.
Now I think the attacking player scout is really quite good. Backtested over four seasons, it seems to turn out near superstar players about 70% of the time, and unfortunate duds at a 15% rate. This falls in line with the fact that we understand attacking output fairly well, so it’s easy to pick them up with stats.
But what about midfielders?
I can tell you for a fact, scouting midfielders via stats is much harder. Thus far it’s been hard enough that some of my more skeptical friends don’t think this type of scouting is possible. However, this is the type of feedback that I tend to use as encouragement to persevere, so that’s what I did. This model is only a prototype right now, but early results are fairly promising.
Here is the list of players the model flagged after the 2010 season. The midfield scout is set to pick up players up to a year older than the attacking scout, mostly because midfielders have a longer age curve than attackers.
At first glance this looks like a mixed bag, but I think the output is really promising. Regardless of what you think of Alex Song now, Alex Song in 2010 was one of the most promising young mids on the planet, and eventually moved to Barcelona. Also present are Ramsey, Lucas, Mikel, Mark Noble (underrated), and uh… Denilson (can’t win ‘em all). Three Leverkusen guys show up from Germany, one of whom is now considered one of the best all-around midfielders in the world, and another who is merely one of the best in Germany (Gonzalo Castro). Banega and Busquets are the two that show up from Spain, while you get Matuidi, Cigarini, and Marchisio out of France and Italy.
It’s an imperfect list, but it’s also a good start.
The other backtested years actually seem to improve on the overall results. In the big 5 leagues, a little over 50% of the names this model picks up become stars. About 65% are at least good or very good, while the outright failure rate is around 15%. Again, this is taking output of younger players and saying “we think these guys either already are or will become very good.” The failure rate on transfers as a whole is 55-60%, so a prototype that picks out stars near that rate, with a failure rate of 15%, is very strong.
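For anyone who wants to keep their own scorecard on lists like these, here is a minimal sketch of how rates like the ones above could be tallied. It is not the model itself, and the tier names (`star`, `good`, `average`, `failure`), the `backtest_rates` function, and the example labels are all hypothetical. Deciding who counts as a "star" is still a judgment call that the code does not make for you.

```python
from collections import Counter

# Hypothetical outcome tiers; the post does not define these formally.
TIERS = ("star", "good", "average", "failure")


def backtest_rates(outcomes):
    """Return the share of flagged players who ended up as stars,
    at least good, or outright failures."""
    if not outcomes:
        raise ValueError("no outcomes to summarise")
    counts = Counter(outcomes)
    unknown = set(counts) - set(TIERS)
    if unknown:
        raise ValueError(f"unknown tiers: {sorted(unknown)}")
    total = len(outcomes)
    return {
        "star": counts["star"] / total,
        # "At least good" is cumulative: stars count as good too.
        "at_least_good": (counts["star"] + counts["good"]) / total,
        "failure": counts["failure"] / total,
    }


# Made-up labels for 20 flagged players, purely illustrative
# (this is NOT the model's real output).
example = ["star"] * 10 + ["good"] * 3 + ["average"] * 4 + ["failure"] * 3
rates = backtest_rates(example)
for name, value in rates.items():
    print(f"{name}: {value:.0%}")
```

The only design point worth noting is that the "at least good" bucket includes the stars, which is why it can sit at 65% while the star rate alone is around 50%.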
Much like with the attacking player model, with the midfielders I am going to be open about the players the model picked up for the 2013-14 season. I definitely won’t have time to write about them, but if you are interested and want to do so, feel free. Making these public also provides a testable record to look back on over the next few seasons to see who developed into good players and who didn’t make it. I’m also including Eredivisie picks here with one caveat: I only have one season of data for the Eredivisie, so this isn’t backtested at all. My hope is that it still works with some minor tweaks, but without past seasons to go by, it is only a hope and not something backed up by data analysis (right now).
Anyway, I hope you guys are enjoying this series. You’ll see radars of these players gradually appear on my Twitter timeline over the coming weeks. I have to admit, despite watching a ton of football, I know nothing about the vast majority of them. If you have thoughts or comments, feel free to leave them here, or hit me up on Twitter.
If you work for a club and are interested in discussing the statistical scouting models I have been developing, feel free to send me an email at mixedknuts at gmail.