What You REALLY Need to Know About Football Manager Recruitment

Managerial recruitment is possibly the most important thing a football club does on a bi-annual basis. Hiring a poor manager or a bad fit can set off a chain of events that could see a club plummeting through multiple relegations. Hiring a good manager can take an average team and catapult them into title challengers.

However, at the club level managerial hiring is also the activity that might have the single most chaotic, backwards process of anything in football.

Here’s an example a friend of mine relayed to me a couple of years ago:

We got down to the final list of three candidates and something struck me as incredibly, almost impossibly strange…

I looked at the Sporting Director and I asked him, ‘Has anyone actually watched these teams play football?’

*crickets chirping*

No one had. Verifying that what these guys were telling them about their teams, and about the style they preferred to play, was actually true somehow wasn't part of the process.

What’s fascinating is that nowadays you can get video on almost any professional football team in the world. You can even find video down through U18s at a lot of top clubs that’s readily available online. Given the preponderance of potential evidence weighing in either for or against a candidate, not watching their teams actually play football is a baffling choice.

But my point earlier is that recruitment of managers and head coaches is filled with one baffling decision after another. Way more so than modern player recruitment. This is despite the fact that firing a manager and his staff costs millions to tens of millions in compensation costs, and can have serious knock-on effects for the club as a whole.

Because of this, today I want to discuss what teams are getting when they hire a new manager and why that matters. What do you get when you hire a new manager?

  • The Person

This seems obvious, but it’s often overlooked in the same way that footballers’ personalities are overlooked or brushed aside. This is the person who sets the stage for every discussion you have within your club for the life of their contract. If they are closed off to new ideas, then that will have ripple effects for years. This is especially important to know for clubs that have undertaken a project to become more modern.

It sounds like a cliché, but man management matters. If you have a big squad to cope with things like European competition, you’ll have fringe players that are mostly there as cover in case someone gets injured. Some managers hate having big squads and that can cause huge issues if they aren’t also able to handle squad personalities well.

That said, sometimes a new manager is so good that you are willing to accept possible personality clashes in exchange for better performance. This is exactly the type of thing you really want to know ahead of time. What are the trade-offs we have to make when hiring this person and is it worth it?

Is your new coach a good teacher? A good communicator? A good leader? All of these things matter in general, but they become very important if you have a young squad, if your club coaches need to learn the new style so they can train academy players in it, or if you have a lot of big personalities in the squad that need managing.

The person you are hiring deals with other people constantly. You really want to know before you hire them how that is likely to impact your club as a whole.

  • Their Staff

This one can be strangely overlooked, but the staff that comes with a head coach can be very important in executing their style, and they can also fill certain roles your head coach as an individual might not be very good at (like man management, logistics, etc).

Here’s an example: some clubs allow new managers/head coaches to only bring two staff members with them when they join. The theory is that institutional knowledge inside the club is important and they want to maintain that, so instead of losing knowledge every time they have to change coaches, they limit potential change. This also forces the new coaching staff to communicate more with club staff as a whole, which generally should be seen as a good thing.

The problem here is that every coaching group is different. In some groups, the most important person after the head coach is the guy who does the fitness training. In others, you have explicit roles and subject matter expertise (the head coach, an attacking coach, a defensive coach, a set pieces coach, the GK coach, etc.). Breaking that up when you don’t have ready-made role players to fill the needs of the tactical style is a problem.

Finally, over the last few years we have heard of certain coaches waging war on their medical staffs. In some cases, having a trusted physio on staff is a hugely important dynamic because it means new managers get more comfort and clarity about player injuries and when they are likely to return to performance. In the high pressure world of football, a second trusted medical opinion for important players absolutely matters.

I generally agree with minimising change at clubs and with providing a good way for new managers to impart their knowledge to long-term club personnel. However, you need to be sensitive about breaking up a whole that is greater than the sum of its parts.

  • Their Style

This is perhaps the most important, obvious thing you get with football coaches but also the most misunderstood.

The common myth: Football coaches can change/learn new styles.

I covered this last year, explaining that how coaches learn is very different from how most of the population learns, and therefore it is difficult and time-consuming for them to take on new things. Expecting wholesale stylistic change is a near impossibility. Coaches can’t learn almost any of what they need to know about new tactical styles from books, so where is the information and execution coming from? Who is teaching it to them? What training sessions are they using to impart this knowledge to the players?

To be fair, some coaches are far more adaptable than others. Part of the education at Italy’s Coverciano is to learn to adapt your tactics and coaching to different requirements and not to be too married to one style. Italian coaches often seem more pragmatic and adaptable precisely because of how they are educated for their licenses. Most coaches can’t do that and expect any degree of success. On the other hand, plenty of people criticise Italian coaches for being too adaptable and too willing to change – the criticism cuts both ways.

So yeah, the vast majority of coaches are strongly married to whatever style they have displayed in the past and are unlikely to change much after you hire them, no matter how hard you wish it were otherwise.

With a particular style comes a whole host of other things including the most pressing and expensive concern: recruitment.

Tactical styles require players that fit the style. It’s easier if they also understand the style, but you need players with the right skill set for any chance of success.

The recruitment team needs to know the tactical style the coach wants to execute. Then they need to talk to the new manager and pull out specific role requirements for each position on the pitch and compare those needs to current playing staff. Once they do that, you can construct a recruitment plan based on the new requirements compared to squad weaknesses.

There is no generic football style. Different managers require different players. This is what makes changing managers regularly almost catastrophically expensive for clubs, especially if you let those managers also run recruitment.

Some tactical styles require pace and endurance at every position on the pitch. Size is a “nice to have,” but you will take a smaller, faster guy over a bigger one every time. Tony Pulis’s requirements are basically the opposite. Pace matters up front, but everywhere else you need beef.

Jurgen Klopp and Pep Guardiola’s teams both press. But their positional requirements for who they can recruit are almost surprisingly different.

Conclusion: Your recruitment team really needs to understand the style of play of your new manager in order to succeed.

Conclusion 2: If you do not recruit for your new manager’s style, you will fail. Especially if it differs significantly from the style that your team has played in the past.

The Case of Crystal Palace and Frank de Boer

In practical terms, the part about style circles immediately back to Crystal Palace’s situation this summer and the hiring of Frank de Boer.

As a manager, I think de Boer is actually pretty decent. He has proven he can coach a defense over the years, and though his attacking style is often regarded as boring, given the right players he has been successful. Palace hiring FDB was a bit of a risk, but also a decent shout if they wanted to raise the ceiling of the club. De Boer is perhaps not as good as hiring Marco Silva or Roger Schmidt might have been, but those guys went elsewhere, and Schmidt at least would have had even greater recruitment needs to succeed.

What de Boer also brings to the table is a very clear tactical style. His clubs are going to play a variation of Dutch/Ajax press and possess football. To pull this off, you need players who understand how to execute this style, especially in the center of the pitch. And unlike the German zonal style, individual players probably have higher style-IQ requirements because it’s largely a man-marking system.

Right, so clear tactical style. Very unlikely to adapt it. Data analysis also shows zero indication he’s capable of tactical variation. Because of all this, you need to provide him players that can succeed and PROBABLY quite a bit of time to teach the team his system.

Remember, Pochettino’s first season at Spurs was bad defensively. He needed a year to imprint the learning on the squad and also multiple transfer windows to get players capable of playing the system at a high level.

So to support their new style that comes packaged with their new coach, Palace bought Jairo Riedewald (age 20), and loaned Timothy Fosu-Mensah (19) and Ruben Loftus-Cheek (20). They also added Mamadou Sakho on deadline day.

This is a complete and utter failure to recruit for the stylistic needs of a new manager/head coach. They also brought in Dougie Freedman as Sporting Director this summer, which was nearly as interesting/baffling as their recruitment when it comes to joining up style/head coach/club needs.

This is why clubs need to clearly understand the impact their choices on head coach/manager have on their future. If Palace weren’t going to do a lot of recruitment this summer because they didn’t have the budget, they should have gone a different direction with their coaching hire. Save the money you’re going to have to pay from sacking FDB and his staff in the first half of the season and use it in the next transfer window.

Hiring FDB and not recruiting for him is basically lighting money on fire.

To be fair to Palace, the squad as a whole is decent. There’s just this huge problem in that it doesn’t fit de Boer’s tactical needs at all. This is something every single football club needs to be aware of when making new managerial hires.

Later this week, I’ll discuss an improved process we've developed for coaching hires that delivers a better chance of immediate success and a brighter future.

Revisiting Radars

As you may have seen, Luke Bornn set Twitter on fire yesterday (to the tune of nearly 500 RTs) re-posting something that Sam Ventura mentioned previously on why radar charts are bad.

Obviously, a lot of eyes turned toward me, since it is probably my fault they exist at all in soccer/football, and possibly my fault they have crept into other sports. Daryl Morey then managed to do a drive-by on my career so far, posting this tweet

Which I don’t think was calling my entire analytics career into question, but could be interpreted as such. THANKS DARYL. I am pleased to note that at least I don't use pie charts or 2 Y axes.

Anyway, none of this is personal to me and please don’t assume I took it as such. I do have incredible respect for Luke, Daryl, and Sam though, so I thought this topic was actually worth revisiting. In addition to hot takes, that thread under Luke’s tweet generated a lot of great discussion. The fact that lots of people have reactions to this type of work is a good thing, not a bad one.

Anyway… many smart, analytically savvy people hate radars mostly for the reasons explained in that thread. They can be misleading. Ordering of variables matters. There are more precise, accurate ways to convey the data.

The thing is, I knew all of this before I started down this road. My stuff used to just feature tables of numbers. Then I spent the better part of six months doing a deep dive into data vis before I ever spat out a silly radar. And yet, some might say despite my education, I still did it. Why? It's obviously the result of a choice, not of ignorance.

Before I’m tried and hanged for data visualization crimes against humanity, I’d at least like a chance to mount my defence. Often when someone allegedly smart (that’s me) continues to do something somewhat controversial in the face of some serious criticism, there are things we can learn.

Learn to Communicate

I have been to a lot of analytics conferences at this point, and the biggest point of emphasis on the sports side is always that communication is key. You need to understand your audience (usually coaches, sometimes executives), and take steps to deliver your analysis in a form and language that they can accept. Rephrase that a bit, and you end up with:

Audience. Dictates. Delivery.

In order to succeed, you need to take account of the audience you are pitching to and give them something they can understand. Even better, give them something they want to understand. (It helps if it's pretty.)

In soccer/football circa 2014, the fanbase had no real statistical knowledge. The media was just glomming on to the idea that maybe stripping out penalties from goalscoring stats made sense, assists might be vaguely interesting, and the concept of rate stats wasn't completely insane. I’m not being glib here, this was how it was. “xG” (or Expected Goals) was seriously weird and controversial and people seemed to think, presumably via the result of someone else's misguided analysis, that possession had something to do with the probable final score.

In situations like this, visuals go a long way toward opening the conversation. If you show a table of numbers to a coach who isn’t already on board, you’re dead. Bar charts? Only mostly dead. Radars? Interesting... Tell me more.

The same was true of the general public. Radars grabbed people in a way almost nothing else did. I think part of that is related to the fact that various soccer/football video games had used spider charts for a long time already, so they were somewhat familiar. Math = bad. Familiar = less scary = good.

Right, we have a vis style that grabs attention - can I fix the flaws?

Rewinding, when faced with a cool visualisation framework that would allow us to talk about player stats in an accessible way – something ALMOST NO ONE WAS DOING IN SOCCER at the time – I set about seeing if I could correct radars for some flaws. Major flaws with radars:

  • Order of variables matters
  • Area vs length issue means potential misinterpretation
  • Axes represent different independent scales
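The first two flaws are easy to demonstrate with a bit of arithmetic. The sketch below is purely illustrative and assumes equally spaced axes with values already scaled to 0-1; the function name and the two example players are made up. The filled area of a radar is a sum of triangles between neighbouring spokes, so the same six numbers can produce very different areas depending on which stats sit next to each other.

```python
import math

def radar_area(values):
    """Area of the polygon traced on a radar with equally spaced axes.

    Neighbouring spokes i and i+1 enclose a triangle of area
    0.5 * r_i * r_(i+1) * sin(2*pi/n), so the total depends on
    which values end up adjacent to each other.
    """
    n = len(values)
    wedge = math.sin(2 * math.pi / n)
    return 0.5 * wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))

# Hypothetical example: identical stat values, two different axis orders.
grouped = [0.9, 0.9, 0.9, 0.1, 0.1, 0.1]      # high values placed side by side
alternating = [0.9, 0.1, 0.9, 0.1, 0.9, 0.1]  # high and low values interleaved

area_grouped = radar_area(grouped)
area_alternating = radar_area(alternating)
print(f"grouped: {area_grouped:.3f}, alternating: {area_alternating:.3f}")
```

With these made-up values the grouped ordering fills more than three times the area of the alternating one, even though the underlying numbers are identical. That is the reader's eye being pulled by area when only the lengths along each axis carry information.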

So what did I do?

  • Added the 95th/5th percentile cutoffs to normalize for population. Suddenly axes weren’t really on independent scales, even if it seemed like they were
  • Broke the stats we care about for different positions into their own templates
  • Clustered similar element stats together. Shooting over here. Passing over here. Defensive over here, etc.
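As a rough illustration of the percentile cutoffs, here is a minimal sketch of how a raw stat might be mapped onto a radar axis. It is an assumption about the general idea rather than the exact production method: the function names, the linear interpolation rule, and the sample population of dribbles per 90 are all hypothetical.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in 0..100) of a pre-sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def radar_scale(value, population, low_q=5, high_q=95):
    """Map a raw stat to 0..1 on an axis bounded by population percentiles.

    The axis starts at the 5th percentile and ends at the 95th, and
    anything outside that band is clipped to the edge of the chart.
    """
    ordered = sorted(population)
    low = percentile(ordered, low_q)
    high = percentile(ordered, high_q)
    if high == low:
        return 0.5  # degenerate population: park the point mid-axis
    clipped = max(low, min(high, value))
    return (clipped - low) / (high - low)

# Hypothetical population: dribbles per 90 for 41 players, 0.0 up to 4.0.
dribbles_per_90 = [i / 10 for i in range(41)]

mid_player = radar_scale(2.0, dribbles_per_90)
outlier = radar_scale(10.0, dribbles_per_90)
print(mid_player, outlier)
```

Because every axis is bounded by the same population percentiles, the distance along each spoke means roughly the same thing, while the label printed on the chart can still show the raw output number.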

One thing I was also clear about up front was that I wanted to include actual output numbers, not just percentiles. This was another choice about audience impact. Sports quants mostly care about percentiles. Normal fans barely cared at all about numbers, so percentiles would be even more abstract. Plus no one had ever done percentile work for most of the stats in football. What is a high number of dribbles per game? No one knows. Putting percentile info made even less sense then, because we were just starting to have conversations about basic stats. Going back to my youth collecting baseball cards, I wanted people to be able to talk and argue about Messi vs Ronaldo from a stats perspective, and the only way to make that happen was to have some actual numbers on the vis. I don’t even know if this was successful, but it was a design impetus that was constantly in my head.

Impact vs Accuracy

Most of the people ranting about radar charts on Twitter yesterday are pretty hardcore quants. To many of them, sacrificing precision for anything is strictly verboten. The problem with this perspective for me was: radars aren’t for you. Hell, radars aren’t even for me. I work in the database, and my conclusions are largely drawn from that perspective. The minor inaccuracy issues of radars don’t affect my work.

BUT I wanted to talk to a resistant public about soccer stats, and this enabled discussion. I needed to talk to coaches about skill sets and recruitment, and this was a vital way of bringing statistics into that discussion while comparing potential recruits to their own players. As I designed them, radars exist to help you open the door with statistical novices, and from that perspective they have been wildly successful.

Even in 2017, football/soccer doesn't have the volume of knowledgeable fans that basketball and baseball have in the U.S. We also don't have coaches who are comfortable with almost any statistical discourse, although that has definitely been changing over the last year.

Actual, practical feedback

So a funny thing happened on the way to the boardroom: In football, radars became accepted as a default visualization type. I’ve visited a number of clubs who just incorporated the work as part of a basic suite of soccer vis, only occasionally to my chagrin.

“My coaches love these. They want us to do physical stats in this form because they feel like they are easy to understand.”

“This is cool. I like the way the shapes become recognizable as you use them more, and clearly indicate different types of player.”

At Brentford, we took two non-stats guys, taught them the basics of interpretation, and churned through over 1000 potential recruits in a year. Football isn’t like American sports. Players can come from a ridiculous number of vectors, and radars were the best, most easily understandable unit of analysis I could find. Combine no money, huge squad needs, and limited recruitment personnel, and the only way we could hope to succeed was via efficiency and volume.

They were not the end of the analysis. In fact, for recruits we liked, they only comprised a tiny portion of the evaluation cycle. From a volume perspective though, radars were the most used form of evaluation in the process.

On StatsBomb IQ, our analytics platform, even non-recruitment people seem to be taking a deep dive in a way they never have before. One person researched nearly 1000 players and teams over the course of the first two weeks, just because they liked learning about the game and the stats in this way. In my own opinion, having researched many alternatives, I feel like radars are the fastest way to get a handle on what skill set a player may or may not have, and include some basic statistical context.

You can’t prevent misuse of statistics

This tweet from Luke yesterday made me laugh

Two words: competitive advantage. If you’re going to take the research for free and apply it while failing to understand how to actually use it, you deserve what you get. I’m 98% certain this club never talked to me, or else I would have forcefully steered them away from that type of analysis.

And the problem is, you can’t prevent people from doing bad analysis on any type of stats. Single numbers? Mostly useless. Thinking the wrong stats are important? Happens all the time, even from smart, highly educated individuals. Bad interpretations of basic visualizations? Check newspapers almost every day. Bad/useless visualization? So many, it’s a surprise we don’t all walk around with our eyes bleeding.

Look, everyone makes mistakes in their jobs. We try to be objective, but everyone in stats and analytics also makes mistakes. Daryl Morey drafted Joey Dorsey, even though understanding age curves and competition cohorts is a pretty basic concept. Soccer stats once thought possession% was important. Pep Guardiola apparently thought he could win a Premier League with a bunch of fullbacks over 30. I said nice things about Luke Bornn. It happens to the best of us.

The fact of the matter is, unless you are there talking to the users every day, you can’t prevent people from taking your work and potentially using it poorly. It doesn’t matter if that work is in tables or numbers, or bar charts, or radars, or fans, or code, or in a shot map, or whatever. The interpretation, application, and execution of analysis will remain more important than simply having the information from now until the end of time.

What next?

The bashing of radars is almost a yearly event at this point, and like I said at the start, I concede that there are flaws in the vis style that even my adjustments haven’t completely overcome. With that in mind – and despite the fact that customers actually seem very happy with our current style of vis – we will probably add an alternate form of player data vis to StatsBomb IQ by the end of the year. I’m not sure exactly what it will be, but as football clubs move from novice to intermediate to advanced statistical analysis, precision will become more important and I want us to stay ahead of that curve.

In the meantime, I hope my defense of radars above has at least explained why I made this horrible, unforgivable visualisation choice in 2014 and continued to stick with it over the years. Communication and opening doors to talk about stats in football with coaches, analysts, and owners remains the most important hurdle we have to overcome. Radars start a conversation. They get a reaction. And for whatever reason, football people are often more comfortable talking about and digesting them than almost any other vis type I have encountered.

Maybe that will change in the future. Until then radars remain a pretty damned good visualisation for displaying most of the different elements in player skill sets*, which is the most important conversation topic we have in player recruitment.

Ted Knutson

ted@statsbomb.com

@mixedknuts

*In my opinion at least, and provided you correct for their flaws and educate your users.

StatsBomb Transfer Stories - Outliers Are Everything

In statistics, you rarely care about the outliers. If the data set is big enough, these are naturally occurring, but generally we want information about trending in the population as a whole. Outliers are something to be discarded.

In sports, outliers are everything.

In summer 2015, I was lucky enough to head up recruitment for Brentford football club in West London. We had to rebuild everything that McParland and Warburton took with them, and we had to do it from scratch, which meant scouting, market knowledge, player fit, etc. It was a monumental task, but we ended up with a really good recruitment team of Ricardo Larrandart, Nikos Overheul, Mark Andrews, and Robert Rowan, and a couple of part-time scouts including tactical superstar Rene Maric.

From the point we knew Warbs was leaving until the close of the summer transfer window was one of the craziest and most exciting times of my life. We were both researching and applying statistical football theory to the transfer market on the fly. How well would players from various leagues translate to the English Championship? What was the lowest price we could pay for players and still get them? Could we rebuild an ageing squad into something that could potentially challenge for a promotion place again while playing an attractive, positive style?

This is one of the stories from that summer...

We knew we definitely weren’t getting Alex Pritchard back on loan. After finishing in the Championship Team of the Season in 14-15, Spurs wanted to keep him in training camp and then likely loan him out another rung up the ladder. There was the briefest chance we could get Dele Alli, but that quickly dissipated as he wowed Poch in training. This left a big hole for us in the 8/10 position.

Our first choice was to get Arsenal’s Jon Toral back on loan. Toral was tremendous in limited minutes for Brentford in the playoff season, and his profile was unlike anyone else we could get in our price range.
I sat next to him and talked him through what I saw from the numbers and what his age corollaries were in the data set. He seemed smart and interested. Unfortunately, somehow [former head coach] Marinus dragged his feet on whether Toral was the right fit. He was slow to make up his mind or get in touch with the player. Jon apparently was guaranteed starter minutes at Birmingham, and POOF! What seemed like a great fit flew right out the window, leaving us without a first-choice AMC.

Owner Matthew Benham had negotiated to bring in Andy Gogia from Bundesliga 3’s Hallescher on a free in the spring. He could fill the role, but a bit like Alan Judge, we thought he would be better as a creative passer and dribbler out wide. (We also had Judge as a potential 8 because his defensive numbers were so good, but that never quite worked out.) We could not get Pascal Gross or Ziyech, and no one else was super exciting.

Faced with a ticking clock and a very low budget that we would prefer to spend elsewhere, I put this Austrian guy no one had ever heard of back into the scouting queue. The data suggested he was a solid attacking midfielder who could dribble and had the great ability to create shots for teammates. He also had reasonable tackling stats for a guy who primarily attacked, and scouting agreed that he was decent at pressing.

Now this was clearly a risk. At no time did we ever think, “Yes, this guy will be great in the Championship.” Instead we thought, “For the right price and in the right role, he certainly shows enough potential to be a solid performer in England.”

Everything in transfers comes down to money. Are you paying the right price for the talent and the risk involved? In Brentford's budget, half a million pounds is a big deal, and a difference of £500K in valuation will kill a deal. In a Premier League budget, half a million pounds is chump change, and you'd be an idiot for missing out on a player for that small an amount.
[Figure: Konstantin Kerschbaumer - Austrian Bundesliga - 2014-2015]

The numbers lined up and scouting was positive, so we needed to get in touch with his club and his agent to find out if we could afford him. That’s where the Chris Palmer story came from. An eventual deal was sealed for low six-figures, and we had ourselves a low-cost wildcard of a 10 with potential upside. Even if Kersch was a bust, he was still probably cheaper than anyone we could have signed from League One, and for a club like Brentford, that mattered.

The Real World

Kerschbaumer showed up at training camp in amazing shape, and tested for the highest VO2 max in the group. Dude could run for days. It was all very exciting back then. Unfortunately, things in football go weird sometimes. Brentford went through three head coaches that season and by the end of it no one really knew he was supposed to play 10 except the recruitment guys. He basically never played at AMC until the dead end of the season in 15-16.

Brentford had a horrible winter run, and things looked very grim. The club announced the closing of the academy and also the Football Analytics Team – my group – was made redundant as part of cost-cutting efforts. We had already finished most of the recruitment workload for the 16-17 season, and the perception was that the squad we had recruited was struggling mightily. Now the truth was that we intentionally built a youngish squad with the blessing of the owner because that is what we could afford, and also so that they could potentially grow and improve together. As long as your recruitment is good, this is a good plan.

Then a funny thing happened. Brentford had an amazing run-in. From April 2nd at Nottingham Forest until the close of the season, they only lost one match, against eventual promoted side Hull. They also won six and drew two, most of which was without player of the season Alan Judge, who broke his leg in a nasty tackle at Ipswich.
Scott Hogan finally came back from two different ACL injuries to be the hottest scorer in the league. Yoann Barbet started regularly with Harlee Dean in central defense, displaying an impressive passing range from his left boot, and a team that could not win a match from Christmas through February suddenly could not lose.

Brentford finished 9th. Without the poor start from the Dijkhuizen era, they might have been right back in the playoff mix. Additionally, they did it with a massive surplus of transfer fees. Worst case scenario, performance suffered a little but the club was now making big money in the transfer market.

Lost in this was Kerschbaumer’s performance. He subbed on when Judge broke his leg at Ipswich and set up Sam Saunders for the first Brentford goal. He also created an early goal for Hogan against Fulham, and two more in the final match of the season at Huddersfield.

Then the summer came and seemingly Brentford once again forgot about Kerschbaumer. This wasn’t unfair – Brentford had a lot of competition for the midfielder roles, and Romaine Sawyers, Ryan Woods, Nico Yennaris, and Josh McEachran shared the bulk of the minutes. Injuries bit throughout the season though, and Kerschbaumer finally started to see more playing time, once again in the spring. Since March 18th, Brentford have lost once, drawn twice, and won five times. And once again, KK is out there racking up assists.

Why the long story about a bit player in a small Championship team?

The answer is because Konstantin Kerschbaumer is a major outlier. Combine his minutes across two seasons and you get the following: 2320 minutes, 1 goal, 12 assists. That’s an assist rate of about .47 per 90, which is in the top 3% of footballers. Kersch also doesn’t take set pieces, meaning nearly all of his assists come from open play.
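For anyone who wants to check the arithmetic, a per-90 rate is just the raw count scaled to a full match's worth of minutes. The helper below is a minimal sketch with a made-up function name, using the minutes and assists quoted above.

```python
def per_90(count, minutes):
    """Scale a raw count (goals, assists, dribbles...) to a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return count * 90 / minutes

# Figures quoted in the text: 12 assists in 2320 minutes.
assist_rate = per_90(12, 2320)
print(round(assist_rate, 2))
```

Rate stats like this are what make a part-time player comparable to a regular starter, though the smaller the minutes total, the noisier the rate.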
To give you an idea of how unusual this is, in the last four seasons in the Championship nine players have posted 12 assists or more, all with more minutes and nearly all of them taking set pieces.

Assists are really valuable – I view them basically the same as goals. Fans still have a very different perspective if a player scores half a goal a game than if he creates half an assist a game, but there's a decent case to say they shouldn't.

The bulk of Kerschbaumer’s minutes also came during that first year, many of which were not in his natural position. That's a tough situation to succeed in, but his numbers in this one particularly valuable area continue to be crazy.

Is Kerschbaumer a success? I have no idea. It would be hard for Brentford to lose money on his transfer should he leave the club, so if that's how you grade success, I guess it's a check mark. He's also produced exactly what I thought he could when we recruited him. But... there are questions about whether he does enough on the pitch when he plays, and I can certainly see why those exist. I think he's still learning, and I hope he ends up with starter minutes next season, preferably in a system that plays him in his natural AMC spot. Like most data scientists, I want more data and preferably a lot of it.

Part of me roots for the players we recruited like they are my children. I want them to succeed no matter what. There's also a part of me that is scientifically evaluating their successes and failures to see what worked, and what I need to do better the next time I have a chance to dabble in the transfer market.

Anyway, the combination of Kersch's crazy assist rate in the run-in and Fabregas's continued creative skills for Chelsea made me think back to four years ago, when I first started writing about player stats. So much has changed in my approach, but remarkably, so much is still similar. I think a lot of the early ideas I latched on to as mattering ended up being very valuable.
That said, I have made plenty of mistakes along the way, both inside and outside of football. Making mistakes - and learning from them - is most of the fun.

Ted Knutson

@mixedknuts

ted@statsbombservices.com

*Thanks again to Matthew Benham for the chance to do all of this while learning on the fly. Looking at the quality in the squad right now, I think we did pretty well.

Understanding Football Radars For Mugs and Muggles

What is a radar?

It’s a way of visualizing a large number of stats at one time. In our case, the radars specifically deal with player stats. Some people also call them spider charts or graphs because they can look like they make a spider web.

Why bother creating them? What’s wrong with tables? Or bar charts?

Hrm, let’s deal with the last questions first. There is nothing wrong with tables of numbers. My brain loves them, and so do many others.

However, you have to admit that tables of numbers are a little boring. Bar charts are better, but they kind of fall apart when trying to compare many attributes at the same time. Radars allow exactly that.

Why bother creating them? That one is complicated. Why bother making infographics or doing data visualization at all? The answer is probably at least a book long, but the quick response is because people like to look at stats presented in this way far more than they like to look at a set of numbers. Radars invite you to engage with them. They create shapes that brains want to process. People have real reactions, and once you get used to what they display and how they display it, you can interpret them much faster than if you had to do the exact same analysis with a table of numbers.

Many of the shapes created correspond to “types” of players, at least when it comes to statistical output. Pacey, dribbling winger. Deep-lying playmaker. Shot monster center forward. Starfish of futility.

There’s a lot more methodology chat in the various articles I have written on StatsBomb, but I need to explain one very quick thing before I move on to player type shapes and examples.

Radar boundaries represent the top 5% and bottom 5% of all statistical production by players in that position across 5 leagues (EPL, Bundesliga, La Liga, Serie A, and Ligue 1) and 5 seasons of data.  In stat-y terms, the cut-offs are at two standard deviations of statistical production.

In non-stat-y terms, Lionel Messi made EVERYONE look terrible. I know, that doesn’t sound that bad because it’s true, but trust me, the newer way the templates are constructed is better.

Messi_2013_vs-JoeAverage1

The design for these was taken from Ramimo's 2013 NBA All-Star poster. I thought it would be really interesting to apply this to football, and then through testing, became irritated by what Messi made everyone else look like if I just used pure stats output. That's when I added the standard deviations idea, and started playing with different positional templates.
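As a rough illustration of the boundary idea described above, here is a minimal Python sketch. It uses a nearest-rank percentile on a made-up list of values to set the 5% and 95% cut-offs, then clips each player's value to those cut-offs so that one extreme outlier cannot flatten everyone else. The function names (`percentile`, `radar_bounds`, `spoke_position`) and the sample numbers are hypothetical, and this is not the actual template code.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a list of numbers (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


def radar_bounds(values, low_pct=5, high_pct=95):
    """Return the (inner, outer) boundaries for one spoke of the radar."""
    return percentile(values, low_pct), percentile(values, high_pct)


def spoke_position(value, low, high):
    """Scale a stat to 0..1 along its spoke, clipping anything past the boundaries."""
    if high == low:
        return 0.5
    fraction = (value - low) / (high - low)
    return min(1.0, max(0.0, fraction))


# Made-up shots per 90 for a population of 20 forwards (0.5, 0.7, ... 4.3).
population = [round(0.5 + 0.2 * i, 1) for i in range(20)]
low, high = radar_bounds(population)

# An extreme outlier simply sits on the outer edge instead of rescaling everyone.
print(spoke_position(6.2, low, high))
print(spoke_position(2.3, low, high))
```

The design choice here is the clipping: without it, the axis would stretch to the single best value and an average player would look tiny by comparison.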

QUICK NOTES:

  • The only thing these represent is statistical output.
  • If you put players in different systems, it may change their output.
  • If you put them in different positions, it almost certainly WILL change their output.
  • Age will also change statistical output.
  • In short, these are a tool to help evaluate players. Like any tool, they have strengths and weaknesses. In general, I have found it much easier to evaluate players WITH this information than without it.

 

Explaining Bits and Bobs

per_90

This means that all the non-percentage stats in this are normalized for 90 minutes played. The reason you do this is to correct for the fact that some players don’t always play 90 minutes. Players that frequently get subbed on or off will inherently look worse in per-game stats than in stats per 90 minutes played.
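For anyone who wants to see the arithmetic, here is a small Python sketch of the per 90 normalization. The first example uses the 2320 minutes and 12 assists quoted for Kerschbaumer earlier on this page; the substitute in the second example is made up, and the function name `per_90` is just an illustrative choice.

```python
def per_90(total, minutes):
    """Normalize a counting stat to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / (minutes / 90)


# Kerschbaumer across two seasons: 12 assists in 2320 minutes.
kk_assists_per_90 = per_90(12, 2320)
print(round(kk_assists_per_90, 2))  # 0.47

# A hypothetical substitute: 2 goals in 10 appearances totalling 300 minutes.
sub_goals_per_game = 2 / 10
sub_goals_per_90 = per_90(2, 300)
print(round(sub_goals_per_game, 2), round(sub_goals_per_90, 2))
```

The substitute looks like a 0.2 goals per game player, but his rate per 90 minutes is three times that, which is the distortion the normalization is meant to remove.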

Age

This is the age the player would be at the end of the season. We will change this soon to season age + birthday.

Non-penalty-goals

Why use non-penalty goals? Because penalties are converted at a 75-78% rate almost regardless of who takes them. They are a different skill to scoring goals that are not penalties (some teams have even had goalkeepers as their lead penalty takers), and so we strip them out of the scoring numbers.

DRAWING penalties is a great skill (and will be added to assist stats over time). Converting penalties is a very common one.

Shooting%

How many shots were on target out of ALL shots that a player has taken. This includes those that were blocked.

Key Passes

Passes that set up a teammate to take a shot. These are highly correlated with assists, which are passes to teammates who score a goal quickly after. (Note: This is the same stat as Chances Created. Somewhere along the way Opta made Key Passes only mean passes that lead to shots that are NOT goals and CC is all. Which is weird.)

Through Balls

Opta definition: a pass splitting the defence for a team-mate to run on to. Why do we care? These are generally considered the single type of pass most likely to lead to a goal.

Scoring Contribution

Combined non-penalty goals and assists per 90 minutes.
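As a quick sketch of how this combines with the non-penalty goals and per 90 definitions above, the Python below strips penalties out of the goal total and then normalizes goals plus assists per 90. The season line is made up for illustration and the function name `scoring_contribution` is hypothetical.

```python
def scoring_contribution(goals, penalty_goals, assists, minutes):
    """Non-penalty goals plus assists, per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    non_penalty_goals = goals - penalty_goals
    nineties = minutes / 90
    return (non_penalty_goals + assists) / nineties


# Hypothetical season: 15 goals (3 of them penalties), 6 assists, 2700 minutes.
example = scoring_contribution(goals=15, penalty_goals=3, assists=6, minutes=2700)
print(round(example, 2))  # 0.6
```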

PAdj

PAdj stands for “possession adjusted” stats. The reason why we do this is because it normalizes defensive stats for opportunity. Think about it this way: If your teammates always have the ball, then you can’t make any defensive actions, and you would look worse in this statistic compared to a Tony Pulis-style team that sits deep and constantly defends.

When adjusted for possession, tackles and interception output becomes moderately correlated with shots conceded and goals against, as opposed to having no correlation without the adjustment. In short, it’s an imperfect adjustment, but much better than not having the adjustment at all.

bottom_left_table

In the bottom left of every radar is the actual statistical output in numbers for each spoke of the radar. Numbers in green are in the Top 5% of output in that stat for the player population and numbers in red are the Bottom 5%.

Forwards + Attacking Midfielder Shapes

Pure Goalscorer

Pure_Goalscorer

Elite Creative passer

Elite_Creative_Passer  

Wide, Dribbling Playmaker

Wide_Dribbling_Playmaker

All Around Super Forward

All_Around_Fwd

Starfish of Futility

starfish_of_futility

Bowtie of Sadness

Josmer_Volmy Altidore_2014-15

Central and Defensive Midfielders

Pure DM

Pure_DM  

Heavy Attacking CM

Attacking_CM  

Deep-lying Playmaker

Deeplying_Playmaker_CM  

General All Around CM

All_Around_CM

Fullbacks

Defensive

Defensive_Fullback

Attacking

Attacking_FB

All Around

Daniel_Carvajal_2013-14

I broke down fullbacks in detail here: http://statsbomb.com/2014/07/introducing-and-explaining-fullback-radars-sagna-debuchy-lahm-alves-and-more/

Center Backs

These were developed later, and to be perfectly honest, they are less valid overall than the other positional templates. I knew this ahead of time, but legendary Scotland, Everton, and Rangers player David Weir - who is also a centerback - asked me to take a swipe at creating these and I couldn't say no. They give you a sense of how a centerback plays, but become tricky beyond that.

I do know that Thiago Silva is pretty fantastic, though.

Thiago_Silva_2013-14

--Ted Knutson

mixedknuts@gmail.com

Introducing Possession-Adjusted Player Stats

We’ve known for a few months now that defensive rate stats at the team level are fairly useless. Tackles and interceptions – the base defensive action metrics – have almost no correlation with shots allowed or goals scored. However... once you adjust these numbers at the team level for amount of possession, the r-squared when compared to shots conceded, goals allowed, etc. generally shoots up into the .4 range, which is about the same as you get for possession itself. This isn’t perfect, but it’s not meaningless, which is a positive.

The reason we see a much higher correlation when we adjust defensive stats for possession is simple: opportunity. If your team has possession of the ball, you can’t rack up defensive rate stats.  Teams that have a ton of possession don’t give their opponent the ball very often, and thus can’t accumulate defensive stats.

What do you do when you know the basic rate stats are meaningless? You adjust them. Hopefully in a way that isn’t completely terrible, but we’ll wait and see on that.

But what about at the player level? Shouldn’t a guy who plays on a team that always has the ball (think of Sergio Busquets) get more credit than a guy who plays defense for a team that never has the ball (think anyone who plays for Tony Pulis)?

Of course! Math and logic both say this is correct.

So, those of us who are working on the player stats side of things need to adjust the numbers to compensate for this. As long as you have the data, this shouldn’t be that hard, right?

*sound of crickets chirping*

The initial adjustment method I used had some issues with extreme values. Barcelona’s int + tack went from 36.8 up to 97 or so. When you distribute that at the player level, what you end up with is Dani Alves making 13.4 tackles and 10.6 interceptions. I like Dani Alves a lot, but... that’s a tough pill to swallow. This is especially true once I started thinking about changing the radars to account for this. One of the things I like about the radars themselves is that they are a snapshot of reality. These are actual numbers produced by a player in a season. By not using abstraction (like percentages), you keep them approachable to your average fan.

24 interceptions and tackles a match for Dani Alves is no longer approachable.

Thankfully, I work with really smart people like Marek Kwiatkowski, and he helped me find a better fit for the adjustment that remained at least somewhat attached to reality and maintained the solid correlation between goals conceded/shots conceded and possession adjustments. If you want to know more, skip to the methodology section at the bottom.

So What Now?

The point of all this was to see what we end up with at the player level if we adjust for a team’s possession. Tackles and interceptions as they currently exist are mostly just noise because they don’t account for opportunity – can we do better? Whatever we end up with, do you see the best players on high possession teams rise to the top of the rankings? If so, we’re probably on the right track in developing defensive metrics that also correlate to things that help clubs win matches.

Well, let’s see what happens to our tackle and interception rankings when we adjust them at the player level. We are working on English Premier League data for 2014. I’ve listed the ranks of the top 15 tacklers here by the sigmoid adjustment, then the simple adjustment (see the methodology section), and finally by their base numbers.

Tack_Rank

As you can see, the two forms of adjustment produce fairly similar ranks to each other, but can produce dramatically different rankings than the base numbers. And this is how the interception rankings change with the adjustments.

Int_Rank

And finally the combined defensive output of interceptions plus tackles.

I+T_Rank

And if you want straight player names next to each other, the column on the left here is the sig adjusted rankings, the column on the right is the base rankings. Notice all the guys who play for top teams on the left hand side? That’s pretty cool.

I+T_sigmoid_vs_base

Additionally, here’s a look at how much the numbers change from the base vs the sigmoid adjustment for the top 15 adjusted Int + Tackle guys in the league.

I+T_Numbers

Those numbers are no longer “real” numbers. You could think of them as tackle and interception points or "defensive things" instead, if that makes it easier to swallow. They are, however, much more closely related to outputs that help teams win matches, which is exactly what we were looking for at the start.

This could probably be another 5000 words long looking at different leagues and such (people will obviously want to look at Barcelona stats with La Liga as well as what Pep is doing at Bayern Munich – we’ll get to it), but for now I’m going to leave it here.

Any surprises listed there? Problems you see? Complaints? This is totally new research, so I expect there will be additional fixes/adjustments/hiccups before we get to a final product. If you’re really interested, check out the methodology discussion below.

Thanks,

--TK

@mixedknuts on Twitter

Methodology Info - In Steps Marek

First of all, the sample used at the team level was 4 full seasons from 2009-2013 across Ligue 1, Bundesliga, Serie A, EPL, and La Liga. When I did my initial research on this a few months ago, I was looking at the defensive rate metrics at the team level compared to shots conceded, goals allowed, goal difference, etc.

I was talking about this to my sometimes partner in crime Marek Kwiatkowski, and he suggested using a slightly different approach for the player info than I was initially using. Instead of taking the full difference in possession, Marek’s first new approach (which I called the “simple approach” above) was to attach everything to a base of 50, which is the game by game mean. So a 66.7% possession rate would go from 2:1 to 1.33:1.

I liked this because it delivers numbers more attached to reality, especially for extreme possession numbers. However, at the team level, it lowered the r-squared with shots conceded from .4 to .24. This is more correct than just using tackles and interceptions. It is possibly less correct than the full base team adjustment I was working with initially.
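To make the difference between the two approaches concrete, here is a short Python sketch. It reads the initial "full" adjustment as own possession divided by opponent possession (the 2:1 in the example above) and the simple approach as possession divided by a base of 50 (the 1.33:1). That reading, the function names, and the 20 tackles per game figure are my assumptions for illustration, not the actual queries.

```python
def full_ratio_multiplier(possession_pct):
    """Own possession divided by opponent possession, e.g. 66.7% -> about 2:1."""
    if not 0 <= possession_pct < 100:
        raise ValueError("possession must be in [0, 100)")
    return possession_pct / (100 - possession_pct)


def simple_multiplier(possession_pct):
    """Possession relative to a base of 50, e.g. 66.7% -> about 1.33:1."""
    if not 0 <= possession_pct <= 100:
        raise ValueError("possession must be in [0, 100]")
    return possession_pct / 50


# Hypothetical team making 20 tackles per game with 66.7% possession.
raw_tackles = 20
full_adjusted = raw_tackles * full_ratio_multiplier(66.7)
simple_adjusted = raw_tackles * simple_multiplier(66.7)
print(round(full_adjusted, 1), round(simple_adjusted, 1))
```

The full ratio roughly doubles the raw number while the base-50 version adds about a third, which is why the simple approach stays closer to numbers a fan would recognize.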

The next step Marek suggested was testing out a sigmoid function, which I had never heard of.  Swapping this in pegged the r-squared for both shots conceded and goal difference at .39, meaning we’re explaining about 39% of the variation in both of those outputs with adjustments to two simple defensive rate stats. When I first looked at this I was a bit disappointed, but given how complex football is, I actually think that’s pretty good.

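Here’s the equation for those who are interested. The bit after "Tackles *" is the possession adjustment, where x is the possession percentage.

Sigmoid adjustment: Tackles * 2/(1 + e^(-0.1*(x-50))), for x in [0,100]

Simple adjustment: Tackles * x/50, for x in [0,100]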

As noted above, the basic adjustment and the sigmoid version produce very similar ranks. It’s just that the sigmoid one gets more extreme in adjustment as you get further away from 50% possession.
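For those who would rather read code than equations, here is a minimal Python sketch of the sigmoid adjustment, using the 0.1 steepness and the midpoint of 50 from the equation. The function names and the possession values in the loop are illustrative assumptions; treat it as a sketch of the formula, not the production queries.

```python
import math


def sigmoid_multiplier(possession_pct):
    """Possession adjustment: 2 / (1 + e^(-0.1 * (x - 50))), equal to 1 at 50%."""
    return 2 / (1 + math.exp(-0.1 * (possession_pct - 50)))


def possession_adjusted(raw_stat, possession_pct):
    """Scale a raw defensive stat (tackles, interceptions) by the sigmoid multiplier."""
    return raw_stat * sigmoid_multiplier(possession_pct)


# Compare the sigmoid multiplier with the simple x/50 multiplier.
for pct in (35, 50, 60, 65):
    print(pct, round(sigmoid_multiplier(pct), 2), round(pct / 50, 2))
```

At 50% possession both multipliers are exactly 1, so the raw number is left alone. Across the range of possession figures in the loop, the sigmoid version moves further from 1 than x/50 does, in both directions.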

All of this assumes that I have also done the basic queries correctly in pulling game by game possession stats for each game the players were involved in. I don’t have that info on a minute-by-minute basis right now, so it’s not game-state adjusted or anything.

It also assumes that adjusting defensive stats by possession to increase correlations makes sense and doesn’t simply fall on its face from a methodology standpoint. The logic behind it makes sense to me, but I’m just some guy who works in gambling, not a Ph.D. in stats or math.

Again, we’re probably imperfect at a number of levels.

But it’s a start.

The Best Young Prospect in Europe, 2014 – Alvaro Morata

Alvaro-Morata

If you listened to the podcast yesterday, you know that there’s one guy that I tabbed as the best attacking prospect in Europe. Ben expressed a fairly strong degree of scepticism on Twitter when I initially said this and then again on the pod, and rightly so.

Young player scouting and prediction is basically impossible. When you do it via the eye test and someone doesn’t work out, you shrug and point to transfer numbers that say 50% of ALL transfers fail. Guys get injured. Homesick. Played out of position. Fall out with their new managers. Humans are bloody complicated.

When you scout via stats and a guy doesn’t work out, you shrug and point to the same stats as the eye test guys, but hopefully your model has a success rate of better than 50% or what’s the point? As you know from the intro article, the new scouting model that I'm developing backtests quite a bit better than 50%, but guys can still fail to turn into world beaters.

These same problems are also what makes it tough to evaluate model picks. If a guy has one really good year after the model “finds” him, is the pick a success? Two good years? It’s tricky.

Here’s an example: In 09-10, YAPSS (Young Attacking Player Scouting System) said Marko Marin is a prospect teams should be very interested in. Was that pick a success or a failure?

Now Chelsea fans will tell you he failed with them. And yet… He played 1.59 90s in the league while at Chelsea and had a scoring contribution of 1.26 goals and assists per 90. Basically, he couldn’t get on the pitch, but when he did, they scored.

Outside of Chelsea, Marin’s contributed at about a .4 scoring rate wherever he’s gone, including 3G and 9A the year after the model triggered, and he’s a career-long good to great dribbler as well. That has to be a hit, at least statistically.

Football isn’t just stats, but models generally are and need to be evaluated on that basis. The guy I want to talk about today might just be the best statistical prospect in the last half decade (meaning, the entire data set I have access to).

What do you have to do to be labelled “The Best Prospect in Europe?”

You have to be statistically very special.

That’s what Alvaro Morata is.

Name: Alvaro Morata

Age:  21

Position: Center Forward

Team: Real Madrid

Fair Price: £25M

Who should buy him: Every team that needs a forward and can afford him. Actually, throw need out the window. Every team that can afford him. In fact, Real Madrid are dumb to sell him in the first place.

Morata_2014_Madrid

Morata is unreal. 6.2 shots per 90, a non-penalty goal rate of 1.29! 2.25 key passes per 90, 1.45 dribbles… for a guy who is only 21, those stats are absurd.

Check that, for any player in Europe, those stats would be absurd. The list of guys who have shot more than six times per 90 in the last five seasons is as follows.

Messi. Ronaldo. Wayne Rooney. Mario Balotelli.

So why does Ben (or anyone sensible, really) have reservations? There are a number of good reasons.

  • Morata played on Real Madrid, one of the most talented attacking teams in the world. If/when he moves away from there, those stats will fall off because his teammates are unlikely to be as good.
  • Morata only played a little over 6 full games in the league this year, and importantly, most of that time came as a substitute, which we know has a big boost on attacking performance.

Those are large asterisks to statistical performance. So why am I still so high on this kid?

The answer is: because of all the young players in the last five seasons of data from all five big leagues in Europe, Morata looks the best. All the other young players to come from top teams, title-winning clubs, minnows, whatever… no one looks as statistically good as Morata.

In five years of data, Mario Balotelli is the only guy to average more than 6 shots per 90 at 23 or younger. The top two young players in shots per 90 over the last four years look like this:

2013: Balotelli, Nelson Oliveira

2012: Balotelli, Jovetic

2011: Sturridge, Lewandowski

2010: Darron Gibson (no, I have no idea either), Karim Benzema

It’s a pretty strong indicator that a kid is hugely talented.

And here’s the other important thing – he also passes the eye test. He’s 6’3, has a big frame, dribbles extremely well for a big kid, is surprisingly good at picking out teammates with key passes and he’s fast.

Watch his highlights and you’ll see balance, strength, and shockingly soft feet. He scores with both feet (though primarily his right), and heads the ball into the net regularly. He still hasn’t fully filled out his frame, but he’s far from a waif. He also gave Dani Alves a torrid time out on the wing this year in a Clasico.

Statistically, we can try to overcome the small sample size a bit by adding in Morata’s earlier time with Real, which includes previous seasons and his Champions League play. When you do that, you wind up with 12.7 90s played, NPG of .86, Scoring Contribution of 1.1, 4.71 shots per 90, 1.96 Key Passes, and 1.73 dribbles. All playing for the A team. That’s still bonkers and would be great for his age at half that. His scoring rate at Real B and for the U18, U21, U23 Spanish National teams has been consistently outstanding.

Step back for a second and consider this question. Take what you know about Daniel Sturridge or Robert Lewandowski now and put them on the open market at age 23 so you have all of their prime years ahead of them. What price do you think teams would pay for their services? £50M? £60M?

Jovetic sold for £23M last summer. Lewandowski probably would have sold for £30M with only a year left on his deal, but Dortmund flatly refused to sell to Bayern and made him see out his contract. He’d be worth £60M otherwise. Real bought Benzema for £31M in 2009 and his young player profile wasn’t this good.

I know Morata only has one year left on his deal, and I know there’s some uncertainty from the sample sizes, but from a statistical perspective, it feels like teams should be thinking about how much they would pay for the next Zlatan or Ronaldo or Lewandowski or Sturridge at age 21.

He really does look that good.

If things go badly, you overpay slightly for an average forward. (There's almost no way he's worse than that.) You can probably sell him off somewhere else two years from now for £12-15M to cut your losses.

If things go as the stats suggest they might, you buy the good version of Fernando Torres, right as he is turning into el Nino.

Stop dithering over a couple million pounds. Buy him outright, plug him into your team for the next decade, and enjoy the ride.