On the surface Arsenal waters are calm. While the team left it until late, they ultimately managed to put a couple in the net and defeat Watford 2-0. Usually beating Watford would not be a notable achievement, but given that Tottenham weren't able to pull it off, and that the Hornets started the day above Arsenal in the table, the accomplishment becomes a little more impressive. The win was also Unai Emery’s fifth straight, after two opening defeats. Things seem to be looking up. Look a little deeper, however, and there’s serious cause for concern for the Gunners.
Let’s start by looking under the hood of that comfortable 2-0 win.
Well, that’s not great. The teams were mostly even until Watford took control of the match in the 67th minute with a flurry of activity. Arsenal managed to weather the storm, and then go ahead thanks to an own goal. But, what this clearly shows is that rather than Arsenal dominating the game and then finally pushing one past Watford, this match was a case of Arsenal hanging on, and then getting a fortunate bounce at the right time.
Other than the result, there wasn’t a lot for Arsenal to walk away from this match hanging their hat on. It wasn’t just the expected goal total of 1.95 to 1.34, but more prosaic measures as well. They were outshot 14-9. Arsenal managed just two shots on target to Watford’s five. Arsenal, as expected, dominated possession, completing 468 passes out of 591 attempted to Watford’s 232 out of 345. But those possession numbers are perfectly within acceptable ranges for Watford’s press-and-counter approach. And given the shot numbers, it’s clear the game was played stylistically in the manner Watford preferred, even if Arsenal ended up sneaking out a win.
Now, individual games are not dispositive and there are always mitigating factors. Alexandre Lacazette was denied a penalty early on that could have changed the course of the game. If Arsenal were playing from ahead then maybe everything looks different. And it’s only one game, and one at the end of a winning streak at that. Maybe a broader look would mitigate the concerns raised by Watford.
Except that the broader look turns up lots of concerns for this Arsenal side. Arsenal are conceding an eye-popping 1.50 expected goals per match. The only five teams with worse defensive numbers than Emery’s team are Brighton, Burnley, Fulham, Huddersfield, and West Ham. That’s bad for any Premier League side, let alone one that expects to challenge for the top four places. Their shot numbers are marginally better. They give up 14.14 per game, which is eighth worst in the league. That puts them at 0.11 xG per shot conceded, the fourth worst total in the league. A defense that doesn’t suppress shots, and also allows good shots, is not a good defense.
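The per-shot figure follows directly from the two per-match numbers quoted above. A minimal sketch of that arithmetic, with variable names that are purely illustrative:

```python
# Figures quoted in the text; the variable names are hypothetical.
xg_conceded_per_match = 1.50
shots_conceded_per_match = 14.14

# Average quality of each shot Arsenal allow.
xg_per_shot_conceded = xg_conceded_per_match / shots_conceded_per_match

print(round(xg_per_shot_conceded, 2))  # 0.11
```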
These problems aren’t new, of course. They’re the same problems that Arsenal had coming into the season, and ones that new manager Emery was tasked with fixing. He certainly hasn’t been able to yet. The team seems set on trying to play a high line, squeezing up the field and taking away opponents’ space.
The problem is, just like last season, this defensive approach simply isn’t working. Arsenal’s defensive and midfield personnel aren’t good enough to make the system work. The left side in particular is a major problem. Granit Xhaka is not nearly rangy enough in midfield, and his poor defensive instincts amplify his lack of mobility. Nacho Monreal has aged past the point where he can comfortably defend in space. The motley crew of Arsenal center backs don't do enough to pick up the slack when faced with that much Swiss cheese in front of them. And while Lucas Torreira may yet become a good defensive midfield linchpin, there’s only so much any mortal not named N’Golo Kante can do.
Faced with those defensive deficiencies, Arsenal would need to be an intergalactic level attacking juggernaut for the team to compete at a high level. They aren’t. They’re good, but not good enough given the handicap they’ve been saddled with. They’re averaging 1.41 expected goals per match, which is sixth in the league, just a shade behind Bournemouth for fifth. They don’t actually take a ton of shots, only 13 per match, but they do take efficient ones.
The good news for Arsenal is that despite the problems, they’ve won five in a row and 15 points puts them just about where they need to be, even on points with Tottenham for fourth place. Arsenal’s results have kept them in the mix, but if something doesn’t change, then those results won’t keep coming.
How can Arsenal fix themselves? The obvious answer is by upgrading their personnel at the back, but that can’t happen until January at the earliest, and more realistically next summer. Until then, the team needs to figure out how to change their game to give the back half of the field some help. One obvious possibility is simply dumping Granit Xhaka in favor of a more defensive option. But, it’s unclear if Arsenal have such an option available. Matteo Guendouzi is the obvious choice, but at 19 his defending is very active without being positionally sound. It’s not immediately clear he’d make the defense better.
Another option is playing more conservatively in attack. This is probably the path that can pay the most immediate dividends. When Arsenal attack they are extremely aggressive. Not only are they functionally playing with two strikers now that Pierre-Emerick Aubameyang and Lacazette are both starting, but they are also using the very aggressive movement of Aaron Ramsey and choosing to get both fullbacks involved to provide width. All of that thrust is then conducted by Mesut Ozil.
A slightly more conservative approach, making sure Monreal doesn’t overextend his ability to recover, and playing an actual winger on the left to protect him might help stabilize things. Or, rather than play Ramsey at the 10, play a truer stay at home midfielder, to be part of a central three, adding an extra body to clog the field, and trusting the top notch attacking talent to get it done anyway (and since Ramsey’s contract is running down, preparing for life without him makes even more sense).
Right now, Arsenal’s possession is predicated on scoring. They want to have the ball to use it to score goals. Then, if and when they’re ahead, they are happy to try and defend without the ball. It doesn’t go well. Instead, this team should be looking to be more conservative in possession. Keep the ball, maintain a more conservative shape in possession, use all those passers to kill off games by running teams ragged. Be boring.
The last lingering problem there is that a high dose of defensive possession means having a goalkeeper who can play the ball with his feet. Keeping the ball means when it goes back to the keeper it comes back out again as a pass, not as a 50/50 proposition. This was obviously not Petr Cech’s game. But, Cech went off injured against Watford. It’s now Bernd Leno’s time. He’s supposed to be more proficient with the ball. He had better be. Arsenal’s season may depend on it.
Right now, despite the results they’ve managed to pull off, Arsenal simply aren’t working. They need to change what they’re doing if they’re going to make a real run at qualifying for the Champions League this season. They’ve still got time left to make that happen. But each passing week with the same old not-good-enough defense makes it less likely that the change will ever come. If it doesn’t happen soon, it probably won’t happen at all.
Atletico Madrid are off to a curiously slow start. It’s the rare year where there might be some cracks at the top of La Liga: the departure of Cristiano Ronaldo from Real Madrid has left a sliver of daylight at the top of the table. But, three games in, perennial favorites Barcelona and Ronaldo-less Madrid are perfect, and Diego Simeone’s team has already dropped five points. Is there anything amiss with Atletico, or have their opening three games been the kind of fluke that the next 35 games will make everybody forget?
The most notable difference for Atleti so far this season is that their defense simply hasn’t been up to their usual standards. Last year they were the best team in the league defensively. They only conceded 0.87 expected goals per game, one of only two teams, along with Getafe, under the one goal barrier. This year, they’re at 1.21. That’s a really big jump. It’s easy to see that the defense has performed relatively poorly. Getting under the hood and figuring out why is important for figuring out whether it will continue.
Traditionally Atletico have been happy to cede territory in their opponents’ half, but have been extraordinarily aggressive in their own. Here’s what their defensive heat map looked like last season.
This year, well, so far that pressure map looks entirely different.
There are lots of reasons that might be. They’ve played an interesting set of opponents, going away to Valencia and drawing, before beating Rayo Vallecano at home and then going on the road (and playing 20 minutes down a man) to lose 2-0 to Celta Vigo. It’s certainly possible that some combination of tactical interplay is what’s driving the difference.
It’s also possible that Simeone is playing catch-up when it comes to indoctrinating parts of his new team defensively. Here are the players that led the team in ball pressures per 90 minutes last season (and played over 900 minutes).
It’s a peculiar system that Simeone runs. The team doesn’t press high up the field, but still asks the forwards and wingers to do a lot of defensive work, dropping into their own half and then harassing the ball. Oftentimes one or both wingers are asked to pinch inside and serve as an extra midfielder while one of the forwards drops back to cover the wing. It’s how Simeone can both nominally play a 4-4-2 system while also seeming to have bodies everywhere behind the ball.
Several of the players who were instrumental to that system are now gone. Four of the team’s ten most active players at pressuring the ball are no longer around. For the attackers that’s not a huge change. Kevin Gameiro and Fernando Torres were both bench options by the end of last season, and Yannick Carrasco left last January for China. But in midfield, the loss of Gabi looms large. Yes, he was 35, and his legs weren’t what they were, but his expertise at holding the midfield together was real.
In three games so far this season, Simeone has deployed three different midfield pairings. Against Valencia, Koke and Saul Niguez started together. Against Rayo, Koke sat and Rodri came into the side to pair with Saul, and then during the debacle against Celta it was Saul and Thomas Partey. Three matches, three different partners for Saul. It’s possible Simeone is planning to mix and match all season long, customizing his midfield approach to his opponents. It’s also possible that he simply doesn’t know what his best midfield is yet. World Cup summers are short, and they cast long preparation shadows. Simeone is trying to work newcomers like Rodri and Thomas Lemar into the side. He is trying to do it despite Lemar, along with both starting forwards in Antoine Griezmann and Diego Costa, as well as Koke and Saul being gone for most of the summer on international duty. It’s certainly possible that the first three weeks are simply a symptom of everybody not quite knowing each other yet and a team that needs to work its way up to full speed.
But, Simeone needs to get them there, because the defense simply isn’t working as well as it’s supposed to. They’re conceding both more shots, and better shots than last year. Last year they conceded 11.87 shots per game and 0.08 expected goals per shot and this year that’s up to 12.33 and 0.10. The fact that both numbers have gone up suggests that rather than seeing a style change, we’re simply seeing a defense that isn’t executing at the high levels we’re accustomed to seeing from Simeone teams.
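One way to see how the two numbers combine is to treat expected goals conceded per game as shots conceded multiplied by expected goals per shot, then split the year-over-year change into a volume part and a quality part. The sketch below does that with the figures quoted above; the function name and the particular way of splitting the change are illustrative assumptions, and because the per-shot figures are rounded, the products will not exactly reproduce the per-game totals quoted earlier.

```python
def decompose_xg_change(shots_before, quality_before, shots_after, quality_after):
    """Split a change in xG conceded per game into volume and quality parts.

    Hypothetical helper: xG conceded is modeled as shots * xG per shot.
    """
    total_before = shots_before * quality_before
    total_after = shots_after * quality_after
    # Extra xG from facing more shots, valued at last year's shot quality.
    volume_effect = (shots_after - shots_before) * quality_before
    # Extra xG from each of this year's shots being more dangerous.
    quality_effect = shots_after * (quality_after - quality_before)
    return total_before, total_after, volume_effect, quality_effect


before, after, volume, quality = decompose_xg_change(11.87, 0.08, 12.33, 0.10)
print(f"implied xG conceded per game: {before:.2f} -> {after:.2f}")
print(f"from more shots: {volume:+.2f}, from better shots: {quality:+.2f}")
```

On these rounded inputs, most of the increase comes from shot quality rather than shot volume, though both move in the wrong direction.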
Atletico’s style also gives them a smaller margin for error. Their grind-it-out defensive approach means that oftentimes even beating mediocre teams means lots of work. Last season, even while finishing second, the team was ninth in expected goals scored per game with only 1.18. They only took 10.95 shots per game; only five teams took fewer. Even at the best of times Atletico make things difficult for themselves.
Of course, if anybody deserves the world’s faith when it comes to molding and shaping a defensive unit, it’s Simeone. He has repeatedly worked with shifting personnel to put together teams that dominate on the defensive side of the ball. There was a time when the midfield was patrolled by Gabi and Tiago. He managed to navigate Diego Costa leaving and then returning. This is a team that managed to move on from once-talismanic winger Arda Turan seamlessly. Simeone has managed the same thing in the defensive line where Diego Godin is the only constant. Miranda left, Filipe Luis left and came back. Toby Alderweireld came and went. A cadre of young defenders took their places with Lucas Hernandez and Stefan Savic in particular stepping to the fore. Simeone is the constant on the sideline, while the majority of the players on the pitch have changed over the years.
Atletico will probably be fine. Simeone will take the new pieces, integrate them with the old ones, and put together a strong squad that will finish comfortably in the top four of La Liga. That doesn’t mean there’s nothing wrong at Atletico right now. There is. It just means that given his track record Simeone is almost certainly the man to put it all right, and to do it quickly.
Meet the new Fulham, relatively different from the old Fulham. On the heels of an exciting promotion campaign, Fulham have undergone an audacious summer makeover. They return to the Premier League with an upgraded squad, an exciting style, and ambitions for taking the bottom half of the table by storm.
A Unique Championship Approach
Most teams in the Championship are not built around their attack. The tried and true method to success in the Championship is to, more or less, be Steve Bruce. Build a solid disciplined defense. Be difficult to break down. Have enough resources to plug a couple of above average goal scorers up top, and let them do the hard work. If you must have possession, build through the wings so that your midfielders can maintain defensive solidity, and cross the ball a lot. Sure, it’s not the best way to score goals, but it’s good enough, and it minimizes the risk of getting caught on the counterattack. Steve Bruce has been a successful Championship manager doing that, and he was successful at Aston Villa last season doing that. Aston Villa lost to Fulham in the promotion playoffs.
Manager Slaviša Jokanović built a team heavily invested in having and using the ball. They were the most possession oriented Championship team. They shot the ball 14 times a game. Fulham trusted in their attack to win them matches, even if it meant frequently leaving their defense exposed. The approach was the direct opposite to the four teams right behind them in the Championship, not just Aston Villa, but Derby County, Middlesbrough and Preston North End as well, all of whom built attacks which worked within the constraints of responsible defending, instead of the other way around.
But, there’s a reason that teams have traditionally used a defensive approach to get back to the Premier League. Actually, there are lots of reasons, but the one of particular concern to Fulham is that a strong Championship defense translates reasonably well into the Premier League, while a strong attack is much harder to execute. Attacking teams need the ball, defensive teams don’t want it. A more defensively oriented Championship team will get promoted and go on to play its preferred style in the Premier League. An attacking oriented one almost certainly will not. They simply won’t have the talent to allow them to possess the majority of the ball against a new set of opponents, almost all of whom have better players.
Fulham then had three choices. They could change the style they played, significantly change the talent level on the team, or lose. They went with option two.
Upgrade Button
Fulham spent the summer building an almost entirely new starting eleven. They looked at their squad, decided that, other than a few notable exceptions, it probably wasn’t strong enough to attack in the Premier League, and spent the money to fix the problem. Those exceptions were Ryan Sessegnon and Tom Cairney. Sessegnon started out as a precocious leftback, but he has the kind of nose for attacking the penalty box that demands to be played higher up the field, and he’s evolved into an extremely dangerous winger. He’s also only 18, and has the potential to be an absolute superstar. Cairney pulled most of the creative strings in midfield for the team in the Championship. At 27, he’s a poster boy for the way creative passing midfielders can get lost in the lower levels of the English game. He’s not a star, but he’s a player who thrives in a system that prizes possession, and as such spent the first half of his career not getting to take full advantage of his skill set. Fulham found him and unleashed him, and now he’ll get to ply his creative skills in the top tier. Fulham also signed Aleksandar Mitrovic on a permanent deal after his loan stint last season. So that’s three players from 11 that are definitely staying around. After that, the flood of signings.
Fulham announced their transfer presence with authority back in July with the signing of Jean Michael Seri from Nice. Seri was linked with clubs as far up the European food chain as Barcelona, so Fulham snagging him was quite the coup. He’s the kind of midfielder who does a lot of the basic possession work and defensive pressure that makes a possession side tick, and should be a perfect fit in this Fulham side. The hope is that he and Cairney combine to make sure that the possession Fulham has is actually used to good effect, and facilitates opportunities for the attackers further up the field.
That attack has been bolstered not only by the permanent arrival of Mitrovic, but by a number of other attacking pieces. Andre Schurrle arrives from Borussia Dortmund after two exceedingly mediocre seasons. While Schurrle’s name is well known, his output over the last five to six years has been underwhelming. The 27-year-old hasn’t managed to play over 2000 minutes since his days at Bayer Leverkusen, specifically the 2012-13 season, which was also the last time he scored double-digit goals.
The hope is that Schurrle’s career has been slowed by a combination of injuries and the Peter Principle. That is, that Schurrle performed well enough when healthy at clubs like Mainz, Leverkusen and Wolfsburg that he earned moves to places like Chelsea and Dortmund where he simply wasn’t good enough to be a regular contributor. At Fulham, it’s certainly possible that he will find the sweet spot where his skills will feature, but he won’t be buried on the bench by other, better players. Fulham are also taking a similar bet that Atletico Madrid forward Luciano Vietto, a 24 year old who has failed to gain purchase at La Liga’s top sides, might come good in a slightly less competitive environment.
Fulham built themselves a whole new defense too. Alfie Mawson arrives at center back from relegated Swansea, the incredibly French named Maxime Le Marchand from Nice, and Calum Chambers on loan from Arsenal. At fullback, as Sessegnon moves higher, Joe Bryan from Bristol City arrives to take up the leftback slot and across the field Tim Fosu-Mensah is a new rightback option. This isn’t exactly an all-star unit, but Fulham’s defense last season wasn’t anything to write home about either. The new group could end up significantly below Premier League average and still be a hefty upgrade over last year’s crew.
Fulham’s biggest defensive acquisition came late on deadline day, when the squad swooped for Marseille defensive midfielder Andre Zambo Anguissa. The move addressed Fulham’s major looming weakness. A team that wants to use the ball in possession has to be good at winning possession back. Upgrading the attack, and even upgrading the defense to be more resilient, isn’t enough. Fulham will need to take the ball from opponents. That’s what Anguissa specializes in.
Anguissa was an all action defensive midfield machine for Marseille. He put up great numbers in every defensive action category. Importantly for a possession based squad, he’s comfortable doing that defending in space.
And, while the focus might be on his defense, and the way he’ll be able to clean up behind Seri and Cairney, it’s worth noting that he’s not a liability in possession. His passing range might not be excellent, but it’s good enough, and he’s comfortable enough on the ball to contribute to the attack, especially since he’s likely to have time and space on the ball as Fulham’s third most creative midfielder.
If everything goes according to plan for Fulham, Anguissa will be the glue that knits together a better defense and a robust attack, turning the two halves into a cohesive whole.
An Optimistic Outlook
Despite the incredible transfer window, Fulham is still a newly promoted side and will have some of the typical struggles that any new side will. Depth is likely to become an issue over the course of a long and physical Premier League campaign. Fulham has done a lot of upgrading, but that means that the squad which was generally not deemed strong enough, is still sitting there waiting to fill in in the case of injury. It’s not a knock on Fulham to say that any newly promoted side is just the wrong injury or two away from the middle of a relegation battle.
Similarly, should the season not go strongly, this is not a squad built to eke out points and stay above water. In a relegation fight, a good defense is slightly more important than a good attack. If this squad gets to April hovering around 16th or 17th place, with just a point or two separating themselves from the drop, they’re likely going to be in worse shape for the home stretch than a similarly situated, more defensively oriented side.
Those dire scenarios are certainly in play, but they aren’t Fulham’s most likely outcome. Their average season probably sees them avoiding the drop as they settle comfortably into the non-threatened zone of the bottom half of the table. They’ll likely struggle against better sides; their insistence on attack is fun, but not conducive to challenging the best sides in the league. They’ll also take more than enough points against weaker sides. For a newly promoted side, surviving comfortably while also being pleasing on the eye meets any definition of success.
Thank you for reading. More information about StatsBomb, and the rest of our season previews can be found here.
Welcome to the StatsBomb Premier League season preview. Over the next two weeks we’ll be previewing every team, breaking down the good and the bad, but sadly no longer the Charlie Adam. This post is your one stop shopping for all of those previews. You can scroll on down and find links to each piece as they post. If you’re familiar with our work, that should be just about all you need. Check back daily, we’ll be posting a couple of teams per day more or less. If you’re discovering us for the first time, here’s a little FAQ to get you started.

What exactly is StatsBomb?

We do cool things with football data. We collect it, poke it, slice and dice it, and learn stuff from it. The company does all sorts of consulting. You probably don’t care about that, but if you do here’s how to get in touch. The website also does cool stuff with football data. We use our data to do analysis that is intelligent and fun, and hopefully makes interested readers smarter about the game.

Oh, so you’re nerds. Do you even watch the games?

Yeah. We watch the games. Everybody here loves the game. That’s why we do this. We just also happen to like numbers and using numbers to help learn about the game.

But if you watch the games, why do you even need numbers?

An excellent question! There are a handful of different ways that numbers are useful even to people who obsessively watch the sport.

First, no matter how diligent a watcher you are, even if you’re one of the lucky few who get paid to watch the sport for a living, there’s more football going on across the globe than you have hours in a day. Stats are not a replacement for watching, but they can be a guide. You’re definitely not watching the vast majority of League One matches, but if your team gets linked to a precocious talent plying their trade there, these stats can give you a road map for what to expect if you decide to tune in.

Second, stats are an extremely effective tool for detecting small changes across large numbers of games. It’s hard to watch a couple of different strikers play five games each and then come away and know for sure which one took five shots per 90 minutes of play* and which took three. The former would be the most prolific shooter in the Premier League, the latter barely in the top twenty.

*Most stats you’ll see here are recorded on a per 90 minute basis instead of a per game basis, that way starters who play the lion’s share of the minutes and substitutes who get short runouts get examined on even footing. Knowing that a player shoots more, or scores more, or does anything else more isn’t useful if it turns out the only reason they do it more is that they play more minutes.

Third, numbers are an extremely effective way to describe what’s going on over large periods of time. Using statistics to describe a team or a player is a way of supporting (but not proving) an argument. Most football analysis is done at the minute and specific level, analyzing unique and individual moments. Watching a fullback get spun inside out by a winger is a good way to determine what went wrong in the leadup to a goal. Statistics are the tool that helps to figure out whether that fullback gets roasted too often week in and week out.

In the end statistics, and their slightly fancier cousin analytics, are a tool to be used. Coaches have video now to help them prepare for games in ways they wouldn’t have been able to before. Television viewers have instant replay, and tactical cameras, and super slow motion, all of which provide more information than a generation ago. Statistics work similarly, providing more information than ever before, another tool in the arsenal for players, managers, supporters and anybody else interested in the ins and outs of how football works.

Okay but what about that Disraeli thing about lies, damn lies, and statistics?

Well, first of all, it’s not actually a Disraeli quote. Mark Twain attributed it to him, but nobody really knows where it came from. It’s also not wrong. Be wary of people using statistics to provide firm answers. While analytics and stats can answer some basic questions, what they’re best used for is helping to ask better questions. They can, for example, move the conversation from “Is Raheem Sterling bad because he can’t shoot?” to “How is Raheem Sterling so good, even though it seems like he can’t shoot?” Analytics has all sorts of tools to help do that. Whether it’s normalizing things to per 90 minutes, or using expected goals to try and figure out exactly how good or bad various shots are, using numbers smartly helps to offer a better baseline for discussions about the game that fans are having all the time.

You wore me down, and I clicked on that last link. It talked a lot about expected goals. It’s the same thing annoying people on twitter talk about all the time. Why should I care about expected goals?

An official sounding definition of expected goals would be something like, the average number of goals that a player or team would be expected to score given the set of shots that they took. Here’s a whole bunch of details on how the metric works, but the long and short of it is that expected goals is a better predictor of future goals than actual goals. Because scoring is so difficult, and so much stuff can go right or wrong on a given shot, or in a given game, or even a given month, taking a step back and looking at the quality of chances created and conceded is a more reliable indicator of a team’s ability than the actual amount of goals teams have scored.

Frequently, the stat is also used as a shorthand way to describe a single game. It’s useful knowing a team’s expected goal total in a single game, and can help describe a game with more context than other numbers, but single game expected goals aren’t particularly predictive. Asking the question, “do the expected goal numbers of these two teams accurately reflect the game I just watched?” is a useful exercise for analysis. Saying, “This team had 3 goals, but only 1.2 xG so they were undeserving winners” isn’t. Like I said before, analytics is about asking better questions more than it is about providing definitive answers.

So all you people do is nerd out about expected goals and I’m supposed to read 20 previews about that?

Nope. We’ve got lots more. It’s useful having a metric that helps more accurately depict which teams are better than they might seem and which teams are worse, but it’s much more interesting diving into the why and how of it all. That’s what these previews will be about. Using data (and lots of other stuff), these previews will be breaking down not only how good each Premier League team might or might not be, but more importantly what they actually do that makes them tick. Which players do which things in which areas of the pitch and how often. Who needs to up their game and how. Which managers will have the biggest impacts, and which ones might be past their primes. All of the kinds of questions that football fans are interested in, StatsBomb is interested in too. We just use data to help figure out the best way to approach them.

Shockingly, you’ve persuaded me, a fake person on the internet asking you questions on the website you run, to go ahead and read all the previews on your site.

Excellent. Enjoy! And be sure to check back every day.

The Previews:

July 30 -- Manchester City
July 31 -- Tottenham Hotspur
July 31 -- Leicester City
August 1 -- Manchester United
August 1 -- Liverpool
August 2 -- Wolverhampton Wanderers
August 3 -- Watford
August 5 -- Southampton
August 5 -- Cardiff City
August 6 -- Bournemouth
August 6 -- West Ham
August 7 -- Everton
August 7 -- Arsenal
August 8 -- Newcastle
August 8 -- Brighton
August 9 -- Chelsea
August 9 -- Burnley
August 10 -- Crystal Palace
August 10 -- Huddersfield Town
August 10 -- Fulham

Header image courtesy of the Associated Press
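For readers who prefer to see the two ideas from the FAQ above written down, here is a minimal sketch of per 90 normalization and of expected goals as a sum of per-shot scoring probabilities. The function names, the minutes, and the shot probabilities are all made up for illustration; they are not StatsBomb data or the StatsBomb model.

```python
def per_90(count, minutes_played):
    """Scale a raw count to a per 90 minute rate."""
    return count * 90 / minutes_played


def expected_goals(shot_probabilities):
    """Total xG for a set of shots: the sum of each shot's scoring probability."""
    return sum(shot_probabilities)


# Hypothetical players: a starter with 15 shots in 450 minutes and a
# substitute with 5 shots in 90 minutes. Raw counts favor the starter,
# but per 90 the substitute is the more prolific shooter.
starter_rate = per_90(count=15, minutes_played=450)
substitute_rate = per_90(count=5, minutes_played=90)
print(starter_rate, substitute_rate)

# Hypothetical shots, each with an assumed probability of being scored.
shots = [0.05, 0.10, 0.30, 0.75]
print(round(expected_goals(shots), 2))
```

The per-shot probabilities are where the real modeling work lives; this sketch simply takes them as given.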
If there’s one piece of transfer advice that should be tattooed on the inside of the eyelids of everybody even remotely connected with football it’s this: Pay for what a player is going to produce, don’t pay for what they’ve already done.
That isn’t as easy as it sounds of course. Projecting performance is hard! What a player has done isn’t necessarily indicative of what they will do. There are all sorts of things to take into account. How will a player react to a different system? Will changing leagues impact their performance? Have they played with really great teammates who flatter them, or really poor ones who held them back? If they are moving countries how will the adjustment impact them? Will there be language barriers, either on the player or coaching side, which slow down the bedding in process? And that’s all before trying to take the fickle nature of the football gods into account.
Throughout all the complication though, one thing is easy. Aging. Young players tend to get better and older players tend to get worse. That’s simply the way of things. Too often teams make the simple mistake of looking at a 27 or 28 year old’s production and assuming that they too will continue to get those goals. It just doesn’t usually work that way. There are exceptions of course, the Cristiano Ronaldos of the world. But, for the most part, it's Wayne Rooney, not Ronaldo, who is the norm. A player who is getting signed to play through age 30 is a player who is going to get worse.
The flip side of the coin is also true. Young players get better. They don’t all improve of course. There’s no such thing as a sure thing. It’s hard enough to predict how a player will perform when they change teams, but add in the uncertainty of aging and things get very fuzzy very fast. All players eventually get worse, but not every young player gets better.
And so we come to Richarlison, who Everton just spent something like £40 million on. To judge that transfer solely on what Richarlison has produced to date would make it look absurd. In the year since Watford acquired him for £10 million from Fluminense he scored five goals and assisted on four more, and that’s supposed to quadruple his value?
Well. Maybe. The place to start, as always, is a quick look at his expected goals. His xG numbers are much more promising. A 21 year old who racks up 12.05 expected goals is actually quite an exciting prospect.
That’s not to say that Everton should assume Richarlison is nailed on to bounce back to his underlying numbers. That’s a pretty big gap, and it’s certainly possible there was a reason the young Brazilian attacker was so wayward. But, what is nailed on is that you’d much rather buy the player who had five goals from 12 expected than the reverse. In the absence of more data the default assumption should be that Richarlison’s xG is a more faithful representation of what he will produce in the future than his actual goals. In a world where Richarlison took exactly the same shots but 14 or 15 of them ended up in the back of the net, nobody would bat an eye at this price tag.
The fact that Richarlison is 21 years old suggests that he’s likely to improve. The fact that he scored fewer goals than expected is a sign that what Everton are getting is a player likely to produce more than he did last season. The fact that he’s already performed at a promising level for a year in the Premier League alleviates some of the adjustment concerns. Everton are operating with considerably more certainty than Watford were. Watford took on the risk of buying Richarlison without having any guarantees about how moving from Brazil to England would impact the young player. Everton are not.
Then there’s the issue of teammates. Richarlison was the most prominent attacking force for Watford. He led the team in expected goals per 90 minutes (players with at least 1200 minutes) with 0.36.
Despite playing from the wing, he took the most shots on the team with 3.02 per 90, and had the second highest xG/shot with 0.12. He also had the third most touches in the box with 8.82 per 90 (virtually tied with Andre Gray’s 8.92 and really only trailing Troy Deeney’s 10.30). Contextually these numbers paint the picture of a player who was pushing at the edges of what he could do given the teammates around him.
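As a rough sanity check, those per-90 figures hang together: expected goals per 90 is simply shots per 90 multiplied by the average xG of each shot. A minimal sketch using only the rounded numbers quoted above (the variable names are illustrative):

```python
# Rounded figures quoted in the text for Richarlison's season at Watford.
shots_per_90 = 3.02
xg_per_shot = 0.12

# xG per 90 is shot volume multiplied by average shot quality.
xg_per_90 = shots_per_90 * xg_per_shot

print(f"xG per 90: {xg_per_90:.2f}")
```

The product comes out at roughly 0.36, in line with the team-leading xG per 90 quoted for him. Because the inputs are rounded, a small mismatch would not be surprising.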
Everton aren’t world beaters, of course, but they do certainly have more talent than last season’s Watford side. Players like Gylfi Sigurdsson, Theo Walcott, Oumar Niasse, and Dominic Calvert-Lewin aren’t, for various reasons, elite players, but they’re certainly a better attacking corps than Troy Deeney and company. It’s yet another signpost pointing towards Richarlison being a player who will likely be better in the future than he was in the past.
And finally there’s the matter of the actual specific skills Richarlison brings to the table. He’s a scoring winger, but he’s not small. At 179cm he makes for a surprising aerial presence as well. Last season there were only seven players who played over 1000 minutes and averaged both more than one successful dribble per 90 and more than one aerial duel won. They’re an interesting group. Romelu Lukaku is the only star in the mix, and Salomon Rondon is the only striker besides Lukaku. Serge Aurier, the forever frustrating Spurs right back, is the lone defender on the list, unless, like Slaven Bilic, you choose to count Michail Antonio there for some reason. Then it’s Mikel Merino, Calvert-Lewin and Richarlison. That’s a weird list, but at the very least it’s indicative of an interesting and relatively unique skillset.
Does that all add up to a player who’s worth the hefty price tag? Well, there’s a lot more that goes into evaluating a transfer fee than a player’s potential. Were there other ways Everton could have spent the money better? What will the market look like in three years’ time? What are Richarlison’s wages? All of those things are important factors in weighing whether Richarlison is worth what Everton spent.
There are absolutely reasons to be skeptical of Everton’s purchase of Richarlison. It’s just that the fact that he only has five goals and four assists isn’t one of them. Richarlison profiles as an exciting prospect, one who has buckets of talent and a reasonable chance of turning that talent into stardom. Everything about his last Premier League season screams that his production is likely to improve, and quickly. Young players with numbers and experience that suggest that their best performance lies ahead of them are exactly the kinds of players that teams should be pursuing. Richarlison fits that profile to a tee.
The defining force of the 2018 World Cup has been France’s defense. They’ve conceded four goals in the entire tournament. Didier Deschamps has built a team that is devilishly difficult to break down, an immovable object that Croatia will have to pry apart if they want to stage a historic upset.
Those four measly goals came in only two matches. France gave up one to Australia in their opening match, before switching to a lineup which included Blaise Matuidi playing in the attacking band of three to bring defensive stability. Other than that, the only goals France conceded were against Argentina in their wild 4-3 round of 16 victory.
Unsurprisingly France’s underlying defensive numbers are stellar. Only Uruguay, who France soundly dispatched in the quarterfinals, conceded fewer expected goals per 90 minutes than France’s 0.47 over the course of the tournament. Similarly Uruguay was the only team that conceded a lower xG/shot rate at 0.04 to France’s 0.05. Only three teams gave up fewer shots: Australia and Senegal, who didn’t advance from the group stage, and Spain. The combination of giving up very few shots and having those shots be of low xG value is the holy grail of defending. France achieved it.
France’s strong defense means that even though their somewhat meager attack has a tendency to greatly undersell its talent, the team still regularly generates more and better scoring opportunities of all types than their opponents.
The question of how France execute their defense is an interesting one. This is neither a typical modern defense, ball dominant and full of intense high pressing to win the ball back quickly, nor a traditional deep block. They don’t sit back and absorb pressure. Instead they operate firmly between those two extremes.
Where France pressure teams is highly dependent on their personnel. On the left Blaise Matuidi’s presence in the attacking band means he often is challenging opponents in their own half of the field. On the right, where Kylian Mbappe is often pushed very high in the attack, France contain attacks, forcing them out wide before pressuring them as they advance.
It’s also important to understand that France’s defensive activity in the middle of the pitch is relatively light, not because they are vulnerable, but because they are so impenetrable the ball rarely makes it that far. Their dominance and solidity actually frees up defensive midfield superstar N’Golo Kante to go win the ball all over the pitch. Looking at defensive actions where the ball changes possession, it’s clear that Kante gets to have the freedom to go take the ball from opponents wherever he finds them vulnerable. It’s a clear example both of how stellar he is at getting to the right place to cause damage, and how good France are at letting him do just that.
Nobody has quite figured out how to get the best of France’s approach at this World Cup. In the semifinals Belgium, perhaps the best attacking team of the tournament, were simply stopped cold.
That shot map has a whole lot of nothing on it. France were happy to let Belgium filter the ball out to the right where their defense was more than well equipped to deal with the challenges presented by Nacer Chadli.
Eden Hazard was both much more dangerous and much less able to get on the ball during the course of the match. But, ultimately, even his unstoppable dribbling on the ball didn’t do enough to pull France out of their shape and create real opportunities either for himself or for his teammates.
This will be the major problem for Croatia when it comes to the World Cup final. On the one hand, they have the midfielders who could in theory be equipped to break down France. If anybody is going to break through this fortress a combination of Luka Modric, Ivan Rakitic, and Marcelo Brozovic would be that trio. On the other, the way Croatia wants to attack is by using those players to filter the ball out wide.
Croatia’s best moments all involve Modric and Rakitic drifting wide to combine with wingers and fullbacks in order to facilitate attacks down the flanks. That approach has provided just enough juice to get the job done to this point, but they also haven’t had to overcome a defense like France’s. Meanwhile, France just handled a Belgium setup where Eden Hazard was one of the wide men on the ball tasked with unsettling a defense. Croatia’s Ivan Perisic is a very good winger, and Ante Rebic has his moments, but neither of them can do anything close to what Eden Hazard can on the ball. And France handled Hazard.
The other thing that France’s slightly lopsided defensive structure does is free up Kylian Mbappe in attack. Because he doesn’t have to defend like Matuidi does, he can stay higher up the pitch and combine with Antoine Griezmann and Olivier Giroud on the counter. In Croatia’s semifinal against England, they looked their most vulnerable when England was able to get the ball forward to Raheem Sterling quickly and get him isolated against a central defender. If France can similarly isolate Mbappe then Croatia will likely be in for a long day.
Coming into the tournament the lack of a cohesive attack out of France was the main story. The way the side seemingly squandered dynamic attacking talent overshadowed just how defensively strong they’re set up to be. As France head to the final, the defense deserves all the attention it’s getting. Kante is every bit the superstar that his more glamorous teammates are, the beating heart at the center of a defense that has completely owned this tournament. Defense can be harder to appreciate than attack, but it’s no less important. And France have the best defender in the world in their midfield. And they’ve shown him off all tournament long.
If you just looked at Germany’s expected goal numbers you’d think they were extremely unlucky to be eliminated in the World Cup. That’s a good reason to not just look at expected goals numbers.
In their three group stage games, Germany averaged 1.84 xG per game. Only Belgium, Brazil, England and Spain averaged more. Defensively they weren’t quite as strong, giving up 1.17 expected goals per match. That was pretty average for the tournament, 15th give or take a rounding error. Put it all together and you get the sixth best xG difference in the tournament. It’s hard not to look at just those numbers and come to a very simple conclusion. If the team with the sixth best xG difference doesn’t make the last 16 teams there’s nothing much to be done. Bad luck happens, shrug your shoulders, go get 'em next time.
This, of course, would be an incredibly incomplete analysis. Let’s take a look at the games individually. Here’s the Mexico match which started it all off.
Two important things to note. When the game was even, Germany trailed from an xG perspective before they conceded and started trailing from an actual goal perspective. The bulk of their xG accumulated as they were chasing the match, ultimately unsuccessfully.
And, hey, what do you know, the same thing happened in their second match.
Germany were not only chasing the match, but chasing their survival in the tournament and ultimately they snuck ahead thanks to all that pressure, and an amazing set piece strike from Toni Kroos.
Match three is a bit of an eye of the beholder situation, and also demonstrates the limits of game state analysis in a tournament structure. Germany needed to win to advance, and they started the match accordingly. As the match progressed they were increasingly (correctly) playing tactically as if they were down a goal.
And, of course, they kept missing. It’s certainly reasonable to look at the entire match and understand that Germany were unlucky not to score, but also that if they had scored earlier, their xG total would likely have ended up lower as they switched modes to preserve their lead.
Imagine a world where Germany came out and scored against South Korea in the first minute. It’s obviously impossible to know exactly how the rest of the game would play out but, at a minimum, it’s hard to imagine that, protecting a lead with advancement on the line, Germany would have brought on two forwards, Mario Gomez and Thomas Mueller, for two midfielders in Sami Khedira and Leon Goretzka. Of course, the flip side is also true: if Germany weren’t ramping up the pressure and desperately chasing the game, they certainly wouldn’t have been so exposed, with Manuel Neuer in midfield leaving an empty net in the dying minutes of the match.
Small sample size isn’t just a problem when it comes to the difference between goals and xG, it’s also a problem when it comes to what xG says about teams themselves. It takes only a passing understanding of xG to look at Germany and correctly point out that they were unlucky not to score more. But, it’s also true that if they had not been unlucky, then on the attacking end their xG would have been more modest.
Three games is only three games, and it can only tell us so much about a team. In the same way that goals scored and conceded don’t tell us the whole story and there aren’t enough matches to even out the whims of the finishing gods, so too there aren’t enough matches to get past how the whims of the finishing gods can influence xG. We don’t know what Germany’s xG totals would have looked like if they hadn’t spent so much time chasing matches. But, what we do know is that in two games, against two teams they were supposed to be better than, they played roughly equally until falling behind.
Over the course of a season, of course, these things even themselves out. Sometimes a team will chase a handful of games in a row, and sometimes they’ll be chased. Sometimes they’ll get exceptionally hot or exceptionally cold for a match or two or six, and that will impact their xG totals, but eventually, the finishing evens out, and so does the xG. Not over three matches though, and especially not over three matches when a team spent one of them chasing the match from kick-off.
There are two separate ways that xG is predictive. We talk a lot about the first, that it's better at predicting future goals than just about anything else out there. But, there’s a second factor as well, and that’s that xG is very predictive of itself. Generally speaking, teams find their xG levels fairly quickly and those levels remain fairly stable going forward. That’s why the metric can underpin predictive models. But, even if xG is fairly reliable fairly quickly, it’s not settled after only three games. And even if it were, outliers remain, and part of the task of doing analysis involves looking at outliers and evaluating whether there’s any reason for their outlier status…like say spending an entire game running up the xG score specifically because there’s no actual scoring going on.
Ultimately, it’s difficult to say anything definitive about Germany in just three games. We can’t say for sure whether a more normal slate of games would have served to increase or decrease their xG difference, but it certainly could have changed it. Over three games, looking at xG without the context of what happened will often lead to getting the wrong impression. Statistical tools are great, but the smaller the sample size the more they need context. And the World Cup is an absolutely tiny sample size.
I’ve done a lot of work with analytics for wider audiences, and there are a number of things I’ve tried to keep in mind over the years. The first is that you can’t make people learn things they don’t want to learn about. There’s a fundamental difference in the incentive structures for people who work within the game, and for fans who consume it. People who work in the game are obligated to try and get things right, or they risk getting fired and replaced by people who do it better. Fans are under no such obligation and they like what they like, and cheer how they cheer, and get invested in whatever they get invested in.
There are just large numbers of fans who will be perfectly fine with analytics when it tells them what they want to hear, and not interested in hearing about it when it tells them something else. That’s ok. They also won’t particularly be interested in learning anything about it, so anytime they hear arguments they don’t like they’ll be left critiquing the entire endeavor, since they don’t speak the language fluently enough to actually engage on the specific point. There’s just nothing you can really do about that.
Instead, I’ve always chosen to focus on people who are interested in stats and numbers and football, but haven’t been exposed to it before. And there are more of those than you think. It’s a cliché but I really believe in trusting my audience. At mainstream outlets I don’t expect them to know a lot coming in, but I don’t think readers are dumb. Stuff like expected goals isn’t complicated, and I think that’s really the point of emphasis for me; a focus on how analytics is just a slightly more numeric way of talking about concepts everybody already understands. People that watch football get the tension between taking lots of shots and taking good shots. They get the difficulty of figuring out which midfielders really contribute to buildup play, and which ones add very little while also being defensive liabilities. Analytics operates in the same places that normal fan conversations do.
Typically, I think the best way to ease audiences in is to present the evidence analytics provides as an additional point in an ongoing dispute. Is a striker struggling to score goals? There are a million ways to approach the question of what’s up. If he gives you a quote that says “I’m getting in good spots, and the ball just isn’t going in. I just have to keep my confidence up and the goals will come,” well that’s an xG argument. If a manager talks about how it doesn’t matter that his team didn’t have very much of the ball and gave up lots of shots, because they were all bad shots and besides the counterattack was great, that’s an xG argument too, and a perfect moment to talk about how that stat can either support or refute what the manager is saying.
I’m always actively on the lookout for moments like that, places where the space between the work stats do and the way people inside the game are evaluating it is small. That’s a great place to start. Then I believe people who want to learn can pretty easily follow along from there.
Everybody who works in the field gets this question a lot. One good place to start is by narrowing down exactly what it is you mean by “analytics” because it’s a wide field. The work that’s done inside clubs is different than the work that’s done at consulting services, is different than the work that’s done in media. All of that is different than what being an interested and educated fan entails. There’s lots of overlap, of course, but the primary focuses of all those lanes diverge greatly. So, the first thing to do is try and figure out what being into analytics means to you.
I knew very early on that I wanted to be working on the media side. The through line of my professional career from poker to finance to football analytics has been explaining technical concepts to audiences that aren't versed in them. I’ve read baseball analytics writers for years, and had lots of experience watching how baseball, and then basketball, hockey and NFL coverage, could get better if people who understood statistical concepts were working to use them to explain the game to fans. I saw what heavily analytics-informed writing from people like Jonah Keri or Zach Lowe could do, and knew that’s what I aspired to do in football.
That informed the decisions I made. I focused on becoming a better writer, on reading everything I could and on making it a priority to write regularly even if only a few people were reading it. I learned a lot about how analytics worked, and how best to make that accessible to people less knowledgeable than myself. If I wanted to work inside the game all of that time would have been better spent learning to code. But I didn't. What I did was the right path for me, but I was able to take that path because I knew where I wanted to go. That's not to say you need to have your life figured out before trying to read an xG map, but just that if you're interested in being in analytics then what you should do is heavily predicated on where you want to end up.
And finally, I wasn’t kidding at all when I said I’d answer a relationship advice question. Thanks for being brave enough to submit one.
Let’s start with the positive. Not settling is good. Looking to be challenged in life and love and relationships is a healthy, positive way to go through the world. There’s nothing necessarily wrong with shooting your shot and seeking romance with people you perceive as unlikely to reciprocate (provided you are, you know, not an asshole when it doesn’t work out). That said, the way you’re approaching your process sets off a couple of alarm bells for me.
Maybe it’s just some cutesy phrasing, because this is a soccer analytics mailbag after all, but I can’t help wondering why you start off believing that the best, most interesting, fantastic women are ones that you will have no shot at. I think it’s worth doing a little introspection on exactly what it is you're valuing, and how you're evaluating it.
If I may stretch this horrible xG analogy just a bit (because again, relationship advice in a soccer analytics mailbag). The process of figuring out how to measure shots has been a long and complicated one, and we keep learning stuff along the way. Shots we used to think were terrible have often times turned out to be more valuable than originally expected, and vice versa. I’d be wary of the way you are deciding who has value, and who will be a good long term romantic partner beforehand, and then deciding to pursue them. Also, pursue is a crappy word. Ask them out. Or hang out. Or whatever. If they don’t want to, that's fine, cool, move along. Life is short and there are many people under the sun.
It strikes me that what you’re doing now is forming an expectation of what future romantic partners will be like, and then evaluating both how attractive they are to you on a range of axes and how likely they are to be interested in you. But you're doing all that before factoring in what they are actually like as a person in the world. This is a really good way to end up romantically involved with people you think you should like and not with people you actually like. It also seems like a good way to convince yourself that nobody you like could actually like you back. This is not true!
Dating is hard enough. Meeting awesome people who you like and who like you and who you want to keep doing awesome things with forever is a stressful process. Trying to skip the steps of getting to know partners as people by figuring out in advance which of them would be great forever and which of them don’t make the cut doesn’t make the process simpler. You can’t know ahead of time which cool people are going to be worth it for you to bother to make the effort to get to know better. Knowing people casually is different than dating them. Dating people casually is different than dating them seriously. Dating people seriously is different than marrying them and on down the line forever and ever.
It’s important to seek romantic partners who are actual people, not who are people you’ve crafted from your imagination ahead of time. There’s no way to build a reliable expected love interest model from afar. If you’re interested in somebody it’s great to ask them out. It’s not bad if they say no. Sometimes they'll say yes and then it turns out they weren’t actually as cool as you hoped and that's alright too. Life and romance is messy. Liking people is fun. Loving people is great. Sadly, as nice as it may be to fantasize about, you don’t get to decide who you’ll like and love ahead of time.
Embedded below is the final video of our presentations from the StatsBomb launch event. StatsBomb CTO Thom Lawrence discusses Actions Under Pressure. Thom looks at all the ways that pressure from defenders impacts the team with the ball, and what we can learn from recording those events. He examines how we can look for weak links, which players are able to beat pressure most effectively, and much much more.
Just How Unpredictable Is The English Premier League?
With just over a month remaining in the 2013/14 season there is still all to play for in the Premier League. The league title, European qualification and the relegation battle all look like going right down to the wire. Many commentators are calling this “the most unpredictable season ever” and we often hear the Premier League referred to as “the most unpredictable league in the World”.
Never being one to take a commentator’s word for something I wanted to discover if this is really the case.
Just how ‘unpredictable’ is the Premier League?
What do we even mean by ‘unpredictable’? Can we measure it?
Furthermore, is there an ideal level of ‘unpredictability’ or ‘competitiveness’ for a league?
How Can We Measure Unpredictability?
Fortunately there are companies for whom it is their job to accurately predict sporting events – bookmakers. The Football Data website records match statistics and pre-match bookmaker odds for thousands of football matches across Europe every season.
How Accurate Are Bookmaker Predictions?
The website Kaggle runs competitions for predictive modelling of many scenarios including sporting events. Recently they ran a competition to predict the outcomes of US College Basketball matches during March Madness. Kaggle evaluated entries using the Binomial Deviance method and I will use the same scoring system here. Hopefully this isn't as complicated as it sounds. 'Binomial' just describes the way matches are evaluated on a scale from 0 to 1 (1 for a home win, 0 for an away win) and 'deviance' just means we will measure by how much our predicted outcome deviates from the actual match outcome.
The difference between the forecast outcome and the actual outcome is measured in terms of the log-loss between the two. The smaller the log-loss the more accurate the predictions are considered to be. The idea here is that a very confident prediction that is incorrect is ‘punished’ more than a less confident pick would be. This is perhaps best shown with an example:
Example: Liverpool vs Tottenham Hotspur (30th March 2014)
Liverpool were strongly favoured to win this match. The average bookmaker odds were:
Home Win - 1.45
Draw - 4.65
Away Win - 6.76
Bookmakers’ odds imply the percentage chance that each game ends in a home win, draw or away win, so they can be easily converted to the 0 to 1 scale (a drawn match is scored as 0.50). The expected 'score' for this match from the bookmakers’ odds is therefore:
Expected 'match score': 0.757 [Please see comments section below for a full explanation of this calculation]
Liverpool did win as expected (actual 'match score' of 1.000) so the resultant log-loss was small: 0.278
If the match had been drawn ('match score' 0.500) the log-loss would have been larger: 0.847
If Spurs had pulled off a shock win ('match score' 0.000) the log-loss would have been very large: 1.416
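The worked example above can be reproduced in a few lines of Python. This is only a sketch, and it rests on two assumptions that are not spelled out in the text: the bookmaker's margin is removed by rescaling the inverse odds so they sum to one, and the log-loss uses the natural logarithm. The function names are illustrative.

```python
import math


def expected_match_score(home_odds, draw_odds, away_odds):
    """Convert decimal odds to an expected 'match score' on the 0-to-1 scale."""
    # Inverse decimal odds sum to more than 1 because of the bookmaker's
    # margin. Assumption: remove the margin by rescaling them to sum to 1.
    raw = [1 / home_odds, 1 / draw_odds, 1 / away_odds]
    total = sum(raw)
    p_home, p_draw, _p_away = (r / total for r in raw)
    # A home win scores 1, a draw 0.5 and an away win 0.
    return p_home + 0.5 * p_draw


def log_loss(predicted, actual):
    """Binomial deviance for one match: smaller means a better forecast."""
    return -(actual * math.log(predicted) + (1 - actual) * math.log(1 - predicted))


score = expected_match_score(1.45, 4.65, 6.76)
print(f"expected match score: {score:.3f}")
for label, actual in [("home win", 1.0), ("draw", 0.5), ("away win", 0.0)]:
    print(f"{label}: {log_loss(score, actual):.3f}")
```

Under those assumptions the output rounds to the figures quoted for Liverpool vs Tottenham: an expected score of 0.757 and log-losses of 0.278, 0.847 and 1.416. The asymmetry is the point of the scoring rule: the confident forecast costs little when it is right and a lot when it is wrong.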
How (In)Accurate Are Bookmaker Forecasts?
Now we have a method for evaluating predictions we can produce the following chart:
[All data correct up to and including 1st April 2014]
This chart shows the average per match log-loss of pre-match bookmaker odds for the last 5 seasons of the EPL (remember the smaller the number the more accurate the predictions). It actually seems that the ‘predictability’ of the Premier League has remained pretty consistent over this period.
If anything, this season has actually been the 2nd ‘easiest’ to predict in the last five years.
Further details are below:
2013/14 = 0.591 per match Biggest Upset: Man Utd 1-2 West Brom (1.724 log-loss)
2012/13 = 0.603 per match Biggest Upset: Chelsea 0-1 QPR (1.945)
2011/12 = 0.623 per match Biggest Upset: Man Utd 2-3 Blackburn (2.290)
2010/11 = 0.635 per match Biggest Upset: Arsenal 2-3 West Brom (1.948)
2009/10 = 0.583 per match Biggest Upset: Tottenham 0-1 Wolves (1.770)
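To make the ranking explicit, the five season averages listed above can be sorted from most to least predictable. This is just a small sketch over the numbers already given; the dictionary name is illustrative.

```python
# Average per-match log-loss by season, as listed above.
season_log_loss = {
    "2013/14": 0.591,
    "2012/13": 0.603,
    "2011/12": 0.623,
    "2010/11": 0.635,
    "2009/10": 0.583,
}

# Sort from most predictable (lowest log-loss) to least predictable.
ranked = sorted(season_log_loss, key=season_log_loss.get)

for rank, season in enumerate(ranked, start=1):
    print(f"{rank}. {season}: {season_log_loss[season]:.3f}")
```

Only 2009/10 has a lower average than 2013/14, which is where the "2nd easiest to predict" claim comes from.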
What is Happening Here?
Technically our scoring system is a measure of how 'inaccurate' the bookmaker predictions are. The smallest log-loss scores result from very confident predictions that prove to be correct (i.e. heavy favourites that go on to win their matches). Although the 13/14 title race remains unpredictable, in reality there have actually been very few genuine ‘upsets’ this season. The top teams have all been very consistent and have largely beaten the teams they are expected to. The biggest upsets have been Manchester United losing at home to West Brom (log-loss 1.724), Everton losing at home to Sunderland (1.588) and Chelsea losing away at Crystal Palace (1.525).
Towards the end of the recent Liverpool against Sunderland match the Sky Sports co-commentator Alan Smith described Sunderland’s pretty disappointing (and ultimately unsuccessful) second half comeback as something along the lines of “What makes this league so great”.
Is this really the ideal level of unpredictability for a league?
How Does The Premier League Compare To Other Leagues?
This table represents the same measure for the current 13/14 season for every league that is covered by Football Data (again, the smaller the number the more ‘predictable’ the league).
[All data correct up to and including 1st April 2014]
This table suggests that the Premier League is actually one of the more ‘predictable’ leagues around Europe. What might be causing this?
Is it possible that it is actually easier for bookmakers to set odds on some leagues than it is on others? It is certainly possible that there is some truth in this. Several of the leagues with the most accurate odds are also those that are the most covered in the media (EPL, Serie A, La Liga) and have the most information available. In contrast, I don't think there are too many odds compilers who specialise in the Scottish lower leagues. Does this mean we should all start betting on the Bundesliga Two? I won't be rushing to do so just yet. I think any differences here are still very small and that this method should rather be considered as an interesting way to highlight differences in the competitive shape and balance of competitions.
For many of the leagues studied there appears to be an inverse relationship between how predictable the matches are and how competitive the league is. For example the leagues with the lowest average log loss include the SPL and Scottish Division One where Celtic and Rangers have already clinched the respective titles with a month to spare. The most predictable league is the Greek Superleague which has been won by the same team for the last 4 seasons. This method is still the best we have for evaluating competition ‘predictability’.
If we consider this a useful measure of predictability then it is surely also a useful measure of the ‘competitiveness’ of a competition.
Why might the Premier League have a lower score than the Bundesliga? Although Bayern Munich has romped clear in Germany, below them the league has been very competitive. As mentioned, in the Premier League the top 4 teams have all been consistently excellent (the top 5 have only 4 home defeats between them all season). The title race remains open but it is widely accepted that it will probably be decided by the two games Liverpool play against Manchester City and Chelsea.
Does this mean commentators should be more careful what they describe as unpredictable? For the EPL it seems fair to say the title race is unpredictable but in general it is not actually one of the more unpredictable leagues.
Is the Premier League actually not competitive enough?
Is There An Ideal Level Of Predictability For A League?
The question of how competitive we might want the league to be is an important one and has implications for a wide range of decisions, in particular with regard to revenue distribution from the league's lucrative media contracts. Many of the leagues that we have seen to be the most ‘predictable’ are also those that have very uneven financial structures. In contrast, the major US sports leagues such as the NFL and MLB openly engineer greater competition through the use of salary caps and draft systems.
Yet is it really desirable to have a league where ‘anyone can beat anyone’? Does this mean every team is as good as each other? Or does this just mean every team is as bad as each other?
Before we get too excited and start speculating about revenue redistribution it is important to remember that the best Premier League clubs are also those that represent English football in UEFA competitions such as the Champions League. This is not a consideration for any of the major US sports as they do not have to compete with other leagues overseas. This season only 2 English teams have made the quarter finals and neither is a favourite to progress. Interestingly, the favourites to win the Champions League (Bayern Munich, Barcelona, Real Madrid, PSG) are all sides who compete in seemingly lop-sided domestic competitions (see above).
Is there an optimal balance to be sought between the competitiveness of a league competition and the opportunity it affords its best teams to build squads to rival the best in Europe?
Conclusions
I admit my premise was a little facetious – I do not actually think the EPL is too predictable and actually think this has been the most interesting Premier League season for a long time. I am sure plenty of football fans in other leagues are envious of such a close finish in prospect. Also, I noted that only two of our sides are in the quarter finals but Manchester City and Arsenal didn’t exactly disgrace themselves – coming up against the 2 best sides in Europe and some unfortunate refereeing decisions.
Yet I do think there are some important issues to look at in terms of what it actually means to have a competitive league. Should competitiveness be ‘engineered’? What if this is to be at the expense of the performance of our sides in Europe? If this season is representative of the future then I think the current balance between the league and European performance is about right but this doesn’t mean we should be complacent.
And it definitely doesn't mean the Premier League is ‘the most unpredictable league in the world’.
Apparently it’s “finishing skill” season. The debate happens every year, more or less, usually precipitated by an incredible run of goals by somebody or other.
This year, obviously it’s Luis Suarez who has spurred the discussion (including some particularly long and heated ones between me and @SimonGleave…sorry everybody).
In general the debate boils down to three specific questions: What is finishing skill? Does it exist at all? Even if it does exist, does it matter? So, let’s wade into the murk, shall we?
Defining the terms of this discussion is actually a pretty tricky enterprise. Arguments generally start over a player’s conversion percentage, or goals vs. expected goals ratio, and devolve fairly quickly, often times into people talking at cross purposes.
So, let’s look at two possible definitions on opposite sides of the spectrum. What if we defined “finishing ability” as the simplest, most basic moment of ball hitting foot and ball flying into the net (or into Row Z)?
If we define it that way, it means we are controlling for absolutely everything else on the football field. When we deal in expected goals, we’ve already started this process, since that involves controlling for shot location, shot type, probably pass type, and some other things depending on the model (an important note, the vast majority of a shot’s chances of going in the net are due to its distance from the net, everything else that we’re talking about here is much much less important when it comes to impacting a shot’s chance of succeeding. I’ll come back to this point later).
We can go beyond that though, at least in this theoretical world I’m operating in. We could control for dominant foot, we could control for what part of the foot the shot was taken with, we could control for whether the player was on or off balance, the speed with which the ball was moving when struck, the speed with which the player was moving when he struck the ball, etc. etc. You guys can all come up with your own examples. So, controlling for all of those other factors leaves us with a narrow definition of finishing skill. That’s fine, and given that definition it makes sense that perhaps there would be little to no difference between players, especially between players playing the forward position at an elite level. But now let’s go back to the initial question.
If a player is shooting well above, or below average levels after accounting for shot location can we simply say it’s down to natural variation because players all have the same finishing skill? There’s a little bit of a problem here because we have a whole set of skills which we both aren’t including in our definition of “finishing skills” but which also aren’t accounted for by any models which we’re using. So, by asserting that it’s simple variation because players all have the same finishing skill, we’re either asserting that those skills don’t impact shooting results at all, or we’re asserting that their impact on shooting outcomes is dwarfed by the random variation that arises from the very act of kicking. Is that true? I don’t know.
But I do know that the more narrowly we define finishing skill the harder the argument is to make that the fact that it’s constant (if it is) among players is the reason for variance in performance. I’m just simply not comfortable writing off a whole set of variables we know nothing about and assuming they are unimportant to outcomes without proof.
Okay, but what if we defined “finishing skill” more broadly. Instead of trying to zero down to that exact moment of foot to ball, what if we just loosely defined finishing skill as, “the ability to score more goals than an expected goal model would predict.”
Well for starters, thanks to @MCofA we do have some evidence that in very very large samples some players can outperform expected goals. So, it’s there, but it’s very very hard to do and equally hard to spot.
But, just like with the narrower definition, it’s worth saying explicitly what this means. What we’re saying is that, even taking into account all of the things that go into finishing (let’s call them sub-skills), players still more or less show no ability to improve. Why is that? Well, one possibility is that all of these sub-skills are as random as narrowly defined finishing skill.
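To make the loose definition concrete, here is a minimal sketch of "goals minus expected goals" for one player. The per-shot probabilities are invented, and the function name is hypothetical; a real expected goals model would supply the values.

```python
def goals_minus_expected(shots):
    """shots: list of (xg, scored) pairs.

    xg is a model's probability that the shot is scored (assumed given),
    scored is True if the shot went in.
    """
    goals = sum(1 for _, scored in shots if scored)
    expected = sum(xg for xg, _ in shots)
    return goals - expected

# Invented shots, for illustration only.
sample_shots = [(0.08, False), (0.35, True), (0.05, False), (0.12, True), (0.40, False)]
difference = goals_minus_expected(sample_shots)
print(round(difference, 2))
```

A positive value means the player scored more than the model predicted on these shots. With only a handful of shots, as here, that tells us very little about skill.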
Saying that though, is basically saying that every attacking player is exactly the same and differences are due solely to randomness.
Subjectively I’m not comfortable with the implications of that argument, since it would mean things like Arjen Robben’s left foot is as good as Cristiano Ronaldo’s or Zlatan Ibrahimovic and Olivier Giroud both hit the ball with the same velocity, and differences are solely down to randomness. Again, it’s a defensible position because we don’t have the data that conclusively proves these differences yet, but it’s not territory I’d be willing to stake out.
Rather, I’d say, these sub-skills vary greatly in terms of both a player’s ability to control them, and their ultimate importance in finishing. Some things will be well within a player’s control and some will largely vary due to luck. And then on top of that a player’s decision making layers in the frequency with which various skills come into play (how often does player X shoot with his weak foot, as opposed to how good is that weak foot when he shoots).
Put all that together and you have an equation with so many variables that it’s next to impossible to be good enough at enough stuff to actually move the finishing needle. This isn’t exactly news. It’s just another way of saying, finishing involves a diverse skill set, is really hard and has lots of luck involved. Pretty much the same as it ever was. All of which is fine and good, but leaves the third question unanswered.
Does any of this matter? At best, finishing skill when we define it loosely is really difficult to spot, and we can’t see it in players season to season, so it doesn’t really matter much for scouting, so why bother writing this many words about it at all? I mean who cares if Johnny Soccerboy only scores farpost curlers with his weaker foot at 4% and Clive Footielad does it at 14%. Remember, shot conversion overwhelmingly depends on location, this stuff is exceedingly small potatoes by comparison. It doesn’t impact things enough to make a difference.
Working with our two definitions of finishing skill we can look at it this way. Define it narrowly and we can ignore it: players may have uniform skill, but that skill is fairly minor when it comes to examining all the things that go into shooting (this is a point I’d certainly be willing to change my mind on if data proved otherwise). Or define it broadly, so that it encompasses lots of factors. In that case, when we define finishing skill as an amalgamation of all of these factors, we can see differences. Those differences, however, lie in how players finish, not in how often.
That’s important, because it allows us to understand that the way players shoot can differ, even if the end conversion percentage doesn’t much. It gives us a number of options for further examining football, shooting, and team construction. Perhaps the most severe limitation right now is that the differences in the level of finishing, even when accounting for location, are so small compared to the number of shots players take in a season that it is impossible to differentiate between a player who might finish at 13% (location adjusted) and one who finishes at 17%.
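The sample size problem can be sketched with some back-of-the-envelope arithmetic. This assumes each shot is an independent trial with a fixed conversion probability and uses a normal approximation to the binomial; the figure of 100 shots per season is a hypothetical round number, not a measured one.

```python
import math

def z_score_for_difference(rate_a, rate_b, shots_each):
    """How many standard errors apart two conversion rates are,
    if each player takes shots_each shots (normal approximation)."""
    var_a = rate_a * (1 - rate_a) / shots_each
    var_b = rate_b * (1 - rate_b) / shots_each
    return abs(rate_a - rate_b) / math.sqrt(var_a + var_b)

def shots_needed(rate_a, rate_b, z_target=1.96):
    """Shots per player needed before the gap reaches z_target standard errors."""
    spread = rate_a * (1 - rate_a) + rate_b * (1 - rate_b)
    return math.ceil(z_target ** 2 * spread / (rate_a - rate_b) ** 2)

# 13% vs 17% finishers, assuming 100 shots each in a season.
z_one_season = z_score_for_difference(0.13, 0.17, 100)
required = shots_needed(0.13, 0.17)
print(round(z_one_season, 2), required)
```

Under these assumptions a single season leaves the two players less than one standard error apart, and it would take roughly 600 shots each before the gap cleared the conventional 1.96 threshold, which is several seasons of shooting for most forwards.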
There’s no disagreement on which player you want on your team, just that it’s impossible to look through all the variance to definitively find those players. But, what if instead of looking at players we could look at various skill-sets and see if they reliably over large samples provided better adjusted finishing rates? A cross sport example to make my point: a defensive specialist in basketball is often times a detriment to have on the court, but a defensive specialist who can also shoot a three-point shot from the corner is worth his weight in gold. Skill combinations.
Examining which possible combinations are reliably above average and then either recruiting to them, or developing players in them is a possible way to envision teams getting beyond the sample size barrier. From an analytics standpoint not dismissing finishing skill is hugely important. That’s because the ways in which a player shoots, what he may or may not be good at, how he decides to balance his varying shooting options partially define the set of shots he takes.
And when we talk about defining the set of shots a player takes, we are now moving beyond the realm of goals, and to the realm of expected goals. It’s easy to see how on one side of the coin shot selection defines conversion percentages, and so counts as part of what we consider finishing skills, but it also plays a huge part in establishing expected goals. Obviously a shot not taken has no ExpG value, and equally obviously what shots a player takes are defined by a number of factors, all of which could also be filed under finishing skill. I vehemently believe that understanding the specifics of how and when players shoot is important.
Given how complicated a sport football is, and how rare goal scoring events are in general, we’d never know decisively just by looking at outputs if players managed to increase their conversion percentages, or even teams for that matter. That doesn’t mean that those margins aren’t important, and it certainly doesn’t mean that they don’t exist. It simply means that to find them we may need to start systematically looking at inputs instead.
And it’s those inputs which make up finishing skill. It seems like to insist on finishing being completely luck you have to take one of two stances.
Either you define it so narrowly that you leave yourself lots of work to do to prove that variance from finishing skill impacts statistics at all (as opposed to variance from other factors relating to shooting), or you define it more broadly and insist that differences between player shooting events, both how often they score and the specifics of the shots they take, are a result of pure variance: an insistence that both shot result and shot type are due to variance. There’s so much we don’t know about the sport. It seems a shame to dismiss a whole area of study, just because it isn’t clearly reflected in the data we currently track. Who knows what we might find.
In Part One Oliver Page looked at what statistical data is available to domestic clubs outside of the Premier League and how clubs might be able to use this to increase their efficiency in the transfer market. In Part Two he investigates further the transfer market (under?)performance of these leagues and whether a way forward can be identified.
Compare The Market: Is The Transfer Market Efficient?
How do domestic divisions perform in the transfer market in comparison to other leagues around Europe and the World?
In Part One I wrote about the value of using statistical player comparisons to make better informed transfer decisions. Similarly, I want to use league comparisons to look more closely at the apparent decline in transfer market performance of domestic divisions outside of the Premier League. Comparing anything across different leagues can obviously be problematic as inherent differences exist in the relative standard of those leagues. If player A has performed well in his league and player B has done the same in a different league can we really compare them? If league A generated £x million transfer revenue what does that mean? Is it just a reflection of the quality of that league? To attempt to address this I conducted an online survey asking users to rate the relative strengths of 25 different leagues from across the world. The methodology is inspired by this article by Jay Ulfelder which also explains the scoring system. A sample of the current scores (as of March 10, 2014) is as follows:
England - Premier League (95 out of 100)
Germany - Bundesliga (91)
Spain - Primera Division (87)
Italy - Serie A (84)
Netherlands - Eredivisie (70)
Russia - Premier League (61)
England - Championship (50)
Scotland - Premier League (25)
England - League One (23)
England - League Two (6)
These results are based on 1,251 votes so far and the full results can be seen here. Obviously league standards can fluctuate over time (e.g. Glasgow Rangers' demotion has weakened the SPL) but to my eye the ratings appear reasonable and are considered a useful tool for comparison. I took a selection of these world leagues and plotted their 'ratings score' against their respective transfer revenues for the last 8 seasons. Please note from here on I am combining pairs of seasons (e.g. 2006/07 and 2007/08 combined) as otherwise a single transfer in one season can sometimes distort results. Of particular interest was the comparison between the 06/07 & 07/08 period and the 12/13 & 13/14 period which can be seen below along with the full table of results. [see notes at end of article for further details of methodology]
From 06/07 to 07/08 the English Championship generated more transfer revenue than the German Bundesliga.
For the period 06/07 to 07/08 both the Championship and SPL were generating considerably more revenue from player sales than many leagues of a similar, and even higher, rating. Since then however, they have both been overtaken by the 'better leagues' and caught up by many of the 'worse' leagues. There may be lots of different factors at play here (e.g. the most recent Russian Premier League revenue is skewed by the collapse of Anzhi Makhachkala) but the most recent chart does show evidence of a growing relationship between transfer revenue and rating score.
Worryingly for these leagues, the data suggests that the Championship and Scottish Premier League were actually OVERPERFORMING in the transfer market in previous years. Have they now just found their 'true level'?
It is also interesting to note, however, that a number of these leagues that have shown an increase in revenue are also those that have built strong relationships with data providers. In Part One we saw how the level of detail with which data companies such as Opta, Wyscout and Prozone cover competitions can vary from league to league. For example, the Bundesliga, Eredivisie and Russian Premier League have all had the full-detail level of Opta data available for at least 4 full seasons now. For the Championship this data only became available during 2013/14 and for the SPL, League One and League Two it remains unavailable.
Obviously we should be careful not to draw sweeping conclusions – correlation does not imply causation – but it is difficult not to be intrigued by the possible existence of this additional relationship.
Where Do Championship Clubs Sell Players To?
The data we have seen so far only shows total transfer revenue and a league could generate revenue just from buying and selling in-division and between its own members. Focusing now on the English Championship, where do its clubs sell their players to the most? In particular, what changed between the 06/07 to 07/08 period and the 12/13 to 13/14 period?
[For an explanation of Superior 7 and Threatened 13 see this article by Infostrada Sports' Head of Analysis Simon Gleave]
Firstly, Championship clubs appear to have next to no market for their players outside of the top two English divisions. The majority of transfer revenue has always been generated by sales to either teams in the Threatened 13 or the Championship. Interestingly, a similar pattern exists in terms of where Championship clubs buy players from too. For example, for the period 06/07 and 07/08, Championship transfer expenditure was £203.8M. £76.5M of this went on players from the Championship and £45.1M went on players from the Threatened 13.
Historically the Championship and Threatened 13 clubs have been locked in a cycle of selling and buying the same players to and from each other.
All three of the Championship's main ‘customers’ are declining however. For example, sales within division are down from £89.7M to £35.1M and sales to Threatened 13 clubs are down from £95.4M to £58.3M.
Is this due to lack of data to evaluate Championship players? Is the data available but just not being used for recruitment purposes?
Or, is it the more worrying scenario that the data is available, it is being used for recruitment, and top clubs are just choosing to eschew an overpriced and overrated market?
Where Do Premier League Clubs Now Buy Players From?
Championship transfer revenue is down but the Premier League recently signed another record broadcasting rights deal and is continuing to spend as much as ever. Where is this Premier League money now going? Again, I will focus here on the changes between 2006-08 and 2012-14. As we have already seen, the historically inefficient domestic ‘loop market’ between the Threatened 13 and the Championship has been greatly reduced in value. The leagues that are the greatest beneficiaries of this include Spain, France, Netherlands and Italy.
This cannot just be dismissed as the inevitable consequence of Bosman – this ruling celebrates its 19th birthday this year. In 2012-14 Premier League clubs actually signed players from FEWER different overseas leagues than in 2006-08.
It appears that there could be a trend towards Premier League clubs concentrating recruitment on certain specific leagues around Europe and the World. Several factors could be causing this. Firstly, the leagues which have seen the largest increases are also those grouped in or around the top of the world league ranking seen earlier. Empowered by the new television deal, even Premier League clubs outside the Superior 7 can now shop for players in these leagues. For example, a club like Southampton can now buy players from a club like AS Roma. Secondly, the most successful international side of recent years is Spain, and the most revered club side in world football is FC Barcelona. The unique style of football with which these teams have achieved their success has inevitably led in part to some Premier League clubs trying to replicate this style and to increase their signings from the Spanish domestic leagues.
But it is also interesting to note again that the leagues which have seen the largest increases are also those who have been amongst the first to adopt detailed statistical coverage.
Are we witnessing a more data-driven approach to recruitment making the transfer market more efficient?
Championship clubs currently have next to no market for their “goods” outside of the UK and the Superior 7 are increasingly willing and able to find a more efficient market overseas. There is also evidence of this trend making its way down the ladder to Threatened 13 and Championship clubs. For example Newcastle now makes most of its signings from French Ligue One.
Are clubs simply concluding that domestic leagues offer poor value? That is, a high cost player of a quality that, even if you can measure and benchmark it, is inferior?
If the data shortage is a concern for young footballers’ attempts to get scouted, it is of even greater concern to the football clubs who have previously relied on revenue from an inefficient transfer market to survive.
Unknown Unknowns
"Analytics isn't about making your decisions 100% correct, but about moving from 48% to 52%"
(Paraag Marathe, President San Francisco 49ers)
The above quote is my favourite from the recent Sloan Sports Analytics Conference. Without wishing to go over all of the recent pro- versus anti- statistics in sports arguments I think it is worth remembering that nobody is saying that statistical analysis is the be-all and end-all and the answer to all clubs' problems. What I believe it can offer is a way to add context to decision making that would otherwise be made on the basis of such things as instinct or experience. Perhaps it can tell you how a midfielder's attributes compare to similar players elsewhere around Europe? Or perhaps it can provide you with an objective way to draw up a short-list of young talents outside of your own division.
My background is in sports betting where everyone understands that a shift from 48% to 52% could be the difference between winning and losing in the long run. Unfortunately such long term and probabilistic thinking is rarely a luxury afforded to football clubs. Football, and indeed sport in general, is a game of opinions and almost everyone has one. Go to a stadium, watch a match in the pub or follow the game on Twitter and almost everyone has an opinion and everyone is an expert.
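The betting arithmetic behind the 48% to 52% remark can be sketched as follows. This assumes even-money bets (decimal odds of 2.0) with a one-unit stake each time; real markets carry a bookmaker margin, so the odds and function names here are illustrative assumptions only.

```python
def expected_profit_per_unit(win_probability, decimal_odds=2.0):
    """Expected profit on a one-unit stake: the expected payout minus the stake.

    decimal_odds=2.0 is an assumed even-money price.
    """
    return win_probability * decimal_odds - 1.0

def expected_profit_over_bets(win_probability, number_of_bets, decimal_odds=2.0):
    """Expected total profit across many identical one-unit bets."""
    return number_of_bets * expected_profit_per_unit(win_probability, decimal_odds)

losing_edge = expected_profit_over_bets(0.48, 1000)
winning_edge = expected_profit_over_bets(0.52, 1000)
print(round(losing_edge, 1), round(winning_edge, 1))
```

At these assumed odds, a bettor who is right 48% of the time expects to lose about 40 units over 1,000 bets, while one who is right 52% of the time expects to win about 40. The four-point shift flips the sign of the long-run result.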
I do not know if this is a trait unique to sports but it isn't often you hear someone admit 'You know what I am not sure about that' or 'I haven't really seen that player play much actually'. When looking at the results for my on-line quiz it was noticeable how few responses were given as 'I don't know' or 'I don't know enough about that to vote'. It often seems people within sport are afraid to admit they don't know something. So here goes...
I have basically watched sport for a living for the past 8 years but will happily admit there is a LOT that I still do not know about it.
There, I said it. One way I like to help to get more information to help me make decisions and form opinions is to use statistics.
"Sports analytics doesn't take the fun out of sports, it mostly takes the dumb out of sports"
(paraphrasing Edward Tufte, Sloan Sports Analytics Conference 2014)
My version of the above quote would be something like 'it mostly takes the bravado out of sports'.
What Is The Way Forward?
As we saw in Part One, we may not know for sure exactly what the situation is ‘on the ground’ - clubs and data companies are secretive - but we do have increasing evidence of a trend towards analytical recruitment in football. Data analytics is not 'taking over' but it is an invaluable tool for assisting in decision making processes. The top clubs are doing it and before long everyone else will follow.
It is no longer a choice of whether or not to embrace statistical analytics but WHEN and HOW.
Teams outside of the Superior 7 need to recognise that they operate in a world market now and can no longer rely on the domestic market for transfer revenue. They will need to become more analytically ‘savvy’ and use every new technique at their disposal to compete in this increasingly competitive market.
But who will pay for it?
It is understandably difficult to know details of the funding for data collection and analysis. We do not really know who the largest clients of data companies are (professional clubs? the media? bookmakers?) or how much it costs to provide and get access to all the most detailed data but it is something not every club can afford. Why is it so expensive?
To listen to some speak they would have you believe the data companies are the evil gatekeepers holding all the data for themselves in their ivory towers and charging a king's ransom to anyone and everyone for the privilege to use it.
Yet I have seen first-hand the intensive process Opta undertakes to fully code just a single game - I am sure similar processes exist at Prozone - and obviously data companies cannot provide this service for free. My understanding of the current system is that clubs are responsible for their own relationships with data providers – they are individual clients and have to pay for the breadth and depth of service that meets their own needs. This ad-hoc system is in contrast to how much of the same information is provided in the major American sports. For example the NBA recently agreed a deal to install optical tracking cameras at every team and also to make the data available to the public. In soccer, the MLS has a league-wide relationship with Opta which has been considered a great success both on and off the pitch.
A hot topic in UK football at present is the perceived poor performance of our national teams and the relatively limited opportunities given to young British players at Premier League clubs. The Football Association are currently commissioning their own investigation into this and only this week the Times newspaper is running a series entitled The Good of the Game.
What would happen if a governing body such as the FA, SFA or Football League decided to invest in data analytics for the benefit of every club?
I do not know what the cost of this might be - this could be an impractical non-starter - but investment does not have to be purely financial. If clubs’ own analysts do not have the time or skills to deal with the newly available ‘big data’ then could this work be centralised and centrally funded? I am sure there are lots of people with the necessary skills out there who are only too willing to help as this article makes clear. If statistics and video coverage make their way to all of the domestic leagues will we necessarily see a recovery in transfer revenue in those leagues? We do not know. Will it just confirm the suspicion that clubs have been overpaying for players in these divisions for years? Possibly.
Only time will tell but if you are a club and you don’t adapt to these new market conditions your future could be difficult.
Or if not, and you are a young player at one of those clubs, it might be time to check you have a valid passport.
NOTES
[Note 1: TransferMarkt historic values for transfer fees are inflated to reflect current market prices. At present I have not received a response from them to confirm the exact method for doing this. It is assumed that this is consistent across world leagues]
[Note 2: To account for the demotion of Glasgow Rangers in 2012 the combined revenue from all Scottish divisions is included throughout]
[Note 3: Sales by promoted and relegated clubs are counted for the division they were playing in the previous season. E.g. Wigan sold James McCarthy to Everton when they were officially a Championship club (summer 2013) but because Wigan was relegated the previous season this is counted as a Premier League to Premier League transfer (i.e. it is assumed Everton made the signing on the basis of his performances in the Premier League the previous season). At the other end, Dwight Gayle’s transfer from Peterborough to Crystal Palace is considered Championship to Premier League despite Peterborough’s relegation to League One.]