Picking the optimal Colombian XI for the World Cup
This article is part of the Goalimpact World Cup series. The Colombian XI was picked by Bobby Gardiner. Bobby regularly writes about a variety of football topics on his own blog and for other outlets. To read more from him, follow him on Twitter at @BobbyGardiner or have a look at www.falseix.com. The eminently skippable subjective introduction to player ratings and Goalimpact is by Marek Kwiatkowski (@statlurker).
Player analysis is a big deal given the sums spent in transfer fees and player wages. Accurate measurement of players' actions is now possible in a number of areas thanks to the detailed data collected by companies like Prozone, Opta and Infostrada. But individual output is at best a proxy for performance, and the same has to be true of any player rating built on top of individual action counts. Often there will be players with excellent output who are clearly less valuable than some of their peers with lower output. You could call this effect the Podolski Paradox, only it is anything but a paradox: it is a logical, mundane consequence of the fact that at the low level, football is not bean counting but a complex, non-linear and, above all, densely interlinked dynamical system, where individual events take place in a rich context.
Consider for a moment the steps necessary to turn actions into a rating: a single-number performance score for individual players. First, you need to select the relevant actions, and these will be different for different player positions. Incidentally, "position" is not a very well defined concept at all, but we plough on. Now you need to weigh the actions: how many tackles is a key pass worth for a full-back? How about for a forward? Hmmmm. But let's go on; say we scored all actions separately. Now we need to normalise these scores for various factors such as time spent on the pitch (easy-peasy), quality of opposition (feasible) and opportunity to perform every kind of action (errrrrr). But say we've managed to do that, so now it's time to combine the scores into a single rating. Is it a straight sum, or at least a linear combination? No it isn't, and so on. The multi-dimensionality and complexity of the game bites you in the ass as soon as you begin and doesn't give up until you do.
In other words, what we need before we can build a robust player rating system based on individual actions is nothing less than a complete theory of football: a set of axioms that would allow us to put a precise value on every action in every context. We often -- always -- proceed as if such a theory were not a prerequisite for bottom-up player evaluation, and as long as we do ad-hoc comparisons and the signal is strong we can get away with it, too; whatever that elusive ultimate theory is, it contains rules like "more goals is better" and "give-aways are bad". But a comprehensive player rating system developed on such a flimsy basis is guaranteed to be bad -- witness the WhoScored ratings, whose chuckle:insight ratio is somewhere in the high single digits. Lastly, even if we ever arrived at an apparently robust action-based rating and tied it to real-world outcomes such as wages, chances are it would be gamed by players before you could say "Hang on, Mr Mendes".
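To make the bottom-up recipe concrete, here is a minimal Python sketch of the naive version: pick actions per position, weigh them, normalise by minutes played and add everything up. The weights, the position names and the `action_rating` helper are all invented for illustration, and the sketch deliberately ignores opposition quality and opportunity, which is exactly where the approach runs into trouble.

```python
# Hypothetical per-position weights (invented numbers, not from any real
# rating system). An action missing from a position's table is treated as
# "not relevant" for that position and contributes nothing.
WEIGHTS = {
    "full_back": {"tackles": 1.0, "key_passes": 1.5, "goals": 5.0},
    "forward": {"tackles": 0.3, "key_passes": 2.0, "goals": 6.0},
}


def action_rating(position, counts, minutes):
    """Naive linear rating: weighted sum of action counts, scaled to 90 minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    weights = WEIGHTS[position]
    raw = sum(weights.get(action, 0.0) * n for action, n in counts.items())
    return raw * 90.0 / minutes


# The same output is "worth" different amounts depending on the position label.
counts = {"tackles": 4, "key_passes": 2, "goals": 0}
print(round(action_rating("full_back", counts, 180), 2))  # 3.5
print(round(action_rating("forward", counts, 180), 2))    # 2.6
```

Every number in the weights table is a judgment call, and the straight sum at the end is the linear combination the text warns against, so the sketch is best read as a statement of the problem rather than a solution.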
But once we free ourselves from thinking in terms of individual actions, a new perspective opens. Football is a team sport, as the cliché goes, so how about giving equal credit for every goal scored (and equal blame for every goal conceded) to all players on the team, regardless of who scored it, who assisted it, who intercepted the ball for the move and who made the decoy run? This simple, elegant and fair approach is the basis of top-down player ratings, chief among them the Goalimpact, developed by Jörg Seidel. Goalimpact has its weaknesses, but in my opinion is still miles ahead of any other systematic, public player rating scheme. To give you a flavour of Goalimpact, the most recent update (February 2014) has Ronaldo, Lahm, Fabregas, Schweinsteiger and Messi, in that order, as the five best players in the world. I wouldn't be a lapsing academic if I didn't use this list as a starting point for a quick overview of the strengths and weaknesses of the model.
Ronaldo makes perfect sense and provides basic validation of the model. Lahm is a fantastic pick, highlighting the core strength of top-down ratings: independence from individual output, the scoring of which is as a rule more difficult (or at least less settled) for defenders. By way of strengthening this point: Lahm is not in the top 50 players according to WhoScored, who instead have Wolfsburg's attacking left-back Ricardo Rodriguez as the 5th best player in the world (good player; no further comment). Messi in 5th seems low, but perhaps it's a sign that his otherworldly performances over the years should also be credited to his excellent Barcelona teammates? Schweinsteiger on the other hand looks too high, but maybe he is responsible for Bayern's runaway success -- or maybe his position is a signal of an unidentified (by me, as yet) bias in the GI formula. That leaves Fabregas, who can be a poster boy for Goalimpact's major weakness: a standout player who remains at a level lower than that to which he belongs will be overvalued. This refers to Fabregas' time at Arsenal, and we also found a similar issue with the Colombians: the Goalimpact of those who made it to the big European leagues is often lower than that of the best players in the Colombian league.
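The equal-credit idea behind top-down ratings can be sketched in a few lines of Python. This is a plain plus-minus per 90 minutes and makes no attempt to reproduce the Goalimpact formula, which is not given in this article; the `plus_minus_per_90` helper, the stint format and the player labels are assumptions made for the example.

```python
from collections import defaultdict


def plus_minus_per_90(stints):
    """Share every goal equally among the players on the pitch.

    `stints` is an iterable of (players, minutes, goals_for, goals_against),
    one entry per spell in which the same set of players was on the pitch.
    Returns each player's goal difference per 90 minutes played.
    """
    minutes = defaultdict(float)
    goal_diff = defaultdict(float)
    for players, mins, goals_for, goals_against in stints:
        for player in players:
            minutes[player] += mins
            goal_diff[player] += goals_for - goals_against
    return {p: goal_diff[p] * 90.0 / minutes[p] for p in minutes if minutes[p] > 0}


# Invented data: A plays the whole match, B is replaced by C after an hour.
stints = [
    (("A", "B"), 60, 2, 0),
    (("A", "C"), 30, 0, 1),
]
print(plus_minus_per_90(stints))  # {'A': 1.0, 'B': 3.0, 'C': -3.0}
```

Nobody's individual actions appear anywhere in the calculation, which is the point. Equally, a raw plus-minus like this knows nothing about the strength of the opposition or the league, so a standout player at a lower level would score very well on it.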
Over to Bobby.
* * *
Thanks to Jörg Seidel of GoalImpact for the data. If you’re wondering what GoalImpact is/entails, check it out here - http://www.goalimpact.com/p/blog-page.html (or, indeed, scroll up --ed).
After a sixteen-year wait, Colombia are returning to the World Cup Finals. Inspired by a ‘golden generation’, Los Cafeteros have risen to 5th in FIFA’s World Rankings and given their notoriously passionate fan-base cause for genuine hope. Their first obstacle is escaping unscathed from what is possibly the most even of Rio’s groups – Greece, Côte d’Ivoire and Japan join them in C.
The XI crafted from GoalImpact scores alone:
This particular set up is quite far off a likely XI, probably due to a combination of old defenders (Perea, Yepes and Mosquera are all in their 30s and in Pekerman’s squad) and a lot of players in the Colombian League. That isn’t to criticise GoalImpact as a measurement, though, as we all know that players aren’t always picked or not picked based on ability and/or output alone (ask Samir Nasri or Carlos Tevez). Context is needed, and so I’ll somewhat systematically rifle through each area of the team:
In between the posts, David Ospina of Nice is likely to start. His GI of 92.6 is quite noticeably lower than the 123.67 of Faryd Mondragon, but he is 17 years younger and fast establishing himself as Colombia’s first choice keeper. They’ll be joined in the squad by the uncapped Camilo Vargas (102.78).
Oscar Murillo (110.46) and Alejandro Bernal (110.78), both of Atletico Nacional in Colombia, failed to make Pekerman’s initial 30-man squad and so are out of contention.
Pablo Armero (116.39), recently on loan at West Ham, is likely to start at LB while PSV’s Santiago Arias (94.5) should take up the other full back spot. The young right back may have been capped just 4 times by his country but he is only 22 and has been linked with the likes of Manchester United recently.
I’m not sure what the oldest centre back pairing at a WC is, but if Luis Perea (93.92) and Mario Yepes (40.97) start together as they did in Colombia’s most recent friendly against Tunisia, their combined 72 years may just break that record. Although their GIs (especially Yepes’) are very low, their peak GoalImpact scores are 123.67 and 138.33 respectively and so an experience vs current output trade-off may be Pekerman’s thinking here. I would personally start AC Milan’s Cristian Zapata (102.17) over Yepes. At 27, he’s not far off his peak but still possesses the necessary experience for the occasion.
The general rule of thumb with the Colombian team is that the further you push into the attack, the higher the quality of the players. Sadly, Diego Arias (110.86) will not play a part in Rio - like his aforementioned Atletico Nacional teammates, he failed to make the provisional squad.
If you didn’t know who Fredy Guarin (115.04) is, you probably did come January after a frankly confusing transfer fiasco with his club Internazionale and a whole host of English and Italian clubs. In the end the talented all-rounder stayed and he is extremely likely to start in Rio. One of Abel Aguilar of Toulouse (94.08), Elche’s Carlos Sanchez (92.35) or Monarcas’ Aldo Ramirez (93.6) will probably start next to him at centre-mid. I’d go with Aguilar myself. In terms of GI, there’s almost no difference between the three, but the first two are a tad more defensively minded than Ramirez while Aguilar edges it over Sanchez because of his ability to (albeit occasionally) score.
James (pronounced Ha-Mez) Rodriguez (122.87) is one of my favourite players in the world. At 22, he is quickly becoming the perfect combination of dangerous pace and brilliant creativity, managing an extremely healthy NPGA90 of 0.65 this season at Monaco. A lot of attention directed towards the Colombian team will focus on his Monaco teammate, but keep an eye on the man likely to start as a winger but equally adept in a 10 role. On the opposite flank, Juan Cuadrado (106.87) is my choice. An extremely important part of Fiorentina’s attack this season, the skilful winger has improved tremendously in terms of efficiency with an NPGA90 of 0.52. If Pekerman wants a more central attacking midfielder, Macnelly Torres (116.13) is almost a certainty for the squad. Although now plying his trade at Al-Shabab in Saudi Arabia, the quirkily named creator is renowned and feared in South America for his trickery.
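For anyone unfamiliar with the stat, NPGA90 is used here on the assumption that it means non-penalty goals plus assists per 90 minutes played. A minimal Python sketch of that calculation follows; the `npga90` helper and the input numbers are made up for the example and are not any player's real figures.

```python
def npga90(goals, penalty_goals, assists, minutes):
    """Non-penalty goals plus assists, scaled to 90 minutes (assumed definition)."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (goals - penalty_goals + assists) * 90.0 / minutes


# Invented season line: 8 goals (1 of them a penalty) and 3 assists in 1800 minutes.
print(npga90(goals=8, penalty_goals=1, assists=3, minutes=1800))  # 0.5
```

Stripping out penalties and scaling by minutes is what makes the number comparable between a regular starter and someone who mostly comes off the bench.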
To be honest, the shape of Colombia’s attack is entirely dependent upon the fitness of one man – Radamel Falcao, whose GoalImpact score is the highest in the squad at 131.46. Any ‘golden generation’ comments or ‘dark horse’ bets are almost entirely focused on him. Seen by many as one of the best strikers in the world, his race to fitness has encouraged Pekerman enough to include him in the provisional squad and we all hope that he makes it.
The thing is…it might not be THAT big of a deal if he doesn’t. Before I’m thrown into some kind of metaphorical football taboo dungeon, let’s have a look at Colombia’s other striker options. Jackson Martinez of Porto (119.5), soon-to-be Dortmund’s Adrian Ramos (109.4), Sevilla’s Carlos Bacca (112.46), Luis Muriel of Udinese (105.08) and Cagliari’s Victor Ibarbo (99.52) have all made the provisional squad. That’s an incredible number of good-quality strikers, and Ramos and Bacca especially are coming off the back of very strong seasons with their club sides. As for which of those starts if Falcao is out, I’d go with Martinez and/or Ramos depending on the formation. Both are adept target men, but deceptively good with their feet and I think they’d provide the best outlet to James.
My XI, combining context and GI scores:
Formation-wise, it’s quite difficult to predict what Pekerman will do. A 4222 type set up helped him through the qualifiers but he was equally reliant on a more defensive 4231 away from home. Against the offensive prowess of Côte d’Ivoire and Japan, the latter might be the better option.
Muriel has been regularly used by club and country as a winger and putting him there allows James to unleash his creativity more centrally. Obviously, if Falcao is fit, he starts either over Martinez or alongside him (or any other one of their billion strikers) in a 4222. The average GoalImpact score of this particular team is 107.60 which is pretty low, but a lot of these players are either young and having their first few good seasons (Muriel, James, Cuadrado) or old and likely to bow out after the World Cup.
‘16’ is an important number for Los Cafeteros in another sense – the furthest they’ve been in a World Cup was the round of 16 in 1990. Maybe, with or without Falcao, Pekerman’s men will be able to better that this summer.