Following on from Daniel Altman’s excellent piece on the scoring rate of substitutes I thought I would undertake my own analysis on the impact of substitutes.
The methodology I will use is slightly different to that employed by Altman in his article. I will use the Big 5 European leagues for last season (2012/13), and I will study the goal scoring rates for all players that scored at least 6 league goals last season.
The use of this filter gives me a list of 268 players that scored a combined total of 2,782 goals in 617,331 minutes of playing time. This equates to an average scoring rate of 0.41 goals Per 90 minutes for our sample of players.
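For anyone wanting to check the arithmetic, the Per90 figure is simply goals divided by minutes, scaled up to 90 minutes. A minimal sketch (the function name `goals_per90` is mine, chosen purely for illustration):

```python
def goals_per90(goals: int, minutes: int) -> float:
    """Goals scored per 90 minutes of playing time."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return goals * 90 / minutes


# Totals quoted above for the 268-player sample
sample_rate = goals_per90(2782, 617331)
print(round(sample_rate, 2))  # 0.41
```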
At this stage it is a well-documented fact that more goals are scored in the second half of games than the first half, and the apportionment in the Big 5 leagues last season was no different with just 44% of all goals scored in the first half and 56% in the second half.
The following is the distribution of goals in 5 minute time intervals for the 5 leagues last season:
We can see that, generally, the goal scoring rates increase in line with the time elapsed during the match. For my purposes, the minutiae of the goal scoring rates aren’t important; we just need confirmation that this trend does exist in my data sample.
In his piece, Daniel Altman found that forwards coming on as substitutes scored at a higher rate than starting forwards. But when we consider that more goals are scored in the second half than the first half, this is no great surprise. Substitutes spend a greater proportion of their playing time in the second half (when goal expectation is higher) than starting players do.
So what do we take from this?
The fact that substitutes have a higher scoring rate means that you can’t directly compare Goals Per90 figures between players that regularly start and those who make frequent substitute appearances. Very simply, the substitute will have his numbers inflated and we would expect his Per90 numbers to drop in the event that he was handed a starting position.
However, Altman didn’t stop there and he found that “fatigue among forwards was a more powerful force than fatigue among defenders”. That sentence struck a chord with me and I wanted to investigate the general phenomenon of fatigue in footballers a little further.
Hierarchy of Goals Per90
We have established that the longer a match goes on the greater the goal expectation. This is one of the reasons why substitutes score at a higher rate than starting players. So, by extension of this logic we would therefore expect players who are substituted to score less Per90 than players who played the full 90 minutes.
Not only would the substituted player be swimming against the tide of playing at least as many first half minutes as second half minutes when the goal expectation is at its lowest, but the fact that he is substituted may also indicate that he hasn’t played a great game thus far.
That second suggestion certainly won’t be true all the time. The player may be injured, withdrawn for tactical reasons or just tired but it seems reasonable to assume that some of this cohort will have irked the manager enough with their performance to be substituted.
Even ignoring the suggestion that the substituted player has been having a less than stellar performance, due to the increasing goal expectation it is reasonable to assume that the hierarchy of Per90 goal scoring rates would rank as follows:
- Substitutes coming on
- Full 90 minutes Players
- Players substituted off
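The expected ordering can be illustrated with some rough arithmetic. The sketch below assumes goals are spread evenly within each half (44% in the first half and 56% in the second, as quoted earlier), ignores injury time, and ignores fatigue entirely. The 60th-minute substitution and the name `time_multiplier` are assumptions made purely for illustration.

```python
FIRST_HALF_SHARE = 0.44   # share of goals scored in the first half (from the text)
SECOND_HALF_SHARE = 0.56  # share of goals scored in the second half (from the text)


def time_multiplier(start: float, end: float) -> float:
    """Average goal expectation over [start, end) relative to a full match.

    Assumes goals are spread evenly within each half and ignores injury time.
    """
    if not 0 <= start < end <= 90:
        raise ValueError("need 0 <= start < end <= 90")
    first_half_minutes = max(0.0, min(end, 45) - start)
    second_half_minutes = max(0.0, end - max(start, 45))
    # Relative intensity of a half = its share of goals / its share of time (0.5)
    weighted = (first_half_minutes * FIRST_HALF_SHARE / 0.5
                + second_half_minutes * SECOND_HALF_SHARE / 0.5)
    return weighted / (end - start)


print(round(time_multiplier(60, 90), 2))  # substitute on at 60 minutes: 1.12
print(round(time_multiplier(0, 90), 2))   # full 90 minutes: 1.0
print(round(time_multiplier(0, 60), 2))   # substituted off at 60 minutes: 0.94
```

Under these assumptions, time on the pitch alone would put substitutes coming on at the top and substituted players at the bottom.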
Now that we've finalised our hypothesis, how does it compare with what actually happened last season?
Each game that our 268 players took part in last season was divided into the 3 categories: Substitutes_On, Full 90 and Substitutes_Off and I totaled the number of goals and minutes that the group of 268 players as a whole racked up in each category.
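The grouping described above could be sketched as follows. The `Appearance` record and its field names are hypothetical, and the sample rows are made up purely to show the mechanics; they are not taken from my data set. A substitute who comes on and is later withdrawn is counted as Substitutes_On here, which is an assumption on my part.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Appearance:
    started: bool     # True if the player was in the starting 11
    subbed_off: bool  # True if the player was withdrawn before full time
    minutes: int      # minutes played in this game
    goals: int        # goals scored in this game


def category(app: Appearance) -> str:
    """Assign an appearance to one of the three groups."""
    if not app.started:
        return "Substitutes_On"
    return "Substitutes_Off" if app.subbed_off else "Full 90"


def per90_by_category(appearances):
    """Total goals and minutes per group, then convert to Goals Per90."""
    totals = defaultdict(lambda: [0, 0])  # category -> [goals, minutes]
    for app in appearances:
        bucket = totals[category(app)]
        bucket[0] += app.goals
        bucket[1] += app.minutes
    return {cat: goals * 90 / minutes
            for cat, (goals, minutes) in totals.items() if minutes > 0}


# Made-up appearances, for illustration only
sample = [
    Appearance(started=True, subbed_off=False, minutes=90, goals=1),
    Appearance(started=True, subbed_off=False, minutes=90, goals=0),
    Appearance(started=True, subbed_off=True, minutes=60, goals=1),
    Appearance(started=False, subbed_off=False, minutes=30, goals=1),
]
print(per90_by_category(sample))
```

Note that goals and minutes are pooled across the whole group before dividing, rather than averaging each player's individual Per90 figure.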
Big 5 Leagues 2012/13
As expected, substitutes coming on scored at the highest rate of our three groups, at a clip of 0.65 Goals Per90. However, players that played the full 90 minutes actually posted the lowest Per90 number of 0.38, with the players that were substituted off sandwiched in between at a rate of 0.42 Goals Per90.
I think this is a super interesting finding and it appears that Daniel Altman was spot on with his suggestion of fatigue being a big issue in the rate that forwards score goals. My sample doesn’t specifically just include forwards, but as it includes the leading goal scorers it will obviously be forward biased.
It looks like the fatigue factor is so strong that it is even able to overcome the fact that more goals are scored in the second half than the first half. We have shown that a player who starts the game and is withdrawn scores at a higher rate Per90 than a player who completes the full 90 minutes.
When you think about this, it is common sense. Players tire and it’s better to replace them with fresh legs, but I’ve never seen the impact of tiredness quantitatively assessed before. I have no doubt that clubs and organisations like Prozone have data that records the physical drop off in player performance due to fatigue but I am surprised that the impact is so strong for goal scorers that it outweighs the benefit of playing the entire second half of a game with its increasing goal expectation.
I’m sure that if we analysed the actual minutes that each player played and their scoring returns for those minutes we could remove the second half scoring bias and calculate exactly how much more likely a fresh player is to score than a player that has played the entire game. However, I’m going to stop short of these calculations in this article as that would require another level of data analysis.
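For what it's worth, one possible shape for that adjustment is sketched below: each minute on the pitch is weighted by how goal-rich that part of the match is, so a minute late in the game counts as more "exposure" than a minute early on. This is only an assumption about how the calculation might be done, not something I have run on the data. The function name and the placeholder shares (the 44%/56% split spread evenly across 5 minute intervals) are illustrative; the real interval shares would come from the distribution shown earlier.

```python
def adjusted_per90(spells, interval_shares, interval_len=5):
    """Goals Per90 with minutes re-weighted by league-wide goal timing.

    spells: iterable of (start_minute, end_minute, goals) for one group.
    interval_shares: share of all goals scored in each interval (sums to 1).
    """
    even_share = 1 / len(interval_shares)  # share if goals were evenly spread
    goals = 0
    exposure = 0.0
    for start, end, scored in spells:
        goals += scored
        for i, share in enumerate(interval_shares):
            lo, hi = i * interval_len, (i + 1) * interval_len
            overlap = max(0, min(end, hi) - max(start, lo))
            exposure += overlap * share / even_share
    return goals * 90 / exposure


# Placeholder shares: 44% / 56% split, spread evenly within each half
shares = [0.44 / 9] * 9 + [0.56 / 9] * 9

# A made-up substitute who scores once in a 30 minute spell
print(round(adjusted_per90([(60, 90, 1)], shares), 2))  # 2.68, versus a raw 3.0
```

Comparing adjusted figures between fresh and tiring players would then isolate the fatigue effect from the second half scoring bias.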
I am conscious that the above findings are based on just one season of data, so to give me some comfort as to the integrity of those findings I looked at each of the 5 leagues separately to see how they performed individually.
Encouragingly, all 5 of the leagues follow exactly the same trend. The substitutes coming on comfortably post the highest Per90 scoring rates. This group have the parlay of being fresh as well as spending proportionately more of their playing minutes in higher goal expectation periods of the game. The players that were withdrawn have a slightly higher Per90 figure than the footballers that played the full 90 minutes with the benefit of freshness outweighing the back ended scoring bias.
I therefore feel that we can conclude that, not only do substitutes score at a higher rate than starting players but that the players who are subbed off score at a higher clip than their teammates that play the full 90 minutes.
What are the implications of this?
I can think of at least two implications. The first concerns comparing players' scoring rates. It was presumed that substitutes' scoring rates were inflated by the back ended time they spent on the pitch, and Daniel Altman confirmed this in his article. However, we need to be equally aware of players who were substituted off, as they too will tend to post higher Per90 figures than players who play the full match.
The second implication is much more important. Unless there is a large difference in quality between a manager’s starting 11 and his substitutes, any manager that doesn’t use all 3 substitutes is giving up some expected value. And by "using substitutes" I don’t mean introducing them in the 85th minute or in injury time to simply run down the clock.
I find myself agreeing with Altman’s almost throwaway suggestion that players should be substituted early in the game. Not only do we get the boost of the player coming on having fresh legs but we also reduce the negative impact of the fatigue of the substituted player as the change is being made earlier than "normal".
I realize that managers may need to hold a substitute back to cover the chance of injury later in the game, but leaving that aside there really should be no reason why managers don’t ensure that they empty the bench in enough time to get the full benefit of the fresh player.
When are Substitutes used?
After establishing that it is important that managers properly balance the trade off between ensuring they can finish the game with 11 players and ensuring that they obtain maximum benefit from the use of their substitutes I found myself wondering how subs are currently used.
Here is the data from the first 20 Game Weeks of the 2013/14 Premier League season showing the percentage of possible substitutes that have played a minimum amount of minutes.
2013/14 Premier League (Weeks 1 - 20)
The blue plots are the first subs that were used by Premier League managers. 50% of all first substitutes played at least 30 minutes. The noticeable drop off at the 45 minute mark is interesting, and it clearly shows the reluctance to substitute a player in the final minute of the first half.
The red plots represent a team's second substitute. 50% of second substitutes play fewer than 20 minutes, and only approximately 15% of second substitutes play at least 30 minutes.
We can see from the green plots that a third substitute plays 6 minutes or more only 50% of the time, and 1 in 5 managers wait until the 89th minute to make their last change. In fact, during the first 20 weeks in the Premier League there was a total of 98 possible substitutes that were not used. I know the managers have a desire to finish the match with a full complement of players, but there is a trade off where this prudence has the opportunity cost of not making maximum use of fresh legs against a tiring opposition.
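The curves described above are just the share of possible substitutions whose player got at least a given number of minutes. A minimal sketch of that calculation follows. The function name and the sample minutes are made up for illustration, and I am assuming unused substitutions stay in the denominator, since the chart is expressed as a percentage of possible substitutes.

```python
def share_playing_at_least(minutes_played, threshold):
    """Fraction of possible substitutions whose player got >= threshold minutes.

    minutes_played: one entry per possible substitution; None means unused.
    """
    if not minutes_played:
        raise ValueError("need at least one possible substitution")
    reached = sum(1 for m in minutes_played if m is not None and m >= threshold)
    return reached / len(minutes_played)


# Made-up minutes for ten possible third substitutions (None = not used)
third_subs = [12, 6, 1, None, 20, 3, None, 8, 1, 15]
print(share_playing_at_least(third_subs, 6))  # 0.5
```

Evaluating this for every threshold from 1 to 90 minutes, separately for first, second and third substitutions, would give one curve per colour.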
In this article I have concentrated on scoring players, primarily forwards. Perhaps fatigue affects forwards more than other positions, but it's more likely the case that we are better able to measure a goal scorer’s output and thus comment on their performances.
Would it be far-fetched to assume that a central midfielder would suffer less fatigue than a forward? I don’t think so, and I assume that the clubs would be in the position to know how much physical fatigue each player suffers during a full 90 minutes. But are they in a position to be able to quantify how much that level of fatigue actually affects the chance of his team scoring a goal or conceding a goal? I have my thoughts on this, but I just don’t know.
Am I advocating that players should be substituted on the 30th minute, the 45th minute or the 60th minute? At this stage I cannot answer that. As stated above, I would need to undertake more detailed analysis to assess the fatigue impact on a minute by minute basis to arrive at a definitive answer. However, this analysis has shown that the fatigue impact is large enough to overcome the difference in the scoring rates between the two halves, so with that in mind there is really no reason for a manager not to avail of all of his available substitute opportunities.
Indeed, the use of substitutes is just another facet to the game that good managers will use to their advantage whilst poor managers will not realise the tactical advantage that smart substitutions could be able to give them.
EDIT (16/01/14 at 10:39) - A few comments have suggested that there may be a forward bias in the players that are substituted off. Here is the split of only the starting forwards in my sample:
Even within this starting forwards group the players that are substituted have a higher Per90 rate than the forwards that play the entire 90 minutes. Any further granular analysis than this would involve the identifying of individual players to see how they perform when substituted off compared to when they played the full 90 minutes. But I would be concerned that we would be slicing the data very thinly at this point.
ADDITIONAL EDIT - (16/01/14 at 12:40)
To eliminate the data contamination that has been suggested may arise from players with a higher Goal Per90 figure being more likely to be substituted than those with a lower Per90 number I divided my data set into two groups.
I ranked all 268 players by their Goals Per90 figure and divided the table in half, thus creating a top half that includes all the marquee strikers and a bottom half that includes players that scored at least 6 goals but who weren't prolific goal scorers.
Even when looking solely at the bottom half of this table (so the players that aren't prolific goal scorers), this group of players also show that they have a higher scoring rate when they are subbed off than when they play the full 90 minutes.
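The ranking and split can be sketched as below. The tuple layout, the function name and the four players are hypothetical; with the real 268 players each half would hold 134.

```python
def split_by_per90(players):
    """Rank players by Goals Per90 and cut the table in half.

    players: list of (name, goals, minutes) tuples.
    Returns (top_half, bottom_half); with an odd count the extra player
    lands in the bottom half.
    """
    ranked = sorted(players, key=lambda p: p[1] * 90 / p[2], reverse=True)
    mid = len(ranked) // 2
    return ranked[:mid], ranked[mid:]


# Made-up players, for illustration only
players = [
    ("Player A", 25, 2700),
    ("Player B", 6, 2700),
    ("Player C", 12, 1800),
    ("Player D", 7, 3000),
]
top, bottom = split_by_per90(players)
print([name for name, _, _ in top])     # ['Player A', 'Player C']
print([name for name, _, _ in bottom])  # ['Player D', 'Player B']
```

The three-way split of Substitutes_On, Full 90 and Substitutes_Off would then be recomputed within each half separately.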