Bayern Munich vs FC Koln Player Positional Tracker

Bayern Munich had a fairly comfortable 4-1 win at home to FC Koln this evening.

As Rene Maric picked up in his (as usual) excellent tactical analysis of the game, Bayern Munich decided to totally overload the left side of the pitch.  Rene’s analysis can be found here.

I produced a Player Positional Tracker (PPT) for this game and I think it neatly shows how Bayern approached it, and it complements Rene’s article.

For anyone that isn’t aware, our PPT is produced using Opta “on the ball” events.



(click on the image to open the PPT in a larger window)

We immediately see how Bayern tilted their offensive moves towards the left side of the pitch and how narrow they were on the right side.

Koln’s right side of the defense were faced with Ribery, Gotze and Alaba all attacking them.

For the first 30 minutes, the dots on the PPT for Bayern’s right-sided attacking players, Robben and Muller, were very small; this indicates a lack of passes or shots (i.e. attacking involvement).

Attacking Contribution Metric and Man United’s reliance on Di María

Short version: Angel Di María is the player that his club have relied upon most for his attacking contribution so far in this Premier League season.

Long version: Please read on

Many years ago the only individual player performance stats that we had access to were goal scoring records. Then someone decided it would be a neat idea to give credit to other attacking players and we began to also record the assists, i.e. the player that set up the goal. These stats are great, but as only approximately one in every ten shots is scored we inevitably lost a lot of detail as these performance counting stats only included the sample of shots that were scored. Why should the final shot from the striker influence whether the creative midfielder was awarded the assist for his through ball? To a large degree, the actual finish was outside of his control after all.

In relatively recent times things have improved for those that like to count things. Thanks to Opta (other brands may also be available) we now have a proliferation of sites that list the total number of shots and key passes that players make during each individual game and also cumulatively across a season. By stepping back one level from the old goal and assists metrics we can now credit players for their attacking output, regardless of the outcome of the final shot.

We know that not all shots are created equally, but given that there is a certain level of randomness in whether or not any individual shot actually results in a goal this increased level of transparency of individual attacking contribution can only be a good thing.

However, if we wish to accurately measure Attacking Contribution why stop at just the shot and the key pass? Doing so means that the player that played the penultimate pass gets no recognition at all, at least as far as the stats are concerned, and what about the player that made the pass preceding that?

Attacking Movements

Using detailed Opta event data I can join together the sequence of events for each shot that was taken and I can map out the complete attacking movement. These moves range in length from zero passes before the shot to the 51 event attacking move that Tottenham achieved against QPR earlier this season; a move that ended in a Nacer Chadli goal.

Using the information derived from these moves I want to have a go at creating a more comprehensive Attacking Contribution metric. This metric will go farther than counting just shots and key passes and can help us objectively measure the attacking importance of any individual player to their team. We have no need to just award “attacking points” to the shooter and the maker of the final pass. As with most of these metrics we’ll start with undertaking attacking analysis, as inevitably trying to analyse defensive contribution will be a much more difficult piece of work.

Data Rules

I needed to decide on a cut-off point in determining which actions to count in my Attacking Contribution metric. Although I want to go farther back in the chain than the guy who made the final pass, it is a tough sell to suggest that the player who made the 10th last pass in the move should receive credit for his part in the move. It’s an arbitrary cut-off but I decided to permit the final four attacking events in a move to contribute towards Attacking Contribution; this allows for the shot plus the previous three attacking events (pass, take-on or ball recovery).

For this measure I didn’t want to place different weightings on the extent of the involvement in any given attacking move. Very simply, if a player was involved in the final four attacking events in a move that led to a shot then they were awarded an Attacking Contribution. It is obviously possible for a player to be involved more than once in a move, e.g. they play a one-two before taking the shot, but each player was only awarded one Attacking Contribution per move. After all, I simply want to measure how many moves each player could be said to have been involved in.
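The counting rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the event type names, the (player, event) tuple layout and the function name are hypothetical and do not reflect the actual Opta data format.

```python
from collections import Counter

# Assumed labels for the attacking event types named in the text.
ATTACKING_EVENTS = {"pass", "take_on", "ball_recovery", "shot"}
WINDOW = 4  # the shot plus the previous three attacking events


def attacking_contributions(moves):
    """Award one Attacking Contribution per player per move.

    `moves` is a list of moves; each move is a chronological list of
    (player, event_type) tuples whose final event is the shot.
    """
    totals = Counter()
    for move in moves:
        attacking = [(p, e) for p, e in move if e in ATTACKING_EVENTS]
        last_four = attacking[-WINDOW:]
        # A set means a player involved twice is still credited only once.
        for player in {p for p, _ in last_four}:
            totals[player] += 1
    return totals


# Hypothetical moves: a five-event move ending in a shot by D,
# and a shot by D with no passes before it.
example_moves = [
    [("A", "pass"), ("B", "pass"), ("C", "pass"), ("B", "pass"), ("D", "shot")],
    [("D", "shot")],
]
print(attacking_contributions(example_moves))
```

In the first hypothetical move player A falls outside the final four events and so receives nothing, while B is credited once despite touching the ball twice.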

I am conscious that this analysis can only use the data that I have access to. Although the Opta event data is very detailed it only covers “on the ball” actions, which will be fine for 95% of this analysis. However, it will be unaware of the player that made the step over that sent the defender the wrong way or the supporting forward who made the unselfish run to pull the defenders out of their shape. I don’t imagine that these “oversights” will significantly impact on the findings in this analysis but I wanted to address that point now.

The premise of this metric is that it shouldn’t just be the shooter and the player that makes the final pass that receives Attacking Contribution credit, as is currently the case.

This post will serve as an introduction to my Attacking Contribution method; I have a few ideas related to this metric that I would like to tease out and analyse in the near future but I’ve got to start somewhere and I’ll keep the numbers in this piece fairly simple.

2014 Premier League Attacking Contribution

As a means of illustrating and working through this metric let’s look at the first seven games of the 2014/15 Barclays Premier League.

Here are the 15 players that have had the greatest Attacking Contribution in absolute terms:


With 22 key passes and 7 assists it’ll not surprise anyone to see that Cesc Fabregas has been the player that has had the highest Attacking Contribution during the opening seven game weeks of this new season.  By looking at the total number of minutes that each player has played we can convert these values to Attacking Contributions per90. This method of normalisation means we can easily compare players regardless of time spent on the pitch.  However, I’m not going to dwell on this aspect right now.
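For anyone unfamiliar with per90 normalisation, here is a minimal sketch. The function name and the example numbers are hypothetical and are not taken from the table.

```python
def per90(total, minutes_played):
    """Scale a counting stat to a rate per 90 minutes played."""
    if minutes_played <= 0:
        raise ValueError("minutes_played must be positive")
    return total * 90 / minutes_played


# Hypothetical player: 30 Attacking Contributions in 600 minutes.
print(per90(30, 600))
```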

What I do want to spend some time on is describing how I see this metric being most useful: Which player contributes most to their team’s shots?

Attacking Reliance

To assess the attacking impact that a player has I looked at their individual Attacking Contribution numbers as a proportion of the total shots that their team had while they were on the pitch. By doing this I’m not actually trying to measure the effect that a player has on their team’s attacking output, i.e. if the player was missing I’m not suggesting that his team would see their shots total drop by x shots. Instead, I am quantifying the proportion of a team’s shots that go through the player; in other words, it looks at the extent to which a team relies on a player. How much of a team’s attacking game revolves around player X or player Y?

In this analysis I used a cut-off of 50% of minutes – a player has needed to be on the pitch for at least 315 minutes so far this season.

By dividing a player’s Attacking Contribution by the number of shots his team took whilst he was on the pitch I then arrive at an Attacking Reliance %. This Attacking Reliance percentage informs us of the proportion of attacks that the player is involved in (as defined by the final four attacking events of the move) or how much their team has relied on them in an attacking sense. The table in descending order of Attacking Reliance% currently appears as:
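A minimal sketch of the Attacking Reliance calculation, including the 50% of minutes cut-off, might look like this. The function and constant names are assumptions for illustration, and the example numbers are hypothetical rather than taken from the table.

```python
GAMES_PLAYED = 7
# 50% of the available minutes: 0.5 * 7 games * 90 minutes = 315.
MIN_MINUTES = 0.5 * GAMES_PLAYED * 90


def attacking_reliance(contributions, team_shots_on_pitch, minutes_played,
                       min_minutes=MIN_MINUTES):
    """Return Attacking Reliance as a percentage, or None if the player
    fails the minutes cut-off or his team took no shots while he played."""
    if minutes_played < min_minutes or team_shots_on_pitch == 0:
        return None
    return 100 * contributions / team_shots_on_pitch


# Hypothetical player: involved in 18 of the 40 shots his team took
# during his 400 minutes on the pitch.
print(attacking_reliance(18, 40, 400))
```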



Now we get a different looking table, and one that seems to make sense. Fabregas has the highest absolute Attacking Contribution value, but despite his sublime performances Chelsea have had a sufficient volume of shots for them not to be overly reliant on the Spaniard.

High Reliance Players

We can see that even though he has only been with Man United for a very short period of time Angel Di Maria is making a hugely important contribution to their attacking output with an Attacking Reliance figure of 56%. Compare that with United’s other big name signing / loanee Falcao; even if I set aside the 50% minutes rule in this data set he still wouldn’t appear in this list. The Colombian striker has been involved in just 40% of United’s attacking moves. Given his price tag he’ll want to be quickly increasing that value.

The reliance that United has had on Di Maria is the highest in the league, just pipping Christian Eriksen who himself posts a rounded Attacking Reliance value of 56%. Despite struggling and appearing to be out of favour for large parts of his first year as a Tottenham player, the Danish attacking midfielder is now showing everyone his true worth. In fairness, it’s worth pointing out that some analysts were ahead of the curve on his ability.

Ted concluded that piece with “This might be controversial, but based on the rarity of that type of performance and how he’s performed over his career, Christian Eriksen is quite possibly one of the best attacking passers in the Premier League already”. 

Although Graziano Pelle has received the majority of the plaudits down on the South coast it is interesting to see that Dusan Tadic actually has had a greater involvement in Southampton’s attacking moves than the Italian striker. In fact, even James Ward-Prowse has a higher Attacking Reliance value than Pelle, who for the record has posted a value of 42%.

Swansea’s twin attacking threat of Gylfi Sigurdsson and Bony complete the list of players that posted an Attacking Reliance value of greater than 50%. So all a team has to do to stop Swansea is to stop Gylfi and Bony. Why did no one say that before? (insert sarcastic emoticon)

It is unusual for a team to have two players with such high Reliance values, but obviously these things happen so early in the season with a team that has had the second lowest number of shots in the league. In North London, Danny Welbeck will be pleased with his start to life as an Arsenal player with his involvement in 48% of Arsenal’s shots that have occurred while he has been on the pitch.

One other player that is worth mentioning is Riyad Mahrez of Leicester.  He has played just shy of 400 minutes this season, but more shots have gone through him while he has been on the pitch than any of the other Leicester players, including better known players such as Jamie Vardy and Leonardo Ulloa.


An Attacking Reliance figure for any individual player of 50% is massive, at least in Premier League terms. Over the last four full seasons only eight players achieved a value of this scale over the full 38 game season (and no, I’m not going to name them today, remember I said this was just an introductory article to the concept).

I’ve said it many times before, but one of the aims of my analytical work is to be able to objectively measure what our eyes see. In this regard, analytics won’t always provide ground breaking findings but it will allow us to quantifiably assess certain impacts, which may, in turn, be used as inputs in subsequent applied research. This introductory analysis falls into this category.

In future articles I intend to undertake further analysis so we can see if we can learn anything more from Attacking Reliance figures.
Does a high reliance on individual players affect how successful a team is?
Does it matter if players with a high Attacking Reliance value leave the club?
Do we even have enough examples to be able to test this?

At this stage I don’t have the answers to the above questions, but I hope that’ll change in the near future.

Chelsea v Arsenal PPT. Where was Arsenal’s right side attack?

Chelsea 2 vs 0 Arsenal

Chelsea continued their great start to the season with a commanding victory at home to Arsenal.  They managed to take the lead through a Hazard penalty and they did what Mourinho teams do so well; totally stifled the opposition whilst carrying a terrific attacking threat due to the pace (of thought as well as of feet) in their side.

I asked ThatsWengerBall to give me his thoughts on the game via the lens of the PPT, and his comments appear below the gif.

However, I wanted to mention the one facet of the game that was really noticeable with this PPT; Arsenal’s total abandonment of the right side as an attacking option.  Up until the point Oxlade-Chamberlain came on and provided width on that side, Arsenal didn’t have anyone in that area of the pitch during the match.  Watch the entire gif to see what I mean.

Ozil was the most right sided player, but he, Cazorla, Welbeck and Wilshere were all primarily in the centre of the pitch.  The lack of Arsenal players in that right side was so noticeable as to make me presume it was a pre-defined strategy for Wenger.  If so, it changed immediately when Ox was brought on.

Definitely a strange one to play so many attacking players in the centre, especially against a Chelsea team that is so solid up the middle.

(Click on the image to open in a larger window)


ThatsWengerBall’s comments:

  • The central/left area of the pitch was very congested with Arsenal’s offensive players throughout the match. Wilshere, Cazorla, Alexis, Özil and Welbeck all occupied positions very close together which had both positive and negative effects on their game.
  • Arsenal played to their strengths, almost turning their offensive game into a five-a-side style match. With little room in the centre of the park, the five aforementioned players exchanged tight angled passes and attempted a very high number of take-ons (40 between them).
  • Whilst this successfully negated Chelsea’s physical advantage (the average height of their starting XI was around 4cm taller than Arsenal’s) and proved effective at moving possession into the final third, they struggled to provide the killer ball as there was so little space that every pass had to be inch perfect.
  • Chelsea’s offensive play was a little more balanced, with Hazard targeting the inexperienced Chambers on the left and Schürrle or Costa acting as an outlet on the right. Whilst this proved effective at stretching Arsenal’s defence, Chelsea’s midfield 3 were unable to provide much support due to the pressure provided by Arsenal’s midfield overload. Oscar, Fabregas and Matic could rarely be found on the ball in the final third of the pitch and Chelsea only managed to complete 85 passes in that area compared to Arsenal’s 143.
  • The shape of the game changed a little from the 70th minute. Wenger brought on Chamberlain who instantly provided width with his shuttling runs down the right hand side; however Mourinho knew he had the upper hand with the goal advantage and brought on Mikel to shore up the defence.
  • Neither side massively impressed going forward, but in the end two moments of individual quality – Hazard’s dribble and Fabregas’ pass – gave Chelsea the three points.


Player Positional Tracker: Arsenal v Crystal Palace

Arsenal 2 vs 1 Crystal Palace (16th August 2014)

A few things I noticed are listed below, but I am sure that people will have their own opinions on what the viz shows.

  • Other than Gibbs (and then Monreal), Arsenal were very much orientated towards the right side of the pitch. Cazorla was notionally on the right side of midfield, but the Spaniard played very centrally. Palace facilitated this as Puncheon (their right midfielder) also played narrow and central
  • Chamakh played exceptionally deep during the second half – he was behind his midfield for large parts of the second half
  • Arteta played a very disciplined role. He never moved outside the centre circle on this image (Note, we are not suggesting that he didn’t move outside the centre circle all day!!)
  • Cazorla and Ramsey played very close to each other, with Cazorla always just in positions that were slightly closer to the Crystal Palace goal

Click on the viz to open in a larger window

Smart Use of Substitutes Can Make A Difference

Following on from Daniel Altman’s excellent piece on the scoring rate of substitutes I thought I would undertake my own analysis on the impact of substitutes.

The methodology I will use is slightly different to that employed by Altman in his article. I will use the Big 5 European leagues for last season (2012/13), and I will study the goal scoring rates for all players that scored at least 6 league goals last season.

The use of this filter gives me a list of 268 players that scored a combined total of 2,782 goals in 617,331 minutes of playing time.  This equates to an average scoring rate of 0.41 goals Per 90 minutes for our sample of players.
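The arithmetic behind that average can be checked directly from the totals quoted above (the variable names are just for illustration):

```python
total_goals = 2782
total_minutes = 617_331

# Goals per 90 minutes for the whole sample of 268 players.
goals_per90 = total_goals / total_minutes * 90
print(round(goals_per90, 2))
```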

At this stage it is a well-documented fact that more goals are scored in the second half of games than the first half, and the apportionment in the Big 5 leagues last season was no different with just 44% of all goals scored in the first half and 56% in the second half.

The following is the distribution of goals in 5 minute time intervals for the 5 leagues last season:


We can see that, generally, the goal scoring rates increase in line with the time elapsed during the match.  For my purposes, the minutiae of the goal scoring rates aren’t important; instead we just need confirmation that this trend does exist in my data sample.

In his piece, Daniel Altman found that forwards coming on as substitutes scored at a higher rate than starting forwards.  But when we consider that more goals are scored in the second half than the first half then this is no great surprise.  Substitutes spend a greater proportion of their playing time in the second half (when goal expectation is higher) than starting players do.

So what do we take from this?

The fact that substitutes have a higher scoring rate means that you can’t directly compare Goals Per90 figures between players that regularly start and those who make frequent substitute appearances.  Very simply, the substitute will have his numbers inflated and we would expect his Per90 numbers to drop in the event that he was handed a starting position.

However, Altman didn’t stop there and he found that “fatigue among forwards was a more powerful force than fatigue among defenders”.  That sentence struck a chord with me and I wanted to investigate the general phenomenon of fatigue in footballers a little further.

Hierarchy of Goals Per90

We have established that the longer a match goes on the greater the goal expectation.  This is one of the reasons why substitutes score at a higher rate than starting players.  So, by extension of this logic we would therefore expect players who are substituted to score less Per90 than players who played the full 90 minutes.

Not only would the substituted player be swimming against the tide of playing at least as many first half minutes as second half minutes when the goal expectation is at its lowest, but the fact that he is substituted may also indicate that he hasn’t played a great game thus far.

That second suggestion certainly won’t be true all the time.  The player may be injured, withdrawn for tactical reasons or just tired but it seems reasonable to assume that some of this cohort will have irked the manager enough with their performance to be substituted.

Even ignoring the suggestion that the substituted player has been having a less than stellar performance,  due to the increasing goal expectation it is reasonable to assume that the hierarchy of Per90 goal scoring rates would rank as follows:

  • Substitutes_On
  • Full 90 minutes Players
  • Substitutes_Off

Now we’ve finalised our hypothesis, how does that compare with what actually happened last season?

Each game that our 268 players took part in last season was divided into the 3 categories: Substitutes_On, Full 90 and Substitutes_Off and I totaled the number of goals and minutes that the group of 268 players as a whole racked up in each category.
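As a rough sketch of that bucketing step, the code below classifies each appearance and totals goals and minutes per category. The field names (`started`, `finished`, `minutes`, `goals`) are hypothetical, the example appearances are invented, and treating a substitute who is later withdrawn as Substitutes_On is an assumption rather than a rule stated above.

```python
from collections import defaultdict


def classify(started, finished):
    """Place one appearance into one of the three categories."""
    if not started:
        return "Substitutes_On"  # assumption: applies even if later withdrawn
    return "Full 90" if finished else "Substitutes_Off"


def scoring_rates(appearances):
    """Total goals and minutes per category, then convert to Goals Per90."""
    goals = defaultdict(int)
    minutes = defaultdict(int)
    for app in appearances:
        category = classify(app["started"], app["finished"])
        goals[category] += app["goals"]
        minutes[category] += app["minutes"]
    return {c: 90 * goals[c] / minutes[c] for c in minutes if minutes[c] > 0}


# Hypothetical appearances, purely to show the mechanics.
appearances = [
    {"started": True, "finished": True, "minutes": 90, "goals": 1},
    {"started": True, "finished": True, "minutes": 90, "goals": 0},
    {"started": True, "finished": False, "minutes": 60, "goals": 1},
    {"started": False, "finished": True, "minutes": 30, "goals": 1},
]
print(scoring_rates(appearances))
```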

Big 5 Leagues 2012/13


As expected, substitutes coming on scored at the highest rate of our three groups.  This group scored at a clip of 0.65 Goals Per90. However, players that played the full 90 minutes actually posted the lowest Per90 numbers of 0.38, with the players that were substituted off sandwiched in between at a rate of 0.42 Goals Per90.

I think this is a super interesting finding and it appears that Daniel Altman was spot on with his suggestion of fatigue being a big issue in the rate that forwards score goals.  My sample doesn’t specifically just include forwards, but as it includes the leading goal scorers it will obviously be forward biased.

It looks like the fatigue factor is so strong that it is even able to overcome the fact that more goals are scored in the second half than the first half.  We have shown that a player who starts the game and is withdrawn scores at a higher rate Per90 than a player who completes the full 90 minutes.

When you think about this, it is common sense.  Players tire and it’s better to replace them with fresh legs, but I’ve never seen the impact of tiredness quantitatively assessed before.  I have no doubt that clubs and organisations like Prozone have data that records the physical drop off in player performance due to fatigue but I am surprised that the impact is so strong for goal scorers that it outweighs the benefit of playing the entire second half of a game with its increasing goal expectation.

I’m sure that if we analysed the actual minutes that each player played and their scoring returns for those minutes we could remove the second half scoring bias and calculate exactly how much more likely a fresh player is to score than a player that has played the entire game.  However, I’m going to stop short of these calculations in this article as that would require another level of data analysis.

I am conscious that the above findings are based on just one season of data, so to give me some comfort as to the integrity of those findings I looked at each of the 5 leagues separately to see how they performed individually.






Encouragingly, all 5 of the leagues follow exactly the same trend.  The substitutes coming on comfortably post the highest Per90 scoring rates.  This group have the parlay of being fresh as well as spending proportionately more of their playing minutes in higher goal expectation periods of the game.  The players that were withdrawn have a slightly higher Per90 figure than the footballers that played the full 90 minutes with the benefit of freshness outweighing the back ended scoring bias.

I therefore feel that we can conclude that, not only do substitutes score at a higher rate than starting players but that the players who are subbed off score at a higher clip than their teammates that play the full 90 minutes.

What are the implications of this?

I can think of at least two implications.  The first concerns comparing players’ scoring rates: it was presumed that substitutes’ scoring rates were inflated due to the nuances of the back ended time they spent on the pitch.  Daniel Altman confirmed this in his article.  However, we also need to be equally aware of players who were substituted off as they too will tend to possess higher Per90 performances than players who play the full match duration.

The second impact is much more important.  Unless there is a large difference in quality between the starting 11 and the substitutes, any manager that doesn’t use all 3 substitutes is giving up some expected value.  And by “using substitutes” I don’t mean introducing them in the 85th minute or in injury time to simply run down the clock.

I find myself agreeing with Altman’s almost throwaway suggestion that players should be substituted early in the game.  Not only do we get the boost of the player coming on having fresh legs but we also reduce the negative impact of the fatigue of the substituted player as the change is being made earlier than “normal”.

I realize that managers may need to hold a substitute back to cover the chance of injury later in the game, but leaving that aside there really should be no reason why managers don’t ensure that they empty the bench in enough time to get the full benefit of the fresh player.

When are Substitutes used?

After establishing that it is important that managers properly balance the trade off between ensuring they can finish the game with 11 players and ensuring that they obtain maximum benefit from the use of their substitutes I found myself wondering how subs are currently used.

Here is the data from the first 20 Game Weeks of the 2013/14 Premier League season showing the percentage of possible substitutes that have played a minimum amount of minutes.

2013/14 Premier League (Weeks 1 – 20)


The blue plots are the first subs that were used by Premier League managers.  50% of all first substitutes played at least 30 minutes.  The noticeable drop off at the 45 minute mark is interesting; it clearly shows the reluctance to substitute a player in the final minute of the first half.

The red plots represent a team’s second substitute.  50% of second substitutes play less than 20 minutes, and only approx 15% of second substitutes play at least 30 minutes.

We can see from the green plots that only 50% of the time does a third substitute play 6 minutes or more, and 1 in 5 managers wait until the 89th minute to make their last change.  In fact, during the first 20 weeks in the Premier League there was a total of 98 possible substitutes that were not used.  I know the managers have a desire to finish the match with a full complement of players, but there is a trade off where this prudence has the opportunity cost of not making maximum use of fresh legs against a tiring opposition.
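The curves described above plot, for each minutes threshold, the share of possible substitutions where the substitute played at least that long. A minimal sketch of that calculation follows; the function name is an assumption and the minutes are hypothetical. Unused substitutions count in the denominator but never in the numerator.

```python
def share_playing_at_least(minutes_played, threshold, possible_subs):
    """Percentage of possible substitutions in which the substitute
    played at least `threshold` minutes.

    `minutes_played` lists only the substitutes actually used;
    `possible_subs` also includes the slots that were left unused.
    """
    used = sum(1 for m in minutes_played if m >= threshold)
    return 100 * used / possible_subs


# Hypothetical third-substitute minutes from 8 team-games,
# 2 of which ended with the third substitution unused.
third_sub_minutes = [1, 3, 6, 10, 15, 25]
print(share_playing_at_least(third_sub_minutes, 6, possible_subs=8))
```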

Other Positions

In this article I have concentrated on scoring players, primarily forwards.  Perhaps fatigue affects forwards more than other positions, but it’s more likely the case that we are better able to measure a goal scorer’s output and thus comment on their performances.

Would it be far-fetched to assume that a central midfielder would suffer less fatigue than a forward?  I don’t think so, and I assume that the clubs would be in the position to know how much physical fatigue each player suffers during a full 90 minutes. But are they in a position to be able to quantify how much that level of fatigue actually affects the chance of his team scoring a goal or conceding a goal?  I have my thoughts on this, but I just don’t know.

Am I advocating that players should be substituted on the 30th minute, the 45th minute or the 60th minute?  At this stage I cannot answer that.  As stated above, I would need to undertake more detailed analysis to assess the fatigue impact on a minute by minute basis to arrive at a definitive answer.  However, this analysis has shown that the fatigue impact is large enough to overcome the difference in the scoring rates between the two halves, so with that in mind there is really no reason for a manager not to avail of all of his available substitute opportunities.

Indeed, the use of substitutes is just another facet to the game that good managers will use to their advantage whilst poor managers will not realise the tactical advantage that smart substitutions could be able to give them.

EDIT (16/01/14 at 10:39)- A few comments have suggested that there may be a forward bias in the players that are substituted off.  Here is the split of only the starting forwards in my sample:


Even within this starting forwards group the players that are substituted have a higher Per90 rate than the forwards that play the entire 90 minutes.  Any further granular analysis than this would involve the identifying of individual players to see how they perform when substituted off compared to when they played the full 90 minutes.  But I would be concerned that we would be slicing the data very thinly at this point.

ADDITIONAL EDIT – (16/01/14 at 12:40)

To eliminate the data contamination that has been suggested may arise from players with a higher Goal Per90 figure being more likely to be substituted than those with a lower Per90 number I divided my data set into two groups.

I ranked all 268 players by their Goals Per90 figure and divided the table in half, thus creating a top half that includes all the marquee strikers and a bottom half that included players that scored 6 goals but who weren’t prolific goal scorers.

Even when looking solely at the bottom half of this table (so the players that aren’t prolific goal scorers), this group of players also show that they have a higher scoring rate when they are subbed off than when they play the full 90 minutes.
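The ranking and splitting step can be sketched as follows. The field name `goals_per90` and the example players are hypothetical; with 268 players each half would hold 134.

```python
def split_by_per90(players):
    """Rank players by Goals Per90 (highest first) and split in half."""
    ranked = sorted(players, key=lambda p: p["goals_per90"], reverse=True)
    mid = len(ranked) // 2
    return ranked[:mid], ranked[mid:]


# Hypothetical players, purely to show the mechanics.
players = [
    {"name": "P1", "goals_per90": 0.30},
    {"name": "P2", "goals_per90": 0.90},
    {"name": "P3", "goals_per90": 0.55},
    {"name": "P4", "goals_per90": 0.25},
]
top_half, bottom_half = split_by_per90(players)
print([p["name"] for p in top_half], [p["name"] for p in bottom_half])
```

The substitution analysis would then be rerun on the bottom half alone.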


Arsenal’s Premier League Shots

Arsenal’s mid August crisis seems so far away at this stage.  At that time they had just lost their opening game of the new season at home to Aston Villa and Gunners’ fans were disappointed at Wenger’s typical Scrooge like dealings in the transfer market.

Then Mesut Ozil signed on the dotted line and all has gone swimmingly for Arsenal since then.

Following their comprehensive 2-0 win over Liverpool on Saturday evening, Arsenal now sits 5 points clear of the chasers at the top of the Premier League.  Their critics would say that they have faced an easy set of fixtures so far; and they would have a valid point.  The current average league position of the teams that Arsenal has faced has been 13th, which compares with the average position of 10th for Chelsea’s opponents. Still, Arsenal can only beat the opponents they face on each match day, and they have done that 8 times since their surprise upset to Aston Villa.  Their only less than perfect league result has been away to West Brom at the start of last month.

I’m keen to get a look at what the shots in Arsenal’s games can tell us about how they have performed and whether their league leading position after 10 games is justified.

Arsenal’s Defence

Although it’s been Arsenal’s attacking talent such as Ozil, Giroud, Cazorla and Ramsey that has received most of the plaudits I’ve been seriously impressed by the Arsenal defensive performance.

Here is the Shot Chart for the shots that Arsenal has conceded in the opening 10 games of the 2013/14 Premier League season.  And for those unfamiliar with these Shot Charts I am also showing the template that defines the boundary between the four zones that I use:


Shooting Zones


Arsenal has conceded 9 goals this season, but the two penalties scored by Sunderland and Aston Villa are not included in the above chart.

The concession of 125 shots is not elite; in fact the average EPL team has conceded 129 shots.  However, what Arsenal has done superbly is limit the amount of dangerous shots that they give up.  Their concession of just 31 shots (3 per game) from the Prime Zone is the best in the league; Tottenham, Man City and Everton are next best in this measure with 35 shots.


The result of preventing shots from good locations is that the average goal probability per shot allowed by Arsenal (at less than 7%) is the lowest in the Premier League.  I posted the following image in this look (link) at Roma, but it’s worth publishing here even if the figures do not take the latest round of games into account.


Roma AvgExpG Big5


Out of the 98 teams in the Big 5 leagues, only one team, Roma, forced teams to take shots where their average goal probability per shot was less than that allowed by Arsenal.  That metric must bring tremendous satisfaction to the team and their coaching staff.

I am measuring the average goal probability by using the ExpG measure created by Constantinos Chappas and me.  Some outline details about ExpG can be found in this article, but as we use this metric for betting purposes we’d prefer not to reveal the full details of the calculation method.  Incidentally an approach similar to this ExpG model seems to be used by Prozone and Joey Barton recently published some of Prozone’s stats on QPR.

At least we seem to be in good company…….

As well as owning the best average ExpG value allowed per shot, Arsenal have the lowest aggregate ExpG value conceded in the league.  This suggests that the Gunners are defensively sound and means that, on the whole, Arsenal’s low goals conceded total of 7 (excluding those 2 penalties) is deserved.  Although there are four teams that have conceded fewer league goals than Arsenal I would contend that the Goals Against column for those teams (Chelsea, Spurs, West Ham and Southampton) is much better than the shots they have given up would suggest.


The most extreme example of this is Southampton, who have conceded just 4 league goals.  Using the ExpG value for every shot conceded I ran a simulation which replicated the 97 shots that Southampton has faced this season 10,000 times.  In only 2.03% of these simulations did Southampton concede 4 goals or fewer.
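For anyone who wants to try this kind of simulation themselves, here is a minimal Python sketch of the idea. It is not our actual simulator: the function name is made up for illustration, and because the real per-shot ExpG values are not published, the example uses a flat 9% chance for each of the 97 shots purely as a placeholder.

```python
import random


def share_at_most(shot_probs, max_goals, n_sims=10_000, seed=0):
    """Share of simulated runs in which at most `max_goals` shots are scored.

    Each shot is treated as an independent coin flip with its own goal
    probability (its ExpG value).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        goals = sum(1 for p in shot_probs if rng.random() < p)
        if goals <= max_goals:
            hits += 1
    return hits / n_sims


# Placeholder input: 97 shots at a flat 9% each (NOT the real ExpG values).
shots_faced = [0.09] * 97
print(share_at_most(shots_faced, max_goals=4))
```

With real per-shot ExpG values in place of the flat 9%, the printed share is the equivalent of the 2.03% figure quoted above.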

Perhaps Southampton are doing something different where the probability of a team scoring against them is less than the average team in our data set but I don’t think so.  This incredibly low probability suggests that the Southampton goals conceded number is going to see some regression in the near future as the variance that they are currently getting the benefit of will turn on the Saints.

Who knows?  Perhaps Asmir Begovic’s goal for Stoke against the Saints on Saturday is an indicator of the “bad luck” that may be ahead of Southampton.

Arsenal in recent games

What’s even more impressive about Arsenal’s defensive performance is that they have improved as the season has gone on.  In 3 of their opening 4 games Arsenal conceded more than 1.00 ExpG (that is our estimation of the number of goals that a team should score given their shots); and their opposition in these games included Aston Villa, Fulham and Sunderland – none of which could be categorised as strong opposition.

However, in each of their last 6 games their ExpG against has been less than 1.00 and they conceded just 4 goals in those 6 games, so it’s not by luck that opposition teams are feeling frustrated after facing Arsenal.

I have seen Mathieu Flamini singled out for praise upon his return to Arsenal’s defensive system, and our ExpG numbers would corroborate that view.  It’s probably no coincidence that he missed Arsenal’s opening two games of the season (when Arsenal shipped 2 of their 3 highest defensive ExpG figures).  The introduction of the Frenchman has corresponded with a tangible decrease in the chances that Arsenal have given up.

Arsenal Going Forward


Arsenal’s 142 shots represent the fourth highest volume of shots amongst teams.  Liverpool has also had 142 shots, and that pair trail behind Trigger Happy Tottenham, Chelsea and Man City in terms of efforts on goal.

Arsenal is very careful with their shooting locations: they have taken no shots so far from the Very Poor Locations zone, and 44% of their shots have come from the Prime Zone, which is well above the league average of 37%.

That Prime Zone figure of 44% is bettered by just Man City, West Brom and West Ham, and all of those teams have had more headers than Arsenal.

The significance of this is that a team that has headers making up a larger proportion of their total chances would expect to see those chances originating from closer to goal than shots.  The trade off here though is that headers are converted at lower rates than kicked shots from all spots on the pitch.

All of the above means that Arsenal, although very good, from an attacking point of view have not been exceptional when analysed through the lens of our objective ExpG measure.

Man City, Chelsea and Liverpool (by virtue of their excellent average shot probability) all post aggregate ExpG values in excess of Arsenal’s number.

For the record, we have Man City with an ExpG of 8 higher than Arsenal, and Chelsea and Liverpool both at 3 goals higher than Arsenal at this stage of the season.

It’ll not surprise anyone when I therefore contend that although Arsenal has been a joy to watch this season they have over-performed in scoring 21 goals from their 142 shots. I processed Arsenal’s 142 shots through my simulator and in just over 10% (10.13%) of the simulations did Arsenal score at least 21 goals from the shots they took on. So, the over achievement of Arsenal in front of goal is not quite as significant as Southampton’s in stopping the goals being scored but, for my money, Arsenal’s current goal tally of 21 goals (excluding the Giroud penalty) is somewhat inflated given the shots they have taken.

Aaron Ramsey

One of the stars of the season so far has been Aaron Ramsey with his 6 Premier League goals.  The Welsh midfielder has significantly upped his performances this season, probably due in no small part to the increased confidence that finding the net brings with it.

However, Ramsey is a perfect encapsulation of how Arsenal has scored more goals than the shots they have taken would suggest.

Using the same simulation methodology as above I have ascertained that Ramsey would score 6 goals less than 1% (0.85%) of the time based on the shots he has taken this season.

Even without access to advanced metrics, we know that shots are scored at a rate of approximately 10%.  Ramsey’s average shot location is certainly no better than would be expected for the league as a whole.  This simple logic would dictate that Ramsey would have been expected to have scored 2.30 goals from his 23 shots, not 6.

I want to be clear that our ExpG model uses much more complex inputs than laid out in the preceding paragraph, but even those simple numbers can give a sense of Ramsey’s over achievement in terms of putting the ball in the net.
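To put a rough number on that simple logic, here is a short Python sketch that treats each of Ramsey’s 23 shots as an independent 10% chance. This flat-rate binomial view is my simplification for illustration only; it is not the ExpG model, so its answer should not be expected to match the simulated figure quoted above. The function name is made up.

```python
from math import comb


def prob_at_least(goals, shots, rate):
    """P(X >= goals) where X ~ Binomial(shots, rate)."""
    return sum(
        comb(shots, k) * rate**k * (1 - rate) ** (shots - k)
        for k in range(goals, shots + 1)
    )


shots, rate = 23, 0.10  # Ramsey's shots and the rough league-wide scoring rate
expected_goals = shots * rate
print(f"expected goals: {expected_goals:.2f}")
print(f"P(6 or more goals): {prob_at_least(6, shots, rate):.3f}")
```

Even under this crude assumption, 6 goals from 23 shots sits well out in the tail of what a typical shooter would produce.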

I’m including a plot of all Ramsey’s shots this season and I spent some time trying to think of whose shots I could compare against.  I eventually settled on comparing the shots that Ramsey has taken this season with those taken by the same player last season.



The current season shots are the green dots, with the red and yellow checked dots representing the shots taken by Ramsey last season.

I think it’s fair to say that the general shapes of the shot locations are roughly similar from last season to this.  It is therefore surprising to see that last season Ramsey only scored 1 goal from his 46 shots, yet this season he’s shooting the lights out with 6 goals from half as many shots as last season.  You’d barely believe that these shots were struck by the same player.

I’d suggest that the true Aaron Ramsey conversion rate is somewhere between the two extremes, but the above image is a powerful reminder as to how much variance can exist when analysing individual players’ shots due to the relatively low numbers taken per season.

Interestingly, Ramsey’s variance of 6 actual goals against his 2 ExpG actually explains the majority of Arsenal’s attacking over achievement.


When we combine our attacking and defensive ExpG values for the entire Premier League, we rank Arsenal in third place overall, behind Man City and Chelsea.

Defensively they have been superb ensuring that teams shoot from unattractive locations, but the fact that Arsenal currently tops the league is partly due to variance in my opinion, and unless the Gunners create more and / or better attacking chances I would expect them to come back towards the chasing pack.

As a result of Arsenal’s relatively gentle start to the season in terms of opposition faced, it is inevitable that their strength of opposition will toughen up between now and Christmas.  In the event that goals start to dry up for the North London club the media will probably latch on to the fact that Arsenal have found it tougher as they face better opposition. In my opinion, this will be misguided as Arsenal is due a goal scoring regression regardless of who they face in their upcoming fixtures.


Another way of visualising the “luck” element that Arsenal has benefitted from this season is through PDO.  PDO is a concept that has been taken from ice hockey and it is supposed to measure luck by adding together the % of shots that a team scores and the % of opposition shots that they save. In summary, the league average is 100, and a number greater than 100 would suggest that a team has been lucky.
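As a rough sketch of that sum, here is the calculation in Python using Arsenal’s numbers quoted earlier in this piece (21 non-penalty goals from 142 shots, and 7 non-penalty goals conceded from 125 shots). The function name is my own, and I am assuming that all shots are counted; published PDO tables may be built on shots on target instead, so the result will not necessarily match them.

```python
def pdo(goals_for, shots_for, goals_against, shots_against):
    """PDO as described above: scoring % plus save %, on a scale where 100 is average."""
    scoring_pct = 100 * goals_for / shots_for
    save_pct = 100 * (1 - goals_against / shots_against)
    return scoring_pct + save_pct


# Arsenal after 10 games, using the shot and goal counts from this article
# (penalties excluded; counting all shots is an assumption).
arsenal_pdo = pdo(goals_for=21, shots_for=142, goals_against=7, shots_against=125)
print(round(arsenal_pdo, 1))
```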

Our own Ben Pugsley has been keeping a record of PDO (amongst a host of other stats) over on his Bitter and Blue blog, and he has updated the stats for GW10.  His stats tables can be found here.

Arsenal tops the PDO table after 10 weeks.  Now I’m not totally convinced by the merits of PDO as it has at its core a belief that all shots for and against are equal, and I have shown that Arsenal allow poorer quality shots than anyone else in the league.

Still, even with that proviso I think it is useful to show that there is a measure other than our ExpG that shows that as good as Arsenal have been they are perhaps in a little bit of a false position.

I think it’s important for Arsenal fans to recognise this; they can certainly bask in the warm glow of being league leaders but be aware that the shooting performances suggest that there are currently one or two better teams in the Premier League than the Gunners.

Chelsea’s Striker Options

Given the huge amount of attacking talent currently residing at Stamford Bridge I wonder how Jose Mourinho is going to decide on which four attacking players he will field.


I assume that he will use a back four and will play two holding / central midfielders which will then allow him four out and out attacking players.

The widely held belief is that he will play three attacking midfielders, link men or “just off the shoulder” forwards.  Those 3 positions will probably be filled by some combination of Mata, Hazard, Oscar, Moses and the new signings of Schurrle and De Bruyne.  And the attacking talent has been assembled even before we consider the possibility of Wayne Rooney signing for Chelsea.

Such a formation would then leave room for just one traditional striker, and at the moment it would seem that this position will be contested by Lukaku, Torres or Ba.  It would appear that this position is Romelu Lukaku’s to lose but I wanted to take a look at the three strikers’ stats as well as visuals of their shot locations and placements from last season to see if this is indeed the correct decision.

Striker Options

Romelu Lukaku seems to be holding pole position in this battle right now, but at just 20 years old is he ready for such weight and responsibility to be placed on his shoulders?  Yes, he had a terrific season last year, but despite the fairly large transfer fee Chelsea paid for him (£19m) perhaps he was something of a surprise package to the defences he came up against.  Might they be better prepared this season?

Demba Ba didn’t have a great first 6 months at Chelsea; in fact it went pretty awfully for him, with just 2 goals from 46 shots since his move in January.  That’s the sort of conversion rate that makes the current Fernando Torres look, well, like the Fernando Torres of old.

Torres doesn’t need me to write much more about him, suffice to say it appears that El Nino’s best days are well behind him at this stage.  Although the fact that he played in approximately 75% of all Chelsea’s available minutes last year suggests that Roman Abramovich may not feel the same way. At this stage it does look like his time at Chelsea is running out as there has been a lot of chatter concerning a return to Spain.

To help put some context on how the 3 Chelsea strikers performed last year, I thought I would take a look at their performance from a statistical point of view.

Player Statistics


The above stats are for the entire 2012/13 Premier League season, so Demba Ba’s figures include both his time at Newcastle and Chelsea.

All the figures, with the exception of ExpG and ExpG Eff, should be both obvious and well known to readers of this post so no explanation will be necessary.

Lukaku’s Per90Shots on Target value of 2.11 is pretty special and at more than 4.3 ShotsPer90 he certainly kept defences busy.  Demba Ba was even more impressive with the number of shots he took but unfortunately for him he lacked a little accuracy, which then reduced his SoT value. Torres’ numbers are really subdued.  Despite playing more minutes than Lukaku and Ba he had substantially less activity in all outputs (shots, shots on target and goals) and he rounds it off with just 1 shot on target per90, which is a very poor return for a top level striker.


The new metric introduced in the summary box, ExpG, is the number of Expected Goals that we** expected a league average player to score based on the type of chances that the players attempted.  The inputs to this measure won’t be disclosed, but we find that it is fairly accurate and allows us to compare the quality of chances created and then the efficiency with which they were finished.

The ExpG Eff metric = Actual Goals / ExpG, where an ExpG Eff of 1 represents an average player, a value greater than 1 represents above average finishing and a value less than 1 represents below average finishing.
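For anyone who prefers to see the ratio as code, here is a tiny Python sketch. The function name and the example striker’s numbers are hypothetical, as the ExpG inputs behind the real figures are not disclosed.

```python
def expg_eff(actual_goals, expg):
    """ExpG Eff = actual goals divided by expected goals (ExpG)."""
    if expg <= 0:
        raise ValueError("ExpG must be positive")
    return actual_goals / expg


# Hypothetical striker: 21 goals from chances worth 20.0 ExpG.
eff = expg_eff(21, 20.0)
label = "above average" if eff > 1 else "average or below"
print(f"ExpG Eff {eff:.2f} ({label} finishing)")
```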

**We refers to Constantinos Chappas and me. Constantinos can be followed on Twitter @cchappas

From a Chelsea viewpoint it is perhaps worrying that Lukaku is the only one of the trio whose actual goals tally exceeded their ExpG value.  So whilst the finishing skills of Torres and Ba were very poor, with an ExpG Eff of 0.73 and 0.88 respectively, even Lukaku’s 1.05 (as the best of the trio) was not exceptional by Premier League standards.

As a means of comparison; Van Persie finished the season with an ExpG Eff of 1.15, Walcott 1.40, Berbatov 1.19 and even Suarez earned 1.08.

In fact, of the top 12 Premier League scorers last season only Dzeko (at 0.84) had a worse ExpG Eff ratio than Ba and Lukaku. Interestingly, Wayne Rooney who has been strongly linked with Chelsea this summer doesn’t look like he’ll be the answer to their lack of a clinical finisher either as he posted 1.06 last season.

Shot Visualisations

In order to provide the bare statistics with some context I had a look at the shooting locations that the players were faced with and the placements of their non-blocked shots.



The shot location images I use in this piece have been taken from the subscriptions section of Fantasy Football Scout website.

I certainly wouldn’t encourage players to take speculative, often wasteful long range shots, but the almost total absence of long range shots for Torres appears indicative of a player that is very low on confidence.  He also struggled to hit the target (green dots) from many shots that were outside the width of the 6 yard box.



The above image shows the shot placement from the striker’s Point of View with the red balls signifying goals.

Looking at the shot placements it would appear that Torres strongly favours shooting toward the right side of the target.  Aside from that there was an unhealthy attraction towards the centre of the goal.  His lack of accuracy and the number of easy saves that opposition keepers were allowed to make would have contributed to his awful ExpG Eff ratio of 0.73.

Demba Ba



We can see a lot more activity on Ba’s image than the Torres one, with a particular penchant displayed for attempting long range efforts.


On the whole, Ba seemed to have two types of shots.

Most of his on target shots tended to be very low ground shots, which at least is preferred to shots that arrive at the goalkeeper a few feet off the ground.  However, he seemed to lack appropriate accuracy control when he attempted to put some elevation into his shots.



Lukaku’s shooting appears to be the happy medium between Torres’ lack of activity and Ba’s overzealous shooting.

He has a decent smattering of long range shooting, but the highlight of that image for me is that he displayed great skill in ensuring that shots from the right side of the pitch generally hit the target.  Undoubtedly this is due to the fact that he favours his left foot and thus the right sided shots give him the best angle, but the amount of green dots on that image is admirable.



If I was being critical of Lukaku’s shooting, it’s that he fired too many shots toward the centre of the goal at heights that were favourable to the goalkeepers.

A rough count gives me 19 shots in the central region that didn’t stay along the ground, and only 2 of them were scored.  That shooting pattern will certainly reduce a player’s conversion percentage rate.

Perhaps that might explain why although good, the Belgian youngster’s actual goal tally compared to his ExpG was not exceptional by Premier League standards last season.


Based on the statistics from last season and the three strikers I have considered I don’t see any reason why Lukaku shouldn’t be the starting centre forward for Chelsea this season. Torres can be discounted entirely.  His finishing of the chances he had was very much below par, but this is compounded by the fact that he didn’t get himself in the position to be taking shots anywhere near often enough.

Ba just didn’t do enough last season to suggest that he is ahead of Lukaku.  Yes, he had more shots but his average ExpG per shot was 25% less than Lukaku’s.  The lower average shot ExpG is caused by attempting more difficult shots, which suggests that Ba was less prudent in his shot selection. This also comes across clearly in their shot location maps.

As a result of Ba’s more speculative shooting, Lukaku posts better Shots on Target and Goals per 90 than Ba.  But the clincher for me is that Ba didn’t even convert his chances at the average player rate of 1.00 whereas Lukaku slightly exceeded that threshold (1.05 vs 0.88).

It will be interesting to see how Lukaku progresses this season.  There is no doubting that he is a handful and he should improve considerably with maturity, but he will need to. In my opinion, a club with the expectations of Chelsea should have a main striker who is capable of putting away their chances at a rate that vastly exceeds that of a league average player.  Perhaps Lukaku will develop into that player, but if not, it’s important for Chelsea that they have someone playing at the top of the pitch who can.

Near or Far Post Shooting

In a previous article, which can be found here, I did some research on the percentage of shots and headers that a shooting player can expect to score given a specific shot placement. As it was my first attempt at looking at shot placements I grouped all shots together.  The difficulty with stats and data, though, is that you can never just take the first metric at face value: further analysis can always be undertaken, and this second level of analysis can provide interesting insights that are missed at the higher level of data review.

In order to refresh memories, here is the scoring percentage for each shot placement zone for all shots and headers:


Remember that we are looking at the goal mouth from the point of view of the striker. I now want to undertake some further analysis to see what other information we can learn, and to do this I am going to look at shot placement based on which area on the pitch the shots or headers were struck from. I have divided all unblocked shots and headers into three pitch areas (right, left and central) as laid out in this image below:

Pitch Sides

The boundaries of the three zones have been deliberately chosen to ensure that approximately 50% of shots in the sample fall within the Central Shots zone, with the other 50% being split almost equally between right and left sides.

Central Shots Zone

Let’s have a look first of all at shots which were struck from the Central zone.


It’s no surprise to see that the general “shape” of the scoring rates heatmap is pretty similar to the one at the top of the article, which is for all shots.  The main difference is that the scoring rates are higher across the board, hence the increased level of “redness” in this plot.  As we are looking at the shots from the best positions (straight in front of goal) this would be in line with our expectation.

Shots from the Right

We’ll now cast our eyes at shots which came from the right side of the pitch as defined in the image above.


Now it gets interesting!!!!!

The above image shows the scoring rates for shots taken from the right hand side of the pitch and immediately a clear pattern jumps out.  As expected there is considerably more blue and less red on this image than the previous heatmap due to the fact that we are now looking at attempts from less attractive shooting locations. However, that’s not what is so intriguing about this heatmap.

The heatmap is extremely unbalanced, with all the red and orange zones concentrated onto the left side. The imbalance is so great that if we divide the plane of the goal into thirds, the average conversion rate for shots that hit the target in the left most third (Far Post) is 32%, the central third 7% and the right third (Near Post) is 14%.

As seen in my previous piece, and as logic would dictate, it would be expected that shots struck towards the centre of the goal would have the lowest scoring rate; but a conversion rate 2.25 times higher for on-target shots aimed towards the Far Post than for those aimed at the Near Post appears hugely significant to these eyes.

Shots from the Left

And what about for shots from the other side of the pitch, the left?


The exact same pattern, only in reverse, emerges.

From this left side, the Far Post third (right) has an on target conversion rate of 30%, 8% for central third and 15% for the left third. This means that Far Post on-target shots from the left hand side of the pitch are converted at twice the rate of Near Post on target shots.  This ties in pretty neatly with the finding from the other side of the pitch.

At this stage I think it’s safe to conclude that on target shots towards the far post (third of the goal) have twice the success rate of near post shots.  Even without going any further, that strikes me as a pretty darn important piece of information.

Point of Order: For the rest of this piece, Far Post is defined as any shot where the ball would cross the plane of the goal line in the Far Post third of the goalmouth or wider.  Whereas Near Post is the opposite, it would cross the goal line either in the Near Post third of the goalmouth or wider. Also the remainder of this piece will concentrate on just the shots taken from the right and left sides of the pitch as I want to investigate in greater detail the apparent Near and Far Posts phenomenon.

Far Post is Superior

So what does this mean?  My first thoughts are that the Andy Gray cliché of “he should have went across the keeper there” is correct. However, I’m only going to give him half marks as I believe his assertion was based on the fact that, if missed, a shot across the keeper has a chance of being parried, allowing the attacking team to pick up the rebound and have another strike at goal.  A shot missed on the narrow side does not have this luxury. Not for one second do I think that Andy Gray was aware that on target shots to the far post are scored at rates of 2 to 2.25 times more than those shot towards the near post.  At least if he was aware of that fact then he, along with everyone else in football, kept that particular nugget very quiet.

Possible Reasons for Discrepancy

1 – The first possible explanation for this difference is that I’m only looking at shots that are on target, ie in this analysis I have ignored shots that were wide or high of the target. Perhaps looking at goals as a percentage of all unblocked shots is required as it may be more difficult to hit the target with cross shots than near post shots.

After investigation, it turns out that this was indeed the case as 68% of all Far Post shots missed the target, compared with 64% of Near Post shots.   However, that small difference isn’t anywhere near sufficient to explain the difference in goals scored as a percentage of unblocked shots.

After including missed shots, 9.9% of unblocked Far Post shots are scored, whereas the rate substantially falls to 5.3% for Near Post unblocked shots.  This means we end up with a final ratio of unblocked Far Post shots being scored at 1.8 times the rate of Near Post shots. So after ruling out the difference being attributable to off target shots we are still left with a significant unexplained difference in terms of the scoring rate for Far and Near Post shots.
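The step from on-target rates to unblocked rates can be sketched in a few lines of Python. The miss rates are the ones quoted above; for the on-target conversion rates I am simply averaging the rounded right-side and left-side figures from earlier in this piece, which is an assumption made for illustration, so small differences from the 9.9% and 5.3% are to be expected. The function name is made up.

```python
def unblocked_conversion(miss_rate, on_target_conversion):
    """Goals per unblocked shot = share of shots on target x on-target conversion."""
    return (1 - miss_rate) * on_target_conversion


# Miss rates are from the text; the conversion inputs are simple averages of
# the rounded right-side and left-side percentages (an assumption).
far = unblocked_conversion(0.68, (0.32 + 0.30) / 2)
near = unblocked_conversion(0.64, (0.14 + 0.15) / 2)
print(f"Far Post {far:.1%}, Near Post {near:.1%}, ratio {far / near:.2f}")
```

The slightly lower share of Far Post shots that hit the target trims the advantage a little, but nowhere near enough to close the gap.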

2 – Could it be that goalkeepers are overly concerned with getting beaten at their near posts?

There is no doubt that it looks bad for a keeper if he is beaten at his near post, but perhaps they are trying to guard the near post to the detriment of the cross shot? At this point (with no access to goalkeeper positioning at the time of the shot) I don’t have any way to either prove or disprove this possible explanation, so unfortunately I have no other option than moving on to my next possibility.

3 – Another possible explanation for the difference is that I have so far excluded blocked shots from this analysis (as we never know where they will cross the plane of the goal).

Due to the fact that a cross shot has to travel through the central area of the pitch it certainly seems likely that shots aimed towards the Far Post have a greater chance of being blocked than those targeted towards the near post.  But is the difference in the rates at which Near and Far Post shots are blocked enough to explain a conversion differential of nearly two to one?

This could be quite a difficult question to answer as we have no way of knowing where the shots would have crossed the goal line had they not been blocked.  However, I have been spared some potentially impossible mental gymnastics as even if EVERY blocked shot was a Far Post shot (so none of the blocked shots were destined for Central or Near Post!!) the scoring rate for all Far Post shots would still exceed that of Near Post shots. That really is something. So although that is good news, as a numbers man I have an innate desire to quantify effects and so I’m going to try to make an educated guess at the location in the goal where blocked shots were destined for.

First up, what’s our split of non-blocked shots:


As stated above, I would assume that Near Post shots are likely to get blocked less than Far Post shots, but I would assume it would be reasonable for Central shots to be blocked at the same rate as Far Post shots.

Having established this, let’s then assume that Far Post and Central shots are blocked at twice the rate of Near Post shots (this is only a guess, but seems reasonable to me and I need to pick a number).

This blocked shots weighting combined with the volume of non blocked shots results in an assumed distribution of the Blocked Shots as follows:

Far Post               53%

Central                 21%

Near Post            26%

Total                     100%

I will therefore split the Blocked shots in my data sample as being destined for Far Post, Near Post and Centrally in the ratios of 53%, 26% and 21% respectively.
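The apportionment itself is just a weighted split, which can be sketched in Python as below. The split of non-blocked shots was shown in an image, so the shares used here (42% Far Post, 17% Central, 41% Near Post) are placeholder values that I have back-solved to be consistent with the 53% / 21% / 26% result; they are not the actual figures. The function name is also made up.

```python
def apportion_blocked(unblocked_share, block_weight):
    """Split blocked shots across goal zones.

    Each zone's share of blocked shots is proportional to its share of
    unblocked shots multiplied by its assumed relative blocking rate.
    """
    weighted = {zone: unblocked_share[zone] * block_weight[zone] for zone in unblocked_share}
    total = sum(weighted.values())
    return {zone: value / total for zone, value in weighted.items()}


# Placeholder shares of non-blocked shots (back-solved, NOT the real figures).
unblocked_share = {"far": 0.42, "central": 0.17, "near": 0.41}
# Far Post and Central shots assumed to be blocked at twice the Near Post rate.
block_weight = {"far": 2, "central": 2, "near": 1}

split = apportion_blocked(unblocked_share, block_weight)
for zone, share in split.items():
    print(f"{zone}: {share:.0%}")
```

Changing the 2:2:1 weights is an easy way to see how sensitive the split is to my guess about blocking rates.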

At this stage, I want to point out that the only purpose of the preceding couple of paragraphs is to approximate the number of blocked shots for each of the goal zones (Far and Near Posts and Central) as the analysis cannot be properly completed without some attempt at apportioning blocked shots.

Yes, some of my assumptions can be challenged but I don’t think that I can be that far out in the approximations I have used; and importantly certainly not enough to change the core findings of this analysis piece.

Conversion % of All Shots           

Armed with an approximation of blocked shots for each goal zone we can now reach a conclusion which takes into account the percentage of all shots which are scored from the sides of the pitch (the areas denoted in the second image in this piece) depending on whether the ball would have ended up Near Post, Far Post or centrally in the goal.

Remarkably, 6.8% of Far Post shots were scored, this compares with just 4.4% of Near Post shots.  As raw numbers, both of those conversion rates appear fairly small, but don’t forget that we are dealing with shots that are struck from less attractive locations on the pitch (ie away from the central strip of the pitch).


What I have laid out in this article appears to be quite fundamental. When shooting from less attractive positions, the player shooting has a conversion rate which is more than 1.5 times better for Far Post attempts than for Near Post attempts.

If this fact wasn’t impressive in its own right, when it is combined with the chance of a Far Post shot being parried and the rebound being scored, the advantage is even greater than the basic 1.5 multiple calculated above.


The question I haven’t been able to answer properly is why this phenomenon exists in professional football when clubs have access to both better data and bigger brains than mine?

I don’t think it can be due to variance as my sample contains a huge number of shots: every shot taken in the Big 5 Leagues during the 2012/13 season, which is almost 50,000 shots.

After undertaking the work for this article the only conclusion I can arrive at is that it’s due to Goalkeeper positioning.  I have taken account of most other things, ie the difficulty of hitting the target and the apportionment of blocked shots. Could it really be that keepers are so conscious about the “Pride of the Near Post” that they over compensate?  I am unable to coherently put forward any other possible reasons.

In order to gauge reaction to this piece I sent a draft to David Sally and Chris Anderson, the co-authors of “The Numbers Game”.  David made the point that a higher success rate for Far Post shots could be indicative of another aspect of the way goalkeepers play.  If they were slow to come off the line then due to basic geometry they would be more exposed to Far Post shots than Near Post efforts.

As alluded to in my preamble to this post, you can look at a facet of the game using just the headline measurements (conversion % for all shots in this instance), we can then go one level deeper into the data (slice the data by pitch sides) but even this may not be enough.  Chris Anderson made the point that I should probably further divide the data into shooting distances.  This would involve going yet another level deeper into the data. Perhaps I might further subdivide the data in a future article so that I can see the impact of shot distance on this Near and Far post phenomenon.  However, for my money that lack of further slicing of the data doesn’t diminish the importance of the findings laid out here.

As an aside, this clearly demonstrates why the basic match stats information lacks the detail needed to give fans a proper understanding of what has happened in a game.  Despite using data in a format that I hadn’t seen before (placement success rates), then going one level deeper, I find myself in the position where I could go another level deeper to try to complete our understanding of this quirk.

Whatever the reason, there is no getting away from the fact that shooting Far Post seems to have a significantly higher goal expectation than shooting Near Post. In a game of such small margins, where teams try to gain any advantage possible, let’s see if clubs and players learn from this and we begin to see either a greater proportion of shots being fired towards the Far Post or keepers minding their Near Post just a little less in this coming season.