What follows is a synopsis of my presentation at the OptaPro Forum which was held in London on Thursday 6th February.  This article was first published on the OptaPro blog and can be found on this link.

This analysis was only possible due to the data provided to me by Opta.


Expected Goals

The use of Expected Goals (ExpG or xG) as a metric in football is becoming more widespread.  Even though all current versions of this metric are proprietary and use varying calculation methods, the aim of any Expected Goal measure is simply to assign numerical values to the chances of any given shot being scored.

The ExpG model that I and Constantinos Chappas developed produces a goal probability of approximately 3% for any shot that is struck from a central position outside the penalty area.  Over the past year there has been recognition that shooting from long range is sub-optimal and by doing so a team is merely giving up other, more lucrative attacking options – think Tottenham Hotspur and Andros Townsend.

However, although I will admit that I had jumped to this conclusion in my own mind, I was conscious that the alternative options open to the player in possession had never been evaluated before (at least not publicly).  This desire to establish baseline conversion rates for the different attacking options available to a player who was 25 – 35 yards from goal formed the basis for my Abstract submission.

Opta very kindly granted me access to the detailed match events for the English Premier League 2011/12 and 2012/13 seasons so that I could undertake this study and present my findings.

Those who are interested in the methodology I used can scroll to the bottom of this article, but for the sanity of any casual readers I will go straight to the findings of this study.

How many times was each option chosen?


Figure 1: Number of Opportunities for each FirstAction


Take ons were attempted much less often than any of the other possible attacking options.  With the exception of internal passes, all the other FirstActions were attempted between 11% and 18% of the time.  At least part of the reason why there were so many internal passes is that some of the passes that were destined for forward central, wings or the corners would have been intercepted within the rectangle.  As I’m using the end co-ordinates of the pass, and intentions can’t be measured, these passes fall into the internal pass category.


But how often was a goal scored from each option?

Each possible attacking option carries not only a chance of the team in possession scoring, but also a risk of the move breaking down and the opposition quickly countering, so I wanted to look at the net goals scored.  It seemed reasonable to assume that the choice of attacking option would have a bearing on how likely the opposition were to score.

To calculate the net goals scored figure for each option I deducted the number of goals scored by the opposition from the number of goals scored by the original attacking team (both within 30 seconds from FirstAction taking place).
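The net calculation described above can be sketched in a few lines of Python.  This is only a minimal illustration: the `Opportunity` record, its field names and the sample rows are my own assumptions for the example and are not Opta's data format or real match data.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    # Hypothetical record layout, for illustration only.
    first_action: str   # e.g. "Shot", "Wing Pass"
    goal_for: bool      # attacking team scored within 30 seconds
    goal_against: bool  # opposition scored within 30 seconds


def net_conversion_rate(opportunities):
    """Return (goals for - goals against) / number of Opportunities."""
    if not opportunities:
        raise ValueError("no opportunities supplied")
    goals_for = sum(o.goal_for for o in opportunities)
    goals_against = sum(o.goal_against for o in opportunities)
    return (goals_for - goals_against) / len(opportunities)


# Made-up rows (not Opta data): two goals for, one against, four Opportunities.
sample = [
    Opportunity("Shot", True, False),
    Opportunity("Shot", False, False),
    Opportunity("Shot", False, True),
    Opportunity("Shot", True, False),
]
rate = net_conversion_rate(sample)  # (2 - 1) / 4
```

Grouping the records by `first_action` before calling the function would give one net rate per attacking option.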


Figure 2: Net Conversion Rates for each FirstAction

Shooting is good?

Much to my amazement, the choice of shooting was actually the joint most efficient attacking option for the player in possession to take.

I had certainly expected that a forward central pass would be one of the more efficient attacking options, but due to the lowly 3% success rate of shooting from this area I had expected shots to appear much further down the table.

Eagle eyed readers will have noticed that the net conversion rate for shots of 3.9% is much higher than the 3% I quoted at the start of my piece.  Was I wrong in my initial understanding?

In my dataset a goal was scored directly from the initial shot in 2.9% of cases; however, this was further supplemented by goals being scored from another 1.2% of initial shots due to secondary situations, ie rebounds or players following up.  From this figure of 4.1%, a value of 0.2% was deducted to reflect the number of times that the opposition scored within 30 seconds of the initial shot.  And so we arrive at a net conversion rate of 3.9%.
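For readers who want to check the arithmetic, the three percentages quoted above combine as follows.  The variable names are mine; the figures are the ones stated in the paragraph.

```python
# Percentages of initial shots, as quoted in the text.
direct_pct = 2.9      # goal scored directly from the initial shot
secondary_pct = 1.2   # goal from a rebound or follow-up
conceded_pct = 0.2    # opposition scored within 30 seconds

gross_pct = direct_pct + secondary_pct   # goals for the shooting team
net_pct = gross_pct - conceded_pct       # net of opposition goals

print(f"gross {gross_pct:.1f}%, net {net_pct:.1f}%")
```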

Another surprising aspect is that, on average, a team only scored 1 in every 40 times that they had possession of the ball in the area under analysis.  Without having any real knowledge, I had expected the number to be higher, but I guess it shows that our perception and memories can be misleading – hence why we should use data to aid us in our decision making processes.

What is the significance of these findings?

If these results can be taken at face value then no longer can we criticise a player for “having a go” from outside the area.  He’s actually attempting one of the most efficient methods to score from his current location.

The findings are even more important to weaker teams.  It appears that the option where the stronger teams have less of an advantage over the weaker teams is actually the option with the highest expected value (along with the forward central pass).  I say that shooting is the option that should favour weaker teams because those teams are less likely to possess a number of players that can thread a well weighted through ball or play an intricate pass.  They are also likely to struggle to attack in sufficient numbers to create an overlap down one of the wings or to have as many players in support of the ball carrier as the stronger teams will have.

As well as it being logical that weaker teams could benefit more from this knowledge than stronger teams, I was able to demonstrate this by ranking the teams based on average league points per game and splitting the sample into two halves – Top Half and Bottom Half teams.
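The split itself is simple to reproduce.  The sketch below assumes a mapping of team name to average points per game; the team names and values are hypothetical, and how the two seasons were combined in the actual study is not shown here.

```python
def split_top_bottom(points_per_game):
    """Rank teams by points per game and split the ranking into two halves.

    With an odd number of teams the extra team falls into the bottom half.
    """
    ranked = sorted(points_per_game, key=points_per_game.get, reverse=True)
    half = len(ranked) // 2
    return ranked[:half], ranked[half:]


# Hypothetical teams and values, for illustration only.
ppg = {"Team A": 2.1, "Team B": 1.4, "Team C": 0.9, "Team D": 1.7}
top_half, bottom_half = split_top_bottom(ppg)
```

Each Opportunity can then be tagged with the half its attacking team belongs to before the net conversion rates are recomputed.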

Figure 3: Net Conversion Rates for Top and Bottom Half Teams

As expected, Top Half teams had a higher conversion rate than their Bottom Half equivalents across all FirstActions.  However, we can see that the drop off between the Top and Bottom Half teams is at its lowest for the Shot option and also that a Shot was actually the most efficient option for Bottom Half teams; whereas Forward Central Passes were the most efficient options for the Top Half teams.

Statistical Rigour

I wanted to satisfy myself that the differences in the conversion rates for shots over the other options (excluding forward central passes) were statistically significant.  I also excluded backward passes from these tests as I don’t think players choose a backwards pass with the expectation that their team will score a goal from it.

The Null Hypothesis used was that there were no differences in net conversion rates between the proportions.
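One common way to test such a null hypothesis is a pooled two-proportion z-test.  The sketch below is an assumption on my part about how a test of this kind could be run, not a statement of the exact procedure behind Figure 4, and it treats each net conversion rate as a simple proportion, which is itself an approximation because opposition goals have been netted off.  The counts in the example are invented.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented counts, for illustration only (not the study's figures).
z, p = two_proportion_z_test(117, 3000, 60, 3000)
print(f"z = {z:.2f}, p = {p:.5f}")
```

A p-value below the chosen threshold (0.05 is the usual convention) would lead to rejecting the null hypothesis of equal conversion rates for that pair of options.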


Figure 4: p-values for significance in Net Conversion Rates

It can be seen that the Net Conversion Rates for shots are significantly different from those for corner passes, internal passes and wing passes.  The only comparison that didn’t reach the threshold for statistical significance was shots compared to take ons, and it is my opinion that with a larger data sample these proportions would also emerge as significantly different.

At this stage we have demonstrated that shots from outside the penalty area are just as efficient as forward central passes, and more efficient than the other possible options.  However, I need to address the fact that there could be bias within the net conversion rates of shots.

Possible Sources of Bias

I am aware of four possible sources of bias that could be at play here which could artificially inflate the conversion rates of shots over other options.

  1. Team Quality
  2. Score Effects / Game State
  3. Lack of Defensive Data
  4. Natural Selection

I will briefly discuss each of these in turn and address them where possible.

1 – Team Quality Bias

We have already seen that Bottom Half teams convert shots at a higher rate than other options and that Top Half teams also convert shots at a relatively high rate.  There are no statistically significant differences in how Top and Bottom Half Teams convert shots.

2 – Score Effect Bias

It is accepted that the styles and tactics teams use vary depending on the scoreboard.  A team that is trailing is more likely to attack in numbers and a team that is leading may remain more compact.  It could be possible that shots are being attempted, or converted, at different rates depending on the current score line.

To investigate if this was the case I temporarily removed the Opportunities that occurred when there were two goals or more between the teams.  I then analysed the remaining Opportunities by looking separately at Opportunities which arose when the game was tied and when the game was close (ie tied or just one goal between the two teams):


Figure 5: Net Conversion Rate at Close and Tied Game States

Shots in the entire sample were converted at 3.9%; this is the same conversion rate as for Opportunities arising when the game is tied and almost the same as for Opportunities occurring when a team is leading by just a single goal.

It appears that shots are converted at broadly similar rates regardless of the current match score, and so there is no bias attributable to this source.
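The game-state filter used in this section can be expressed as a small helper.  The function names and the idea of passing in a signed goal difference (attacking team's goals minus the opposition's at the time of the Opportunity) are my own illustrative choices.

```python
def game_state(goal_difference):
    """Label an Opportunity by the margin between the teams when it arose."""
    margin = abs(goal_difference)
    if margin == 0:
        return "tied"
    if margin == 1:
        return "one-goal"
    return "excluded"  # two goals or more between the teams


def is_close(goal_difference):
    """A 'close' game is tied or has just one goal between the teams."""
    return abs(goal_difference) <= 1


# Hypothetical goal differences for five Opportunities.
states = [game_state(gd) for gd in (0, 1, -1, 2, -3)]
```

Net conversion rates can then be recomputed separately for the "tied" subset and for everything that passes `is_close`.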

3 – Lack of Defensive Data

The Opta dataset is very comprehensive in relation to on the ball events, but unfortunately I was not given any data that could help me ascertain the amount of defensive pressure on each Opportunity.

It could be possible that players shoot from Opportunities which have the greatest chance of their team scoring a goal and they only take other options such as passes to the wings or the corners when a shot is not possible.  Conversely, there will also be occasions when a player could take a shot but opts instead to play a ball for an overlapping runner or attempts to thread a through ball inside the penalty area.

I do not have the data to be able to form an opinion on this either way, but am making the reader aware that this could be a potential source of bias.

4 – Natural Selection

In analysing this dataset I do not have knowledge of the tactics that each team attempted to use on match day or the instructions that were handed down by coaches and managers to the players.  The final potential source of bias identified is the possibility that the only players that attempted to shoot as a FirstAction were elite shooters (think Gareth Bale last season).

A player that is poor at long range shooting could be instructed not to shoot from an Opportunity or to always seek out the elite shooter.  If this was the case, then the 3.9% Net Conversion Rate for shots that was observed in my dataset wouldn’t be representative of the entire sample of Premier League players.

I would counter that by saying that we know that it’s not just elite players that shoot.  There will have been long range shots taken during the last two Premier League seasons by players who are not skilled in shooting.  So this figure of 3.9% will already be diluted (to some unknown extent) by the conversion rate of non-elite long range shooters.

Even if non-elite shooters are expected to have a conversion rate below the average of 3.9%, the magnitude of the buffer in conversion rates enjoyed by shots over the alternative plays of wing, corner and internal passes and take ons is sufficiently large to suggest that taking a shot may even be the optimum FirstAction for non-elite shooters.


The purpose of this study was to establish benchmark conversion rates for each possible attacking Opportunity given a defined area of the pitch.  I knew that I couldn’t capture all the information that existed for each individual Opportunity, but given the extent of the dataset used in this analysis I assume that I have obtained a representative sample on a macro level.

Given the visibly low conversion rates from long range shots I was surprised at just how efficient (relatively speaking) this option was.  This reinforces the fact that it is not enough to simply know the success rate for any option; we must also be able to reference that against the opportunity cost or success rates of the other possible options.

I am not suggesting that players should shoot on every attack; however I have demonstrated that we should be wary of criticising players for attempting to shoot, especially those in less technically gifted teams.  This study has shown that where players have opted to shoot it was, generally, the most efficient option open to them.

Armed with the information in this study it is no surprise that Tottenham had the highest conversion rate of their Opportunities over the last two seasons.  Gareth Bale would certainly have contributed to the success rate last season, but the North London side converted their Opportunities in both seasons at 3.8% and Bale did not have an exceptional shooting performance during the 2011/12 season.

The logic and methodology used in this study could be carried out on other areas of the pitch and thus benchmark conversion rates could be established as required.



I followed the flow of individual match events and created possession chains.  For this analysis I was only interested in possession chains which had an attacking event (ie pass, shot or take on) take place within the boundaries of the red rectangle as displayed in Figure 6.  Where an attacking event did take place within the rectangle I labelled this an “Opportunity” and it forms part of this analysis.

The boundaries of the rectangle in Figure 6 can be described (in Opta parlance) as:

80 ≥ x ≥ 67
65 ≥ y ≥ 35

In plain English, I was concentrating on Opportunities which occurred 23 – 37 yards from goal and in the central third of the pitch.
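As a rough illustration of the boundary test, the sketch below assumes Opta's usual convention of x and y running from 0 to 100, with the attacking team shooting towards x = 100.  The pitch length of 115 yards is my assumption (pitch sizes vary), so the yard figures it produces only approximate the 23 – 37 yards quoted above.

```python
# Rectangle limits taken from the text (Opta 0-100 coordinates).
X_MIN, X_MAX = 67, 80
Y_MIN, Y_MAX = 35, 65


def in_rectangle(x, y):
    """True if an event location falls inside the analysed rectangle."""
    return X_MIN <= x <= X_MAX and Y_MIN <= y <= Y_MAX


def yards_from_goal_line(x, pitch_length_yards=115):
    """Distance from the attacked goal line; pitch length is an assumption."""
    return (100 - x) * pitch_length_yards / 100


near_edge = yards_from_goal_line(X_MAX)  # edge of the rectangle nearest goal
far_edge = yards_from_goal_line(X_MIN)   # edge of the rectangle furthest from goal
```

With the assumed 115-yard pitch the two edges come out at about 23 and 38 yards from the goal line.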

Over the two Premier League seasons there were almost 24,000 such Opportunities to analyse.



Figure 6: Rectangle showing boundaries for Opportunities

For my analysis I decided to have seven categories of attacking options based on the FirstAction carried out by the player within the rectangle.  These were:

  • Internal Pass  (red)
  • Corners Pass (yellow)
  • Wing Pass (black)
  • Forward Central Pass (blue)
  • Backwards Pass (orange)
  • Shot
  • Take on

To aid identification the colours noted above relate to the colours of the zone boundaries shown in Figure 7.


Figure 7: Boundaries of Five Passing Zones

To determine whether a goal was scored from each Opportunity I took the time of the FirstAction and allowed a period of 30 seconds to elapse to see if the attacking team scored a goal.  I decided to use 30 seconds in an attempt to allow fluid passing movements to have a reasonable chance of concluding whilst trying not to contaminate the analysis with events from subsequent movements.

The reason that I chose a time based cut off instead of following the move until the team lost possession is that a clearing header by a defender does not necessarily mean the end of an attacking movement, as the ball could drop at the feet of the original attacking team.  Creating logic to determine when possession was really lost is challenging and subjective, and so I avoided this method.
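The time based cut off can be sketched as below.  The representation of goals as `(time_in_seconds, scoring_team)` pairs is a hypothetical simplification of the event feed, and treating both ends of the 30 second window as inclusive is my assumption.

```python
WINDOW_SECONDS = 30


def goals_within_window(first_action_time, attacking_team, goal_events,
                        window=WINDOW_SECONDS):
    """Return (scored_for, scored_against) for one Opportunity.

    goal_events is an iterable of (time_in_seconds, scoring_team) pairs,
    a hypothetical simplification of the match event data.
    """
    scored_for = scored_against = False
    for goal_time, team in goal_events:
        if first_action_time <= goal_time <= first_action_time + window:
            if team == attacking_team:
                scored_for = True
            else:
                scored_against = True
    return scored_for, scored_against


# Made-up example: FirstAction at 600s, goals at 612s (home) and 700s (away).
goals = [(612, "home"), (700, "away")]
result = goals_within_window(600, "home", goals)
```

In this made-up example only the goal at 612 seconds falls inside the window, so the Opportunity counts as a goal for the attacking team and nothing against.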

Free kicks were excluded from this analysis.

  • http://blogs.columbian.com/portland-timbers/ Chris Gluck

    So you offered this up as your own idea at the #optaproforum? Wow…

    I read many of those presentations and the one that struck me as being the best was the one by Robb Carroll – Context – Context – Context… even did a Podcast on it…

  • http://blogs.columbian.com/portland-timbers/ Chris Gluck

    A few causes for concern I see…
    1. Technical flaw – You have excluded crosses, throw-ins, switches and long balls originating from outside the final third. Part of that technical fault rests with how OPTA categorizes successful crosses – OPTA does not define them as successful passes – but for some reason an unsuccessful cross IS quantified as an unsuccessful pass?!? In addition, OPTA does not quantify a Throw-in as a successful or unsuccessful pass when from a definition standpoint a throw-in is a legal ‘pass’ in soccer as the ball clearly travels from one player to another.
    2. Technical flaw – A defensive clearance as defined by many of us who have experience in coaching the game is a ball ‘cleared out of the danger zone’ by a defender that does not, in turn, result in immediate pressure being applied by the team who had possessed the ball before it was cleared. If OPTA analysts are ticking the box ‘defensive clearance’ and the ball does not leave the danger zone / final third then the ball was not a ‘defensive clearance’.
    3. The “IDEA” of tracking and associating the ‘creation of goal scoring opportunities’ with shots taken and then goals scored is not NEW and did not originate with you…

    • Conor

      On a scale of 1 to 10, how pompous did you set out to be with your post?

      I never comment on things like this but you were so irritating that it prompted me to reply. Is there a point to your post? You claim to be pointing out flaws in the piece; however, it seems that you haven’t actually read the method behind it.

      1: The author specifically wants to analyse the efficiency of shots taken from the zone 23-37 yards from goal in the central third of the field. He is comparing these to the options available to a player receiving possession within this zone, the options that you have considered excluded from the piece are not appropriate to analysing this problem.

      2: The author already notes a different approach that he could take in his method, ie using a different method for defining the end of a phase of play. Personally I like his current method, to me it comes down to a choice between being a little less accurate so that the investigation of the problem is less time consuming. My intuition would say that little has been lost by following the timed phase of play method, it is also more objective than trying to determine when an attacking movement has ended.

      3: I cannot speak with any knowledge as to the material getting delivered at an OptaPro Forum. What I would say is that this article is a step forward for the Statsbomb site. I feel that the points made in the article http://statsbomb.com/2013/08/shooooot-a-paradigm-shift-in-how-we-watch-football/ have been an influence on the overall attitude of this site to shots from non-prime locations. This latter article adds more context to the numbers and may have completely flipped the perception of this sort of shooting going forward. That is quite a progression. Is there anywhere else in the public domain where this conclusion has already been drawn? Regardless of whether the idea behind the article is unique it is still of value if the results have never been presented previously.

      • http://blogs.columbian.com/portland-timbers/ Chris Gluck

        Aye, normally I would just offer congrats on some great research like we see on a regular basis on other statistical analyses but this idea to evaluate ‘goal scoring opportunities created’ has been my own research in the MLS for the last 18 months or so – so it’s very annoying to see it pop up here with data that was provided freely – I have asked OPTA to provide me current data discussing the same information and they refused to provide it. And that initial request was made to OPTA just about 12 months ago; so to be honest I don’t care if you understand my view or not.

        • Conor

          I apologise for the start of my post, given I had no knowledge of the context surrounding your post I regret being so adversarial.

          My reason for the comment was I felt the faults were being targeted at the piece and was not aware of it really being subject towards the faults with Opta, seems to be a common experience for those dealing in this analysis. A pet hate of mine is unconstructive criticism and this was how I perceived your comment at the time.

          One thing I would say, this is certainly not intended as a troll, is that from learning a small part of your reasons behind the comment I don’t think I’d personally be frustrated. The piece whilst bringing up a very interesting point is narrow in the scope of what it set out to achieve, it seems like there is still plenty of room for the research you’ve been planning. Personally I’d be encouraged by the fact that the above piece can be written using information that is freely available, although I do understand annoyance at paying for a service that isn’t up to scratch.

          • http://blogs.columbian.com/portland-timbers/ Chris Gluck

            In offering some additional thoughts on this stream – on the contrary I am very pleased to see that a relationship exists – it proves my thesis on midfielders and the lack of publicly available knowledge to better improve the statistical analyses involved in soccer. Furthermore this piece also provides a great example that soccer analyses is NOT moneyball – it’s a completely different game. FWIW my research on midfielders value in influencing outcomes from games began in late 2012. It was at that time that I approached OPTA for data in their playground to offer up points specifically discussed here by Colin. As correctly noted my angst is not with the very good methodology used by Colin, his work always shows strong methodology – the angst is that there is even more to garner from the data that OPTA collects but they simply refuse to offer it up without payment. My email trail includes speaking with Carlotta Romero Azner, Simon Farrant, and Simon Banoub beginning 4-29-13. In using the OPTA playground data their policy is that you, as the writer, waive copyright to the data if that information is used to create an article that has value. here is an excerpt of my email stream with Simon Farrant…

            “I am a writer/blogger for the Columbian Newspaper in Vancouver, Washington – I would offer it up as part of my continuing blogs about the Portland Timbers… here’s a link to my blog site to confirm… http://blogs.columbian.com/portland-timbers/
            “If you take the time to read through my previous articles you might notice that I am developing a pretty powerful relationship with my Possession With Purpose data points – my sight and my research has copyright and as appropriate OPTA will be referenced as the source for the data they provide…”

            “If you review my initial queries for other data some time ago I was quite keen to gather ‘red-zone’ (final third) data on penetrations, as well as ‘linked passing combinations’ that show cradle to grave movement of the ball when a team scores… ”

            “The other data set I am interested in is “plus/minus” for ‘time on the pitch relative to goals scored for and against – this would be one of my weekly statistic updates like they track and offer up for the National Hockey League…”

            “Look forward to additional correspondence as I’d like to author a book at the end of this year about those statistical outcomes… my current R2 for my PWP line is .987 – that is a pretty powerful relationship between time of possession, penetrations of the attacking third, creation of goal scoring opportunities, defensive (opponent clearances), shots taken, shots on goal and goals scored…”

            So with an R2 of .987 that is very significant and there is simply so much more that can come from analyzing penetrations and creation of goal scoring opportunities…

            I probably should have included this in my original comments but my angst was too strong – so kudos to Colin for getting free data – my route offered by OPTA required me to pay for it AND then have no copyright for it…

            all the best,


      • Colin Trainor

        FWIW Conor, I thought there was nothing wrong with your original reply.

        • http://blogs.columbian.com/portland-timbers/ Chris Gluck

          lol… that is no surprise coming from you colin…

          • Colin Trainor

            Just didn’t want Conor to feel bad when your initial comment seemed to take umbrage at my methodology. Rightly or wrongly, the area I had to work with was approved by the OptaPro Forum judging panel and so that was the subject of my study.
            Conor was merely pointing that out to you.

            It now transpires that there may have been other reasons for your angst. But those aren’t the fault of Conor or myself.

      • http://blogs.columbian.com/portland-timbers/ Chris Gluck

        Conor, if you’d like to read my Possession with Purpose – an introduction and explanation, you can see it here on AmericanSoccerAnalysis… http://americansocceranalysis.wordpress.com/2014/01/15/possession-with-purpose-an-introduction-and-some-explanations/ In there I speak specifically to goal scoring opportunities created and how those relate to overall team performance… my PWP Indices for attacking and defending… as noted elsewhere more needs to be done looking at defensive activities – that is my special project this year in the MLS.

        Best, Chris

        • Conor

          I really like the breakdown of the processes of a possession.

          Have you looked into how predictive the different steps are? It seems like this is a nice way of extending from PDO. PDO measures step 6, and PDO isn’t considered strongly predictive. I can’t remember seeing an article quantifying how predictive PDO is; it also seems like its strength is team specific, in that Manchester United under Fergie and Barcelona in recent years have had high PDOs that looked to be based on strong fundamentals. How strong predictively are the corresponding measures for each of the other steps? Who are the teams fundamentally better than the rest in the certain steps and who looks to have gotten lucky so far?

          I think it’s an interesting structure for comparing the route that has led a team to their success. I’d be interested in reading a comparison of Atletico Madrid vs Barcelona, who are on the same number of points in La Liga but have gotten there in very different ways.

          It would be interesting to look into linking pace of play http://statsbomb.com/2013/11/pace-and-margin-for-error/ to your methodology in some way also. The index uses percentages for each step, so adding in pace would give a new level of context to the index. It would be an interesting study to try to come up with an optimal formula for Pace per PWP. My intuition would say that as PWP increases a team should look to play faster. The rationale being that better teams want to reduce the amount of chance in a result.

          • http://blogs.columbian.com/portland-timbers/ Chris Gluck

            Conor, Thanks for taking time to review my latest update to PWP; I will do my best to answer your questions and provide additional thoughts… they are not easy questions to answer 🙂

            I took some time to read through the article by Ted as I had not read it previously. Very interesting, and as usual from Ted some thought provoking information. I have looked into how predictive the teams in MLS are with respect to my statistics and would offer these examples as comparison.

            Before starting I will offer my overall view of statistical analysis and football that may help set the conditions for answering your questions…

            There are huge variations in the game and how it is played – in my view to have context (value) the relationships of the basic statistics of the game need to point out what content is needed to support the analysis on team performance relative to points garnered in the league table.
            In short I see this game as a consolidated combination of various individual events collectively analyzed like the team is an amoeba – there are instances where an individual player will certainly influence behaviors and outcomes but the amoeba approach on PWP shows well. (R2 = .9000)

            In offering that, PWP is “strategic”; a 3-5-2 versus a 4-3-3, or ‘direct’ versus ‘possession based’, is “operational”; and statistics on individual strikers are “tactical”.

            Mixing direct attacking team individual goal scoring statistics to individuals who play more possession based attacking versus direct attacking (or 3-5-2 versus 4-3-3) is mixing apples and oranges as the service and approach to the striker is simply not the same unless you can clearly define (statistically) where and how often a team penetrates and generates GSOs…

            What I discovered was that the teams with the highest percentages in Step 4 (shots taken divided by average passes completed in the final third), namely Chicago, Philadelphia and Dallas, were also more likely to play direct attacking football across the pitch, as opposed to a more ground based attack like Portland, Real Salt Lake and LA Galaxy. What that is saying is that those three teams had more shots with fewer passes in the final third – quantity of shots versus quality of shots…

            The three teams with the lowest percentages in Step 4 were Portland, Real Salt Lake and LA Galaxy; fewer shots with more passes completed (speaks to quality shots occurring more frequently)… especially when looking at the end result (those three teams were superb in the standings last year finishing 1st, 2nd and 3rd out West).
            Of note is that none of Chicago, Philadelphia and Dallas, who had the three “best” percentages in PWP Step 4, made the playoffs.

            What does all that mean? For me it’s pretty clear – the volume of passes inside and outside the attacking third – not the speed of the ball traveling the length of the pitch show a stronger relationship to points garnered in the league table.

            But… the ‘pace’ as defined by Ted cannot be discounted when it comes to associating Direct attacking football with Ties.
            Portland finished top in MLS last year with 15 ties (volume of passes inside and outside the attacking third) but Dallas finished tied for 2nd with 11 ties and Philadelphia finished 3rd highest with 10.

            So in that respect the traveling of the ball by the ‘direct attacking teams’ showed similar results as what Ted offered yet the compelling evidence in looking at who finished first versus worst is more clear when viewing the volume of passes completed inside and outside the final third versus the ‘quality’ of the shot as opposed to the ‘quantity of the shot’. At least for me it does…

            Other additional evidence is provided here: In looking at Dallas, Chicago and Philadelphia (direct attacking teams) they all finished 15th worst or worse in volume of passes completed in both areas. By Ted’s definition their pace in traveling up and down the pitch should have been quicker (ball movement end to end) but none of those three teams finished in the playoffs.

            Conversely – Portland , Real and LA were top three in passes completed both inside and outside the attacking third – for me that represents quicker play but more deliberate play in moving the ball – so yes it’s about penetrating the attacking third but the pace, as defined by Ted, is not (to me) as critical as the deliberate possession-al ball movement with shorter, quicker passing to create and use space.

            Each individual step on its own (excluding goals scored – R2 ~ .6000) had very little individual correlation (R2 < ~.2500) to points garnered in the league table – but when collectively analyzed together their relationship was extremely strong.
            What is disappointing for me is that if physical penetration counts were made available (and their locations) – as well as GSOs (and their locations) – the R2 for the analysis should hover near ~.9876 as opposed to just .9000. Funny that I’m not satisfied with an R2 of .9000 but it is what it is.

            As noted by Ted recently, and myself, numerous times here across the pond in the US, the level of detail and granularity provided by Opta lags considerably and 'gets in the way' of more accurate statistics that others can use to better understand the differences and better predict outcomes.

            By the way – I have not used PDO – I like my R2 from PWP and will like it even more if I can get PA3 (penetrations into the attacking third) and more precise GSO’s. So I don't measure luck if that is what PDO is for – I measure the propensity that consistency of purpose will get you better results than inconsistency – to me that Amoebic approach on understanding the value of PWP speaks to consistency and therefore 'luck' has no value in being measured. 🙂

            Team wise, Real, Sporting KC and Portland had the highest possession percentages; passing accuracy was Real, Portland and Montreal in that order; shots taken was Chicago, Sporting and LA Galaxy. Shots on goal was Chicago, Real and LA; goals scored was New York (Supporters’ Shield winner), Real and Portland… the bottom third were mostly Chivas, DC United and Toronto, though DC United actually showed high accuracy in passing but simply didn’t have the goal scorers to ‘finish’…

            Bottom line at the bottom – I hold steadfast that predictive modeling should not occur until the baseline data source has reached a target for sampling; for me predictability, with any value in the MLS, can only occur when all the teams have completed at least 15 games; ideally that number should be 22 but I’m willing to begin predictability after 15 games.

            I also won’t take last year’s information to statistically predict the new year’s information given the shite scheduling approach in MLS – Portland hosted New York last year but this year that game is played in New York – so home versus away from year to year is completely different. Even more so, Portland played Seattle in Seattle twice last year – this year they play them at home twice and away once. One year in MLS is simply not statistically relevant (at the outset) to the next year.

            I have often wondered what it would be like to work more on some collaborative ideas with some others who post here; perhaps that happens in the future – perhaps not?

            At the end of the day I am hopeful that gives you a better view on the internal relationships and answers your questions/additional thoughts.

  • John Frasene

    Great piece, it makes you wonder if there are under-valued players out there who routinely hit the target from distance. I think I saw that you mentioned Cabaye on Twitter. Those types of players may be able to create a significantly higher expected goal value than the average player when shooting from distance by their shot placement and rebound chances.

  • staty

    Very interesting piece. Apart from the interesting findings, it is also well done that you critiqued your study yourself.

    The discovery that shooting from distance isn’t always wrong fits the result of a simple correlation analysis: In 2012/13 in Europe’s 4 major leagues, the correlation between shots from outside of the box and points was -0.22 – not very far from 0. Shots from outside and points correlate with 0.43 (shots from box: 0.68). Indeed, shooting from outside isn’t counterproductive.

    There are two interesting points that came into my mind:
    1. How probable is a turnover 30 (?) seconds after the “first action”? This could be important because retaining the ball can also be a kind of success, if you can’t achieve a goal.

    2. What are the values in other zones, e.g. the halfspaces or on the wings? Especially, the total conversion rates in those spaces are interesting. Are they different from the center?

    If you haven’t got the data to investigate that, just forget my questions. If you have the data, you can either do a study or you can also forget the questions.

  • Michiel Derksen

    Great piece, well done. One suggestion for improvement: goals scored and goals not conceded have been shown to have a different value in terms of expected points from a game. One could take this into account by giving a higher weight to opposition goals, which would reduce the effectiveness of options that are more likely to result in a counter-attack (intuitively that would be the internal pass).
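
A minimal sketch of the weighting suggested above, assuming each attacking option has an estimated probability of leading to a goal for and a goal against. The function name, the weight of 1.5 and the probabilities are hypothetical placeholders, not values from the study.

```python
def net_option_value(p_goal_for, p_goal_against, conceded_weight=1.0):
    """Value of an attacking option, penalising goals conceded by a chosen weight."""
    return p_goal_for - conceded_weight * p_goal_against


# Hypothetical probabilities for one option (not taken from the article's data).
equal_weight_value = net_option_value(0.030, 0.010)
# Weighting opposition goals more heavily lowers the value of the same option.
heavier_weight_value = net_option_value(0.030, 0.010, conceded_weight=1.5)

print(equal_weight_value, heavier_weight_value)
```

With a weight above 1, options that more often lead to a counter-attack lose value relative to the others, which is the effect the suggestion describes. The right weight would have to come from an expected-points analysis.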

  • Errorr

    This meshes nicely with the previous post about conversion of shots to goals v. ExpG. I commented on that post that I thought that the extraordinary success Suarez had converting shots to goals might actually suggest he is forgoing bad shots and instead possessions are ending with no shot whatsoever.

  • http://americansocceranalysis.wordpress.com Matthias Kullowatz (@MattyAnselmo)

    I really appreciated this piece, Colin. I enjoyed your logic and methodology.

    I was hoping to pick your brain about that 1.2% chance of scoring after the initial shot from distance. Obviously that can come from rebounds back into play, or rebounds out to corners and throw-ins that subsequently lead to goals. Did you get a sense, or do you have the data to answer, the following questions?

    1) How much of that 1.2% comes from rebounds back into the field of play? Or perhaps more importantly, what proportion of all subsequent shots come from rebounds back into the field of play?

    2) And what fraction of rebound shots come within 1 second, 2 seconds, 3 seconds, etc. of initial strike?

    Even if you just have an intuitive sense from staring at computer screens until your eyes bleed, I’ll take it 🙂
