The Unbearable Likeness of Arsenal

Arsenal have been a tremendously consistent team over the past decade. However, the tide now seems to be turning against Arsène Wenger, and many fans would happily trade in that consistency for a few more league titles. If you're the fourth best team in the league over 12 seasons, how many league titles should you expect to win? If variance can swing Leicester a league title, you would think it could have thrown a few Arsenal's way, too. This is the question that Chris Anderson (writer of The Numbers Game) asked on Twitter last week.

Fortunately, there are a couple of different ways we can try to answer this. One way is to use bookies' odds to estimate the probability of a team winning a match. Using these probabilities, we then simulate an entire season and estimate what the probability of a team winning the league really was. There are some problems with this method (see the appendix); however, it does allow us to get some ballpark numbers.

Zero league titles

So what is the chance of Arsenal winning no league titles since 2004? By the method described above, the probability of Arsenal winning 0 league titles from 2004/05 to 2015/16 (inclusive) is estimated at 14%. With a crude simulation like this, it’s probably not wise to read too much into the exact numbers. Nonetheless, this does suggest that Arsenal have been somewhat unlucky not to bring back a few titles over the last 12 seasons. So how many league titles should we have expected Arsenal to win?
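As a rough sketch of how per-season estimates combine into a single figure like this: if the seasons are treated as independent (an assumption), the chance of winning no titles is the product of each season's chance of not winning, and the expected number of titles is the sum of the per-season title probabilities. The probabilities below are made-up placeholders for illustration, not the output of the simulation.

```python
from math import prod

# Hypothetical per-season title probabilities for 12 seasons.
# Illustrative placeholders only; the real values come from the simulations.
title_probs = [0.25, 0.15, 0.10, 0.20, 0.15, 0.15,
               0.20, 0.05, 0.10, 0.15, 0.10, 0.20]


def prob_no_titles(probs):
    """Chance of winning zero titles, assuming seasons are independent."""
    return prod(1.0 - p for p in probs)


def expected_titles(probs):
    """Expected number of titles: the sum of per-season probabilities."""
    return sum(probs)


print(f"P(no titles)    = {prob_no_titles(title_probs):.3f}")
print(f"Expected titles = {expected_titles(title_probs):.2f}")
```

The same product rule applies to any "every season" streak, such as finishing in the top four or above a rival each year.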

12 top four spots

It seems unfair to consider the bad luck Arsenal have had and yet not consider the good luck they have benefited from elsewhere. If failing to win a league title or two is unlikely given (the bookmakers’ opinion of) their overall performance, how unexpected is continued qualification for the Champions League?

The exact estimate here is <1%. When a number gets that low, it’s difficult to get a meaningful interpretation. I think the take-away message is that finishing in the top 4 consistently, even when you’re Arsenal, is very unlikely and requires large doses of both skill and luck.

St Totteringham’s Day

According to Wikipedia, St Totteringham’s Day is “held on the day when Arsenal have gathered enough points to be mathematically assured to finish ahead of Tottenham in the league table”. Celebrated for generations without fail, there have been a few close calls in recent years (not to mention what will happen this year).

Although the individual probabilities remain high, the combined probability of Arsenal finishing above Tottenham in all of the last 12 seasons is just 15%. It is of course possible that there are some factors that the odds may not account for. However, even if you believe that the odds are missing an adjustment for Spursy-ness, it still represents an unlikely achievement.

I don’t think it’s at all surprising to conclude that Arsenal were unlucky to miss out on a couple more league titles. I think most football fans could have imagined Arsenal taking the league at times in 2015/16 or 2007/08. It’s tempting to wonder what could have happened had everything gone their way just once. Old stars might have stayed; new stars might have joined. You can see how a bit of luck at the right time can kickstart a chain reaction of improvement (looking at you, Chelsea 2011/12).

Ultimately, looking at it like this misses the point. Most Arsenal fans aren’t disappointed by one missing title. Instead, it is the sheer repetitiveness of each season. Every year the same mistakes are made on and off the pitch. At least in Groundhog Day, Bill Murray learned a thing or two each time he woke up. A bit more silverware and the resulting improvements in the playing squad would have helped paper over the cracks. But football is (slowly) modernising and those who fail to learn from their mistakes risk falling behind.


1. Figures

We can abstract the charts above a step further and look at the finishing probabilities of the team in each season since 2004/05. You can view the same charts for all Premier League teams on github.



It looks like 2011/12 marked the starkest change for Arsenal. It was in the preceding summer that Fabregas and Nasri left.

Manchester City


Money, what money?

Manchester United


I wonder what happened circa 2013/14?



2. Nerd nonsense


  • Take bookmakers’ home/draw/away odds.
  • Convert odds to probabilities.
  • Using prematch probabilities, simulate each game 10,000 times.
  • For each simulation:
    • Assign points to home and away teams (3 points for a win, 1 for a draw, 0 for a loss).
    • Add up the points and sort to create a league table.
    • In the event of a tie, teams on equal points are ordered randomly.
  • For each season:
    • Sum the number of times a team finished in each position.
    • Divide by the total number of simulations (10,000) to estimate the probability of finishing in each position.
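The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the analysis: the function names are hypothetical, the odds are assumed to be decimal odds, and stripping the bookmaker's margin by proportional rescaling is an assumption (the list only says the odds are converted to probabilities).

```python
import random
from collections import Counter, defaultdict


def odds_to_probs(home, draw, away):
    """Convert decimal odds to probabilities that sum to 1.
    Proportional rescaling to remove the margin is an assumption."""
    inverse = [1.0 / home, 1.0 / draw, 1.0 / away]
    total = sum(inverse)
    return [x / total for x in inverse]


def simulate_season(fixtures, rng):
    """fixtures: list of (home_team, away_team, (p_home, p_draw, p_away)).
    Returns the teams ordered first to last for one simulated season."""
    points = defaultdict(int)
    for home, away, probs in fixtures:
        result = rng.choices(("H", "D", "A"), weights=probs)[0]
        if result == "H":
            points[home] += 3
            points[away] += 0  # registers the team even on zero points
        elif result == "A":
            points[away] += 3
            points[home] += 0
        else:
            points[home] += 1
            points[away] += 1
    # Random secondary key: teams level on points are ordered randomly.
    return sorted(points, key=lambda team: (-points[team], rng.random()))


def finishing_probs(fixtures, n_sims=10_000, seed=0):
    """Estimate P(team finishes in each position) over n_sims seasons."""
    rng = random.Random(seed)
    counts = defaultdict(Counter)
    for _ in range(n_sims):
        table = simulate_season(fixtures, rng)
        for position, team in enumerate(table, start=1):
            counts[team][position] += 1
    return {team: {pos: n / n_sims for pos, n in ctr.items()}
            for team, ctr in counts.items()}


# Toy three-team league with made-up odds, purely for illustration.
teams = ["A", "B", "C"]
toy_fixtures = [(h, a, odds_to_probs(2.0, 3.5, 4.0))
                for h in teams for a in teams if h != a]
toy_table = finishing_probs(toy_fixtures, n_sims=1000)
print(toy_table["A"])
```

Because every match is drawn independently from its pre-match probabilities, the sketch inherits the limitation discussed in this appendix: nothing that happens in one simulated game changes the odds of a later one.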

By taking the bookies’ probabilities game-by-game, in-season effects are, to some degree, taken into account. However, it also means that in each of the simulations, all the injuries, managerial changes, suspensions and the like happen at the same time as they did in real life. As a result, the probabilities grossly underestimate the amount of uncertainty in a season of football. To use a specific example, the simulation does not account for the possibility of Vardy, Mahrez or Kanté having a long-term injury in 2015/16. In the same way, season-to-season changes aren’t accounted for. If Arsenal had won the league in 2010/11, it may have made keeping hold of their star players (and acquiring new ones) over the next few seasons that much easier.


Data was originally sourced from The odds used were from the following bookmakers:

  • 2004/05: Bet365
  • 2005/06 to 2011/12: Betbrain average
  • 2012/13 to 2015/16: Pinnacle closing odds.


The code, data and figures used for this analysis are on github.

Men on Posts and Starting Fires

I mentioned on Twitter recently that while I try to avoid disagreements when I am in a room with traditional football people, the one thing that is most likely to set off an argument is the topic of men on posts. Today I want to explain why that is the case, while covering a variety of other topics along the way.

Men on Posts

I swear to you, this topic comes up almost every week on highlight programs and game commentary. It is perhaps a bit less prevalent than discussion about the failings of zonal marking here in England, but it's an old favourite for the back-in-my-day commentator crowd. In 99.99% of the cases, it is also nothing but dead air and might as well be replaced with any other cliche that gets spouted by the same commentator crowd. (We need better, smarter commentators, but that is a topic for a different day.)

My perspective on men on posts is that I almost never use them for defensive set pieces. There are a lot of reasons for this, but the basic principle is that I prefer active defending to passive and this takes one or two players completely out of the play where their only job is to act as last resorts. Now this isn't to say that I would never put men on posts. There are specific teams and situations where they are beneficial, but those are fairly unusual.

However, my preference for set piece defense isn't usually what starts the argument. Once the subject is broached, the conversation usually goes like this.

Me: "How do you feel about men on posts when defending set pieces?"

Traditionalist: "Oh, I would always have a man on [near|far] post and sometimes a man on the other one."

Me: "Why?"

Trad: "Because [reasons]."

Me: "Okay, but how do you know?"

And this is where things invariably get awkward because usually they "know" because someone taught them this was the correct way to do things. Or possibly some anecdotally negative experience like, "we didn't have men on posts in this game, and the opposition scored a goal in the corner," changed previous behavior and now they protect against that scenario.

The problem here for someone like me is that when analysing most topics in football, I start back over at base principles. How do I know something? Well, I studied it. I typically take a large amount of qualitative and/or quantitative data, break it down, and then look at the outcomes to see what's there. Then I ask follow-up questions and pick at the results some more until I am comfortable I understand what I'm seeing.

This doesn't mean I am right. It's not about being right. It's about being knowledgeable in an area that is important*. And it means I have a foundation upon which to have conversations. Conversations and arguments tend to illuminate what you do and do not know, and highlight areas for further investigation. This is important, especially in football which, if we're being totally honest, is a game that we really don't understand very well right now. This includes most of the ranks of professional coaches around the world.

*important to the performance of your football team, at least. In the greater scheme of crazy world events, understanding set piece defending matters not a nip.

It also doesn't invalidate knowledge learned from years of working on the pitch. It just means that if you believe a thing to be true, you need to explain how you came to those conclusions, and the reasons need to hold up to scrutiny. If they do, great. If not, let's study the issue and see if the accepted wisdom that you believe to be true is correct.

So yeah, when you ask questions about how someone "knows" a thing, and maybe question the validity of that knowledge, you can cause problems. But the fact of the matter is, we should be doing this constantly inside of clubs because it leads to valuable research that can change behavior and develop more effective styles of play.

A goal in the Premier League is worth something like £2M. How many of those do we leave on the table because someone's knowledge is outdated or just plain wrong? (For what it's worth, on defensive corners, my players have shit to do instead of loafing around, leaning against goal posts. We save that sort of behavior for useless analysts, as it's the footballing equivalent of mooning the queen, donchaknow?)

A Good Question?

Someone noted over the weekend that Manchester City seem to prefer outswinging corners these days to inswingers. This is notable for two reasons.

First, a few years ago under Roberto Mancini we were told that City started using only inswinging corners because someone in the team had done a study and found that inswingers were more effective at generating goals.

Second, this switch to outswingers seems a direct contradiction to research previously done by this exact same team.

Odd, no?

James Yorke started poking around the data a little bit, as we tried to figure out what data they looked at to come to whatever conclusion it was that changed their behaviour. This led back to a far more important problem that is often overlooked:

What question were they trying to answer?

It certainly doesn't seem to be "which delivery is more likely to score goals?" since that either leaned toward inswingers or was inconclusive, depending on what data was used.

However, what James did find was that outswingers were far more likely to be completed to a teammate. So if they were trying to answer the question of "which delivery is more likely to let us keep possession?" then outswingers would make a lot of sense. Given this is a Guardiola team, maybe that's what he wanted to know, especially since he is typically far more concerned about defensive shape when attacking than corner production.

Whether that is a very valuable question to bother answering is another issue entirely. Given elite corner execution can produce expected values per corner of .06 to .08, while average corner values are .025 and average possession values for most teams are in a similar or even lower range, I'm not so sure.
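To put rough numbers on that gap, here is a small worked sketch. The per-corner values are the ones quoted above; the corner volume of 200 per season is a hypothetical figure chosen only to make the arithmetic concrete.

```python
AVERAGE_CORNER_XG = 0.025        # average value per corner, from the text
ELITE_CORNER_XG = (0.06, 0.08)   # elite range per corner, from the text
CORNERS_PER_SEASON = 200         # hypothetical volume, for illustration only


def extra_goals(corners, elite_xg, baseline_xg=AVERAGE_CORNER_XG):
    """Extra expected goals from elite rather than average corner execution."""
    return corners * (elite_xg - baseline_xg)


low, high = (extra_goals(CORNERS_PER_SEASON, xg) for xg in ELITE_CORNER_XG)
print(f"Extra expected goals per season: {low:.1f} to {high:.1f}")
```

Under that assumed volume, elite execution is worth roughly 7 to 11 extra expected goals a season over average execution, which is the scale of value a possession-first corner routine would need to match.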

This is where the difference in counting and percentage stats comes into play versus stats that attach value (like the xGChain passing networks from StatsBomb Services). As football analytics matures, it moves more and more toward the value end of the spectrum, since that uncovers behaviour and strategies we really care about. Failing to incorporate these elements into team research can result in suggestions that actually make team performance worse.

I'm not sure this is what happened at City - as I said, we're guessing at literally everything while we wonder why they are doing what they do. It's just a concept to keep in mind when generating research projects and then applying them to team behaviour in the future.

English Coaching and Commentators

Circling back to the commentators we hear on Sky, BT, and BBC every week, it frustrates me that the people talking about the game now were generally players that grew up in and played a style that has been completely refuted by the modern game.

The traditional English style of play Does. Not. Win.

If it did, we'd see far more English managers present in the Premier League, and dotted around Europe's elite. What we actually see is a complete dearth of English managerial talent throughout the ranks of the football league. The Premier League gives zero fucks about this, but it is worrying to the FA and generally to the lower tiers of the football league as well.

I've asked questions about how coaches in England are supposed to learn more successful styles of play, and the only real answer seems to be to beg, borrow, and steal internships either at teams with successful foreign managers (extremely difficult to do, even with elite contacts), or learn a language and do your coaching education abroad. Good luck with that in a post-Brexit environment!

This circles back to FA coaching courses, which have been revamped (again) in the last year. I did the class days for England level 2 badges almost exactly a year ago, and while I generally liked the process they used to teach you how to think about coaching, I thought they were also lacking in certain areas. The section on pressing was largely ineffective and dismissive, where the instructors were telling us it was fad-ish and existed before. Technically this was true, BUT

  • That ignores the fact that the current iterations of pressing come in many varieties and are substantially different than what you saw from the 70's through the 90's
  • Pressing variations really matter for evaluating top level tactics and play, which means they really matter for top level coaching
  • The instructors, who were otherwise quite good, displayed no real understanding of this particular topic. Or really of shot locations and effectiveness. Which, if we're trying to train and develop better coaches and in turn better players, is probably a big deal.

Maybe this type of subject material doesn't matter at level 2, and I was expecting too much, or maybe English coaching education is still struggling dramatically to overcome decades of ineptitude to catch up with modern times. I honestly don't know.

Which finally leads me back to the current crop of commentators. Aside from Carragher and Neville, who clearly put a lot of research and work into their craft, the commentators currently discussing football on television generally don't understand modern tactics. How could they, when the tactics they were brought up playing were bad, and the coaching education failed to correct for that?

Nor do they have an analytical mindset, which would help to educate viewers on the reality of the game versus the perception. They commentate on games in 2017, but were almost exclusively trained in England, and brought up playing a style that almost doesn't exist any more at the top levels of play.

So what are they there for? The occasional interesting anecdote about mentality and what players feel like before a big game? To provide a constant stream of footballing cliches that provide no insight and are rarely relevant to the moment at hand?

We get nothing of interest from so many talking heads on television. No funny anecdotes about current players or managers. No tactical insight. No statistical insight. No points about technique and detail about what a player could or should have done better.

Half of the matches I and many other viewers watch each week have foreign commentators. I almost never feel worse off because of it. And THAT is a take away that should shake everyone involved in the production side of football, right up to the top levels of Sky and BT Sport.

Tough times for Paul Pogba?

World record transfer fees demand world record performances. Paul Pogba’s first season in the Manchester United first team took a turn for the worse last week after he was outplayed by N’Golo Kante against Chelsea in the FA Cup and then succumbed to injury in United’s perfunctory dismissal of FC Rostov. He will now get time to rest up on his bed of money, ponder new hairstyles, make videos and reflect upon his disastrous season and how he hasn’t lived up to the demands his fee placed upon him. I mean look at this:

pobga disappearing

Pogba’s outputs this season are around half of what he put up for Juventus last year, where he was a big contributor and behind only Paulo Dybala for goal contribution.

That’s what Man Utd thought they were signing, but instead they got half a Pogba.

Would they be in sixth... sorry, er... FIFTH place if he’d have stepped up? No.

Would they have meekly exited the cup if he had stepped up when his nine remaining teammates needed him against Chelsea? No.

Would Ferguson have retired if Pogba hadn’t betrayed the club that nurtured him and disappeared to Italy as soon as the money got waved under his nose? No.

Would Ferguson have built a new team around Pogba, with Paul Scholes playing alongside him, now able to play into his mid-40s because of the energy that Pogba brought to the midfield? Yes.

Would Ryan Giggs be waiting by the phone hoping ITV call over the international break? No.

Would Paul be the most popular baby name in Manchester by now? Yes.

Of course there are other narratives available and if you’ve made it this far vigorously nodding your head up and down until that last bit when it went a bit weird, then it’s possible that you’ve not read a StatsBomb article before.

A bunch of ill-conceived narrative supported by surface stats isn’t our style. So apologies, but here’s the real deal. Parts of Pogba’s game are actually thriving in Manchester. He is no longer part of a truly dominant team in a league, yet while his goal contribution has suffered, his expected goal contribution (from a-shooting and a-creating point of view) looks just fine. Even allowing for some model error, it would be hard to say that his performances have not deserved more:

pogba arriving

How about that? Our old friend variance has stepped in. If this chart looked like the first one then perhaps we would have a problem but, well: it doesn't. If we break it down, he's taking around three shots per game, which across his career is behind only his last season at Juventus, and while his expected goals per shot rate has been on the low side (0.071 per shot), he's never hit a high rate here and that's above last season (0.065).

Indeed, he's a player who may have a decent claim to have earned the right to deploy shots from range, as he's notched 16 times from distance across the last five seasons, against an expectation of around eight. He's never going to be an optimal shooter who focuses on close range--his position dictates that--and may well be good enough from further out to carry on. And he does get in the box, especially as a threat from set plays; he's not Andros Townsend.

Paul Pogba - Premier League - 2016_17

It is also clear enough that Mourinho is happy enough for Pogba and Zlatan Ibrahimovic to take the lion's share of the attacking work in this side. Despite a broadly more defensive role for United than for Juventus, Pogba to Ibrahimovic (19) is the most common key pass in the league this year, and Ibrahimovic to Pogba isn't far behind (14). This chart also reflects where he is most usually positioned.

pogba to ibra

Creatively, he's hitting numbers ahead of his time at Juventus. His key pass rate of a shade under two per game is a career high, and his passes into the final third have risen to around 22 per game, up from no more than 16 while in Italy.

At least part of this is a function of seeing more of the ball and at times being asked to play a more disciplined midfield role: Man Utd Pogba is getting through nearly sixty passes per game compared to Juventus Pogba's rate of nearer forty. Both Juventus 2015-16 and Man Utd 2016-17 are 500+ pass per game teams, yet Pogba is now more involved and it hasn't decreased his attacking involvement; the opposite is true. Oh, and he's logged more completed throughball shot assists this season than any other player in the big five European leagues too (9).

The only area in which his game has declined is in output, half of which is inevitably outside his control; he can't affect whether his team mates finish the chances he makes. He has three assists in the league--all to Ibrahimovic--yet the chances he's created can be valued closer to six goals.
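As a rough illustration of how much room variance has here, one can ask how often a player whose chances are worth about six goals would end up with three assists or fewer. The sketch below treats the assist count as Poisson-distributed around that expectation, which is a simplifying assumption of this example and not the expected goals model itself.

```python
from math import exp, factorial


def poisson_cdf(k, mean):
    """P(X <= k) for a Poisson-distributed count with the given mean."""
    return sum(exp(-mean) * mean**i / factorial(i) for i in range(k + 1))


expected_assists = 6.0  # "closer to six goals", from the text
actual_assists = 3      # league assists, from the text

p = poisson_cdf(actual_assists, expected_assists)
print(f"P(at most {actual_assists} assists) = {p:.2f}")
```

Under that assumption the shortfall comes out at about 15%: unlucky, but not remotely unusual.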

Assists can be a notoriously volatile measure and we've seen clear examples before, one being Christian Eriksen's 2014-15 season, in which he recorded two assists from 84 key passes in over 3000 minutes of play. Sure enough, his subsequent seasons have seen the outputs to his creativity return, and Pogba's Italian tenure shows him consistently logging assists from a decent volume of chances.

On top of that, the whole team has been struggling to hit a positive skew from its shooting. United take a ton of shots--17 per game--but are the lowest scorers of the big six with just 42 goals compared to Man City's next worst 54. Nobody in the team is running super hot with goals, not even Ibrahimovic, who is only slightly ahead of expectation. Anthony Martial and Henrikh Mkhitaryan are slightly ahead too but each has played limited minutes. Jesse Lingard's net-busting effort at the weekend finally saw him break his league duck this season and it feels like Mourinho's inability to settle on an attacking unit beyond Pogba and Ibrahimovic--except perhaps to reject Rooney--may have had an effect on the impact of his support men.

Other teams have enjoyed the hot form of their attackers; Chelsea have Diego Costa and Eden Hazard, Tottenham have Harry Kane and Dele Alli, Liverpool have Sadio Mane, Arsenal have Alexis Sanchez and Olivier Giroud. All have shown great form this season but have each landed a mile over their expected goal rates. Nobody at United has to that extent.

That's football.

Paul Pogba turned 24 last week.

United paid the premium to get an all round midfielder for his prime years and at a new club with a new manager, his first season has been solid. Expectations of huge output may go alongside his fee, but fail to understand his strengths and what type of player he actually is. He isn't a one man attack like typical world record transfers and he never will be.

The squad is still in transition and it's likely that the summer will see another rash of big money talent through the door. Ibrahimovic may have been a sticking plaster for their attack, but central midfield is locked down with Ander Herrera looking a good bet to continue alongside Pogba in the seasons to come.  Next time an outlet runs a negative Pogba piece, or some stats get listed and distorted, recall that it's probably not his actual performance that is driving the hit, but a wide variety of prejudices.

The truth is he's doing just fine.

__________   @jair1970

StatsBomb Podcast: March 2017 #2

Ted Knutson and James Yorke field questions from listeners and talk about the new metric xG Chain. The episode is downloadable on the SoundCloud link and also available on iTunes. Thanks!

Theo Walcott radar gif:

Ross Barkley, shot maps:

Ross Barkley - Premier League - 2015_16

Ross Barkley - Premier League - 2016_17

StatsBomb Goes to Sloan Sports Analytics Conference

For 2017 I was invited to attend the Sloan Sports Analytics Conference as a panelist on their soccer panel. Despite being a huge fan of the conference, I had never attended before as a combination of busy work period and frequent child births meant this was a really complicated time of year. For 2017 I had intended to go anyway, as two of my friends castigated me for not attending the year before, but it was great to be invited on a panel. What follows is a report of my trip. It will likely end up a bit fanboy, a bit analytical, and a bit reflective on how far the world has come since I originally got involved in sports stats back in 2005. What it should not come across as is a humblebrag, as I know exactly how fortunate I have been to get to where I am, and I understand exactly how tiny my own contribution is to the greater sports + stats universe. Aside from eSports, soccer is the sport that makes up the tiniest proportion of the conference agenda, and I was very lucky to be along for the ride. Thursday  dmorey_belichick_part2 This is me at 7 in the morning on my way to Heathrow. It's a follow-up from a joke I made that we should be able to dress like Bill Belichick as an homage to his greatness, since the conference is held in Boston. Morey is a) one of the pioneers in basketball analytics, b) one of the two co-founders of the conference along with Jessica Gelman, and c) a charismatic ball buster who happens to also be the long-time GM of the Houston Rockets, a team whose current style clearly illustrates the dramatic shift in how analytics has changed how teams play basketball. Anyway... the reason why I'm showing this picture is this. While working on an interesting tracking data in football/soccer problem that came from an email from my panel moderator Andrew Wiebe, I left my suitcase on the train. And I didn't realize until the doors were locked on my next train to the airport. 
Which meant the only clothes I had for the trip at this point were ones that would have left me dressed like Bill Belichick. Awkward. Fast forward a bit and I am half awake and about to board the train to my terminal at Heathrow and I see this person, very confused about which side of the train she should use to go toward immigration. emrata_oscars There were many celebrity sightings at Sloan but none of them were professional models. Boston Yay, Boston! Oh shit, I have no clothes. I may have to go to my panel dressed as mighty Bill. So the first thing I go do is buy a weekend's worth of clothes, and then I grab some food before heading to the speakers' networking event. I am not a great networker, especially when it comes to a room of people I have never met before. Guy who is good with numbers and data is slightly uncomfortable in a room full of strangers? Complete shock, I know... So I mostly stood around and people watched, which is also a fun activity. Shane Battier and Luis Scola were in attendance, which was amazing because they actually tower over everyone they might talk to. Puny humans. I briefly said hi to Daryl Morey, but since I was not dressed as Belichick, I assume he had no idea who I was. I did know Hendrik Almstadt though, since we'd talked before and he was on my panel. He has fascinating stories about his time both at Arsenal and Aston Villa. We mourned a bit of Arsenal's season and considered what would be interesting to discuss tomorrow on the stage. I went to grab another water, as I am not drinking at this point because I am desperately trying to stay awake, and when I came back he was talking to someone else. "Ted, this is my friend Sam. Sam, Ted." "Sam...?" I instinctively reach down to flip his name tag over since it was facing the other way. "... Oh, Hinkie! Uh, yeah, wow. I didn't know what you look like in person, somehow, but I've been following all your work for basically ever. Nice to meet you." 
I am nothing but awkward social interactions. I also talked a bit to Sam Ventura, who is a professor at CMU, a Stanley Cup winner with the Pittsburgh Penguins, and who also does some cool stuff with modelling NFL play-by-play and R. Sam teaches data visualization and is not exactly a fan of radar charts. He gave me some stick about that - mostly fair, though I did actually study data vis pretty deeply before developing these and I know they have weaknesses - and we talk a bit about how he remains so impossibly productive. Hockey Sam is good people. Since I don't know anyone else and am now terrified of alienating everyone I look up to in the sports world, I bounce off to the Soccermetrics meetup Howard Hamilton organizes every year. It's a great place to meet and/or catch up with dorks from the soccer/football stats world. This year's crowd is loaded with luminaries that you may or may not have heard of, but who I keep fairly close track of because they are generally really fucking smart. Guys like Chris Anderson, David Sally, Blake Wooster, Devin Pleuler, William Spearman, two guys from Second Spectrum, Ian Lynam, Mitch Lasky, Daniel Stenz, Padraig Smith, and Sloan rock star Luke Bornn (plus many others I am assuredly accidentally omitting). Luke might be a guy you haven't heard of, but every year he and his graduate students churn out amazing papers here at Sloan, including what remains to me one of the coolest pieces of sports research I have ever seen, and one of the main paradigms for how I view both basketball and football. As if being an actual genius weren't enough, he's also incredibly nice, funny, and good looking. In other words, he is utterly hateable. He is also Canadian, which makes that impossible. Some people... The Conference - Friday Expected Attendance: 3500 people. For a Sports. Analytics. Conference. In America, the nerds have won. Coming to Sloan as an American is likely a very different experience to coming here if you are European. 
As an American, there are sports and media celebrities absolutely everywhere, to the point that it's almost overwhelming. This was amplified for me because I was lucky enough have a pass to the Speaker's room. This is a quiet place away from the crowds in the halls and presentation rooms where speakers can have some quiet time to chill out, talk to friends away from public ears, or review their notes in a panic. It's also almost constantly filled with celebrities. At one point there was a table of guys shooting the shit about NBA that included Morey, Zach Lowe, Celtics AGM Mike Zarren, Tom Haberstroh, and former MIT Blackjack Team and current Twitter head of analytics Jeff Ma. (Ma is the basis for the main character in the book Bringing Down the House and the movie 21.) So much fire power. Sleep_study_alcohol_slilde_sloan_2017 I saw a few panels on Friday and managed to catch three really great talks, including Seth Partnow's Truth and Myths of the 3-Point Revolution in Basketball, an awesome Sleep Science talk, and Ian Lynam's highly entertaining talk about ridiculous incentives in English Premier League contracts. I missed the Silver Asks Silver panel because it was on at the same time as mine and Moneymind: Overcoming Cognitive Bias, which I was told was awesome. Thankfully both of these will have video posted at some point in the future. The Soccer Panel: Hendrik Almstadt, Daniel Stenz, Padraig Smith, Ted Knutson, Andrew Wiebe (Mod) sloan_soccer_panel_2017 Hendrik and Padraig either currently are or used to be Sporting Directors. Daniel has worked for Koln, Union Berlin, Vancouver, and the Hungarian National Team. Wiebe works hard in media for MLS, and has moderated this panel before. Feedback from the panel was really strong. One of the dangers of putting people currently employed on this panel is that they don't say anything. So they might know a lot of cool stuff, but are terrified to answer questions with any insight. That didn't happen here. 
I was really impressed with the level of detail in which both Smith and Stenz discussed recent or ongoing projects, including how Colorado uses data in recruitment. Hendrik and I were happy to talk about these things (Hendrik currently works for the PGA), so maybe that gave Padraig some freedom to express himself, but that often doesn't happen on other panels. Some key points:

  • The marriage of tracking data and event data will be a key to unlocking the future. We need both data sets and we need to be able to analyse them as part of the complete picture of what is happening in a game. I also feel very strongly that we need it not only for games, but for training data as well. The reason for this is that football games only produce a moderate sample of data for us to analyse, but training happens constantly. You'll have to be careful about what training data you include as having valid incentives, but it will dramatically increase the sample size and our ability to evaluate player skill sets as a result.
  • Right now data analysts work for clubs but often are wedded to the coaching staff. Their direct line of report is to the manager, which means if the data is starting to say uncomfortable things about performance, or the coach disagrees with it, it often gets muted or is considered wrong. That's definitely not how it should work. That's especially true because analysts tend to build a large store of institutional knowledge that is valuable to keep inside of a club, and not be chucked out every time the club changes a manager or a head coach.
  • A combination of football people asking questions and quants then finding the answers, then sending them back to football people for a sense check, is another key to unlocking useful things in the sport. One of the problems we see again and again when researchers without game expertise get involved is really brain dead studies like the one Garry Gelade discussed at the OptaPro Forum, or mistakes made in interesting studies that ruin their credibility. Tyler Dellow flagged one from this past weekend in a hockey research paper where it had inverted which side certain players play on - an extremely easy mistake to make in programming. Unfortunately this type of mistake means the study would be blown apart the moment you brought it to a coaching staff, regardless of how good it was.
  • There is still plenty of low-hanging fruit out there for football clubs to take advantage of, they just need to open their minds and talk to the right people. It's very straightforward to create the edge in recruitment, and the money it saves by avoiding mistakes is massive. If I can figure out the set piece edges, so can plenty of other people. I'm not that smart, and it's not that difficult, which is why I have historically avoided the topic. The same is true in so many other areas. Football is too big and too complicated a sport for one person to be an expert in all the different facets of the game. Because of this, clubs need to constantly seek out new information and perspectives.

I thought Wiebe did an excellent job moderating, and feedback from the audience and Twitter was that people were genuinely excited by what we said. I guess that's a success.

Friday evening I hung out with more soccer people including seeing some of the guys at US Soccer again. Federation challenges are really interesting, and there's only a moderate amount of overlap with what we do at the club level, so it's always intriguing to discuss what their future might look like.

Saturday

When I go to the U.S. these days, my body never adjusts. Too many years of children and 6AM wake ups mean I never sleep past 8:30 here in England, which means I rarely sleep past 4AM when I visit America. Thus I had breakfast early and was chilling in the speakers' room waiting for the gambling panel to start.

While most of you probably know me from my work on football analytics, my "real" expertise is gambling. I have spent the vast majority of my time since 2005 in and around the world of sports betting, including 8 years at Pinnacle, and two years inside of Matthew Benham's Smartodds operation. (We worked on football and for the football teams, but we sat in the middle of a giant gambling operation.) I started at Pinnacle in early 2007, just after they left the U.S. market. While I was there, we either created or completely rebuilt all of their non-U.S. sports departments, and I fought for a year to create the Live Sports department, which means I was overseeing development and application of the initial models for Live MLB, NBA, Soccer and Tennis.

I am likely one of the only people in the world to work for a long time inside one of the giant discount books as well as in and around one of the world's biggest betting syndicates. In gambling, unlike in football stats, I'm a strange sort of unicorn. The funny thing is, no one knows this. Pinnacle people rarely travel, and we never have a forum to talk about our work.
Thus when you sit down next to even seasoned gamblers like Ma and introduce yourself, you can expect a deluge of really interesting questions. Example 1: How much volume and profit did Pinnacle have the year before they left the U.S. market? Example 2: Why did the stupid owners get arrested? I also talked a bit to ESPN vet Chad Millman and Joe Brennan Jr, who is apparently a fan of the StatsBomb podcast.

My takeaways from the gambling panel

  • Sports betting is a game of skill. This is especially true when the vig is low. (My words, not theirs.)
  • The U.S. still has no idea when/if/how they will legalize sports betting. This remains baffling to me since the NFL spreads are talked about in every major news organization in the country, but you still can't bet on them outside of Vegas.
  • Because of the above, the probability that they legislate in a way that makes it possible for a discount (translation: low price like a Pinnacle, IBCBet, SBOBet) sportsbook to exist is fairly low. Margins in Vegas are 5% on average (-110/-110 in American odds). Margins at discount books are often 1.5 to 2.5% and they take all action, which means they tend to earn considerably less than that per dollar wagered. Because of this, if you tax them in the way of sin taxes and take a percentage of gross revenue, you make it impossible for them to exist in your jurisdiction without a dramatic price hike. Europe is a mess of disjointed gambling laws that typically screw customers in favour of tax receipts, which in turn drives them toward better priced options which are quasi-legal where the government then does not get tax receipts. Because Vegas exists already, there's a chance the U.S. may end up with something sensible that allows competition, innovation, and is customer friendly, but I wouldn't hold my breath.
  • Because the U.S. doesn't have legal sports betting, the sports data feeds are pretty poor. The reason behind this is that there is no value equation available to them to make it real-time. Gambling would fix that because the amount of money books can lose from past posts to people with better/faster TV feeds is stupidly large.
  • Internet sports betting is a tech industry. There is no other tech industry where the U.S. has simply chosen to be absent.
  • In-running betting typically increases fan engagement. In Asia, in-running volume can often be 3-5 times what the volume is on pre-game (they didn't say that but I know this from experience). Worried about falling viewership numbers on TV for various leagues in the U.S.? That could potentially be halted or reversed with legalized gambling.
  • The other thing that legalized gambling would do is drive massive amounts of advertising dollars straight to the networks and sports teams/leagues themselves. We saw it briefly with Draft Kings/Fan Duel, but that would be a drop in the bucket compared to what would happen with an open U.S. market for sportsbooks.
  • No one realises how massive eSports gambling is just now in the rest of the world, or could be in the U.S. if it goes legal. The audience for these events is already huge, and it crosses over to people traditional sports often don't, which is a new market segment for sportsbooks.
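
To put rough numbers on the margin comparison in the list above, here is a minimal sketch that converts American odds into implied probabilities and sums them for a two-way market. The function names are mine, and the -104/-104 line is a hypothetical discount price chosen only to land inside the 1.5 to 2.5% range mentioned; it is not a quote from any real book.

```python
def implied_probability(american_odds):
    """Convert American odds to the implied probability of that outcome."""
    if american_odds < 0:
        # Favourite pricing: risk |odds| to win 100.
        return -american_odds / (-american_odds + 100)
    # Underdog pricing: risk 100 to win odds.
    return 100 / (american_odds + 100)


def overround(odds_a, odds_b):
    """Amount by which the two implied probabilities exceed 100%."""
    return implied_probability(odds_a) + implied_probability(odds_b) - 1


# Standard Vegas pricing quoted in the text.
print(f"-110/-110 overround: {overround(-110, -110):.2%}")  # prints 4.76%

# Hypothetical discount-book pricing (assumption, for illustration only).
print(f"-104/-104 overround: {overround(-104, -104):.2%}")
```

The -110/-110 line comes out just under the 5% average quoted above. A tax on gross revenue bites much harder on the second price than the first, because there is far less margin to pay it out of.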

Maybe if I'm lucky, I'll get to go back to Sloan next year and be on the gambling panel and tell stories about painful, sometimes expensive lessons in modernizing one of the world's biggest sportsbooks for the digital age.

So yeah... I started the day shooting the shit with Jeff Ma, which was very cool. I also had the briefest chat with Boston and U.S. sports media legend Jackie MacMullan. Jackie's one of the few female voices we saw on sports TV in the early days, and was on ESPN and at the Boston Globe even when I was a kid. Her work has always been superlative. Now that I have some perspective, I appreciate her even more than I did when I was younger, since I suspect she had to go through a lot of miserable bullshit along the way simply because of her gender. It's hard to convey what these people mean to you in a five-minute conversation, but I feel lucky to have talked to her at all.

The same is true for Brian Kenny, one of the journalists who pushed baseball stats constantly throughout his career. He's a voice I listened to for so many hours watching Baseball Tonight on those long, hot summer nights, and I got a high five from him in the green room. Shortly after that, Mark Cuban walked through the halls literally hushing conversations as everyone whispered, "OMG is that Mark Cuban!?" I was also able to converse with contemporaries like Bill Barnwell, Bill Connelly, and Tyler Dellow (mc79hockey), who I was told is the only panelist on the weekend who definitely had a higher usage rate than I did. Sloan is a smorgasbord for meeting cool, smart people from the students through the titans.

When I was in high school, I was not one of the cool kids. I was a good athlete, but also sang, acted, competed on academic teams, liked comic books, science fiction, and had an odd, goofball sense of humour. It was fine, but there were very few people who just "got" me.
I went to a high school out in the country, and at some point I wondered if this was what the rest of life was going to be like. Then I visited the Naval Academy for a week as part of a science camp/recruitment visit. There I met so many other bright, funny, allegedly weird people that I stopped worrying that I'd never fit in. I now had proof that I'd likely find a comfortable social group when I went to college, and I did. As odd as it sounds, Sloan was a lot like that for me. It's filled with like-minded people who love the application of stats, data, and sports. It's also filled with incredibly smart, eloquent people talking about their research in these areas. Many of these people tell war stories about the difficulties they faced bringing their research to their game. They (generally) overcame. In American sports, baseball is largely viewed as being mature from a stats and data perspective. The big topics have largely been solved, and the edges are mostly in the margins and application now. Basketball is more complicated from an analysis perspective, but they are tackling fascinating tracking data problems and hoovering up some of the brightest minds from Harvard and MIT every year. American Football and Hockey are slower on the uptake and more challenging from a cultural perspective, but they are moving now as well. Football/soccer is very much in the difficulty phase. It's hard to get in the door. It's hard to get past the skepticism to even have a conversation. Football lifers look for any possible reason to pick things apart because they don't want to have to deal with this different perspective on the game, because they don't understand it and they don't want to understand it. Not everyone is like this, obviously, and the times, they are a' changin'. But change has been surprisingly slow in the last five years when the value of the work is almost slap-in-the-face obvious. That's why this trip to Sloan for me was absolutely amazing. 
Talking, arguing, sharing stories with bright people in other sports once again showed me that things will be okay. Eventually. Talking to Seth Partnow and @causalkathy about data versus theory paradigms was great. Finally getting to meet Dellow and hear stories from inside hockey and bleeding on stage during his panel was also great. Hearing from Devin Pleuler about some new initiatives Toronto FC have in the off-season, and then having Devin walk into the bar with his Sloan Hack-a-thon trophy for NBA tracking data was genuinely special.

I had one of the best weekends of my life in Boston, and I fell in love with sports analytics all over again. Thanks Daryl Morey, Jessica Gelman, and all the students at Sloan for creating and producing a fabulous conference that I hope to attend over and over again.

--Ted Knutson @mixedknuts

Post Script

I was walking to dinner with my friend Worth Wollpert and we saw Mike Zarren from the Celtics getting ready to leave for the day. Zarren is one of the most generous guys in the analytics sphere when it comes to lending his time and expertise to people - even weirdos from completely different sports - and no one I have ever met has had a bad word to say about him. In fact, a lot of people you never expect are connected to Zarren because he helped them. In short, Mike Zarren changes lives. So, we stopped and I went out of my way to mention that to him, even if it made him feel just a bit awkward at the time. Sometimes you have to make a point to tell people they are a good egg, and to keep up the good work. Sorry Mike!

PPS Special thanks to Paul Carr for helping me get on the panel in the first place and also for guiding me through all the stuff he does at ESPN. It was great to meet you and all your hair.

Arsenal's Defence and Monaco's Crazy Attack

There’s a difference between using statistics to inform and actually performing analytical tasks. When this column was weekly it was predominantly the former, and even now that it is more occasional, it feels like few others have really taken the baton and tried to draw out trends, traits or differences that become clear when you stare for long enough at a wall of data. Gamblers will have, but they aren’t prone to writing blogs about the edges they have discovered. Regardless there are still stories that present themselves on a week by week basis, and provide greater insight as to how or why teams and players are doing whatever they are doing, and why it may or may not continue. So here’s a couple that have caught my eye just recently.

  1. Arsenal’s defence AKA simple truths

The average team in the Premier League has allowed 4.3 shots on target per game this season.

Arsenal in the first half of the season allowed 3.6 shots on target per game and conceded a goal a game, meaning they were tied seventh for shots on target allowed and tied third for goals conceded.

Since that point, they’ve played seven and gone 3-1-3 while allowing 1.7 goals per game and 5.6 shots on target per game. These rates, over this relatively short period rank 15th and 18th respectively. Petr Cech has come in for some criticism, which is perhaps harsh given that he’s 34 and his workload has just gone up by over 50%. The volume is the problem, even in this short period, especially when held against rivals.

We can spin it back across the season. If 4.3 is the average rate of shots on target allowed, a top side is doing badly to exceed this with any regularity. Arsenal have allowed five or more shots on target on 12 occasions this season. In comparison to the rest of the top six, this is standout bad: the rest have allowed that on between four (Man City) and seven (Tottenham) occasions.

Also the trend is poor. It’s occurred seven times since the start of December in just 13 games. Arsenal are simply not protecting their goalkeeper to the extent they need to if they are to match their ambition. Again rival teams kill them in comparison. Man City last allowed a team to land five shots on target against Leicester on the 10th of December; Man Utd last did on the 4th of December against Everton. Chelsea have done so twice in three months, both times against top four rivals; Liverpool three times, er… not against top four rivals; and Tottenham four times, three times against rivals and once against Hull. Arsenal have suffered against all and sundry. This seven game stretch has included visits to Chelsea and Liverpool (both games qualify) but they've also failed to adequately protect Cech in the Watford defeat, the draw at Bournemouth and the insane late win against Burnley. These teams shouldn’t be laying a glove on Arsenal’s defence, yet they are, and they are turning that porousness into results.

There's also the small factor that as the team has become unable to restrict the opposition, the volume of total shots in their games has increased. At the halfway point, Arsenal games featured around 25 shots per game and they had 60% of them. Since then, they are averaging nearly 28 per game and they have only 54% of them. This implies again, to some degree, a lack of control and the dial moving in the wrong direction.
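
It helps to turn those shares back into raw shot counts. The sketch below is a quick back-of-envelope calculation using only the rounded figures quoted in this piece ("around 25" and "nearly 28" total shots per game, and the shots on target numbers from earlier), so treat the outputs as approximate. The helper names are mine.

```python
def split_shots(total_per_game, share_for):
    """Split a per-game shot total into (shots for, shots against)."""
    shots_for = total_per_game * share_for
    return shots_for, total_per_game - shots_for


def pct_change(before, after):
    """Relative change from before to after, as a fraction."""
    return (after - before) / before


# Rounded figures quoted in the text, so results are approximate.
first_half = split_shots(25, 0.60)
recent = split_shots(28, 0.54)

print(f"First half: {first_half[0]:.1f} for, {first_half[1]:.1f} against")
print(f"Since then: {recent[0]:.1f} for, {recent[1]:.1f} against")
print(f"Change in shots against: {pct_change(first_half[1], recent[1]):+.0%}")

# Shots on target allowed per game: 3.6 in the first half, 5.6 since.
print(f"Change in shots on target against: {pct_change(3.6, 5.6):+.0%}")
```

On these numbers Arsenal's own shot volume has barely moved at around 15 per game, while shots against have gone from roughly 10 to roughly 13. Almost all of the extra volume in their games is coming at their own end.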

Yet another factor is their overall shot profile for distance. Last season Arsenal took the closest shots in the league and allowed shots from the furthest away--it looked as though talk of expected goals influencing their on-field play was legitimate. This season, they still take the closest shots of any team in the league, but their opposition are taking their shots from an average of 1.5 metres closer. That's huge. They now rank 17th here and only Bournemouth, West Ham and Swansea are worse by this measure. So: the opposition are getting closer, which allows them to get shots on target more frequently, and Arsenal are conceding more goals. Their current goals against rate of 1.2 per game has been exceeded only once by Arsenal in the last 16 seasons, in 2011-12. Who is to blame? Francis Coquelin? He usually gets it. What about new players? Has Granit Xhaka done well enough? Is Shkodran Mustafi a reason? Has the manager lost sight of his tactics? Is there a wider malaise?

In another season, the traditional mid-season blip--and this is at least the third season in a row Arsenal's performances have dropped off at some point between the halfway point and the end of the season--might not be so vital. This time round though, the quality of their rivals and their ability to rack up the points means that this top four race is the most keenly contested in years. Arsenal need to pick up quickly, or an ignominious fifth or sixth place finish is going to loom large. The knock on effect from that would likely be a summer of turmoil.

  2. Monaco

The big numbers story across Europe this season is the insane jet powered Monaco attack. They’ve now scored 72 league goals from 378 shots, a rate of 19%, and added on a perfect ten from ten penalties. Here’s a list of teams that have scored above 15% in a big five European League since 2010-11:

[Figure: big five league teams with a conversion rate above 15% since 2010-11]

A few quick takes here: this is from a list of 686 club seasons. Just TEN of them landed at over 15% conversion. Five of these seasons are Barcelona seasons (their other two ranked 11th and 32nd), so Lionel Messi era. Then you have two Real Madrid years, so Cristiano Ronaldo era. Then Zlatan Ibrahimovic's 38 goal PSG season and Bayern Munich's all conquering 2012-13 season. The idea here is pretty strong, right? Mega clubs firing on all cylinders powered by world stars.
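
For reference, the headline rates here are simple ratios. A minimal sketch, using only the counts quoted in this piece (the function name is mine):

```python
def conversion_rate(goals, shots):
    """Goals scored per shot taken."""
    return goals / shots


# Monaco's league goals and shots as quoted above.
monaco = conversion_rate(72, 378)
print(f"Monaco conversion: {monaco:.1%}")  # prints 19.0%

# Share of the 686 club seasons that finished above 15% conversion.
share_above_15 = 10 / 686
print(f"Seasons above 15%: {share_above_15:.1%}")  # prints 1.5%
```

So a finish above 15% is roughly a one-in-seventy season in this sample, and Monaco are currently sitting well above that line.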

Monaco are different: a well drilled, fast team who are structurally very strong and more than competent within their league. But: they don't take that many shots (~13 per game). If we divert into an expected goals model and drill down to a per shot basis to allow comparison to a part season, they are ahead of every other team in this 686 team sample. They are one of only two teams to be more than 0.05 goals per shot away from expectation in either direction, Tito Vilanova's 2012-13 Barcelona team being the other. And they aren't just slightly ahead of that mark: their xG overperformance is at 0.08 per shot. Even allowing for some generous error bars here we're into uncharted territory. Monaco are pulling up alongside a newly discovered sun and its seven planets. Adding to the confusion, their expected goals per shot rate is lower this season (0.10) than last (0.11), when they broadly matched their rate. So we have a team that takes a non-exceptional volume of shots, with a non-exceptional expectation of scoring them.

Let's go back outside the model. If we're going to be generous there are good aspects here. We can see here that they have a spike of shots around 10 metres out from goal, and a preference towards shots from closer in, compared to the opposite extreme, Angers:

[Figure: shot distance density in Ligue 1, Monaco compared with Angers]

But there is nothing here to imply a multi season outlier. Even if we presume that models are missing aspects of Monaco's play: they play with great speed, their wingers are excellent crossers of the ball, Kylian Mbappe might be a world class talent, Radamel Falcao may well be a world class poacher, it doesn't explain all of their overperformance. It can't. Even if we presume that all these factors accounted for just half of it, they would still be one of the top ten over performing teams out of 686 regardless of whether you model it or not.

Would any other team on that original list above ship five goals at Man City? They would all probably score three! That's a bit tricksy, but it does focus the idea around what Monaco have achieved: it seems hard to believe that it is anything more than the mother of all positive goal skews.
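
To get a feel for the size of that skew, here is a deliberately crude sketch. It assumes every one of Monaco's 378 shots is an independent coin flip with the same 0.10 chance of going in (the rounded expected goals per shot figure quoted above) and asks how often such a team would reach 72 goals. That is my simplification for illustration, not the expected goals model used in this piece: real shots vary in quality and are not independent, and the calculation ignores the fact that we are picking the most extreme of 686 seasons.

```python
from math import comb


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Counts quoted in the text; 0.10 is the rounded xG per shot (assumption:
# every shot is treated as having this same scoring probability).
shots, goals, xg_per_shot = 378, 72, 0.10

tail = binomial_tail(shots, goals, xg_per_shot)
print(f"Expected goals under this toy model: {shots * xg_per_shot:.1f}")
print(f"P(at least {goals} goals from {shots} shots): {tail:.2e}")
```

Under those assumptions the probability comes out at less than one in a hundred thousand. The exact figure should not be taken seriously, but it gives a sense of scale: either the model is missing something very large about Monaco, or this is an extreme run of finishing, or some of both.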

Now ~700 club seasons is a reasonable but not gigantic sample to compare. The more seasons we witness, the more likely we will see weird outliers at both ends. Twenty years from now when we have nearly 3000 club seasons to look at, we'll likely spot a few more like this Monaco team. This does not mean that simple shot analysis is non-informative, nor does it mean an expected goals model is faulty (beyond its obvious limitations). It just means that sometimes shit just happens, and a football season is short enough that a full reversion may well not take place.

Leicester City's 2015-16 season is the worst and best example for many things. Among others--and whether this is good or bad will depend on your perspective--it gave the general public a belief that anything can happen in sport, way above and beyond any realistic probability, but it also absolutely cemented the idea that extreme positive skews can occur for long enough in the space of a season to see a team home. It again highlighted why these types of stories are fascinating, because the reasons why these clubs bounce forward are locked up in the intangibles. We see the outputs, and we know that there are elements of luck involved, but no coach can plan for an extreme skew.

And this is where the Monaco style season is a slight hoodwink, because it distracts us from other more reliable stories.

A manager can make plans and hope to improve his team's shots structure. He has some degree of control there. That growth is more sustainable over time if achieved. What of Atalanta? Under Gian Piero Gasperini, the club has gone from a team that never recorded a positive shots total across five seasons and finished between 11th and 17th each time to battling for a European slot, buoyed by solid metrics. They've taken 58% of the shots in their games and can even consider themselves slightly unfortunate having undershot the defensive end of their expected goals volumes by a small margin. The return might not be so shiny, but it looks as though their improvement is potentially sustainable rather than just a skew. Monaco are definitely one of the best three sides in France but their goal total makes them look like world beaters, Atalanta look like a top five team in Italy because they are probably a top five team in Italy.

The message here is to try and take a wider view. It's the same message this site has been putting forward since its inception in 2013. We're looking to round out the picture and reveal aspects that might otherwise be concealed. Twenty years ago the analysis of Monaco would have started and finished with "they are a brilliant attacking side". Now we are armed with a ton more information and techniques that can challenge whether or not that statement holds or whether they will sustain it. The same discussion revolved around Marcus Rashford last season when he blew onto the scene with a glut of goals. Like Monaco he had all the outputs but mystifyingly few of the inputs: the shot volume just wasn't there, and seven goals from his first 26 career shots (27% conversion) was deceptive. Sure enough, a season on, he has recorded a further three goals from 33 shots (9%), a far more realistic output for a young winger/forward type and one that matches his shot profile. Outlying players and teams are fun because they almost invite the challenge of analysis, and cases like these occur across any league. Finding reliable signals is the challenge of analytics, but when a team like Monaco is so far off the charts and the reliable metrics aren't singing the same song, you can be pretty sure that they will come back down to earth eventually.

"But Leicester" does not invalidate that.


Find me on twitter @jair1970