Arsenal Aren't This Good. Nor Are Dortmund. Here's Why
Once in a while, we here at StatsBomb have to be the unfortunate bearers of bad news. We prefer to deliver good news like, "check out this great young player" or um... James Milner. Or "wow, this team/player has been crazy dominant and waiting to explode." But sometimes we have to step in and combat narratives espoused by really smart people, especially during the early season.
I've got my battle gear on right now, so here we go.
Arsenal Are Not This Good
Look, I'm an Arsenal fan. Have been for ages. And contrary to Twitter opinion, I actually enjoy seeing them win. But... it's our duty in statsland to be honest about where we think they stand now and in the future.
"The Gunners have won a million games in a row across Europe and the Premier League, and the Unai Emery project is going swimmingly!"
Not so fast, my friend.
The above is a chart of expected goal (xG) difference vs actual goal difference from 2012-13 until now. We're using Opta data here to give a big fat chunk of history across all the Top 5 leagues for analysis purposes. The teams in blue are current season teams and the teams in red are historic ones.
Now note the blue teams way above the line, like Borussia Dortmund, Arsenal, Sampdoria, Hertha, Alaves, and to a lesser extent PSG. Those teams are outperforming expected goals by a margin that no team in the data has sustained over a full season. Ergo, they will almost certainly revert to the mean.
Arsenal's expected goal difference thus far in the 18-19 Premier League season is right around zero. Their actual goal difference through eight matches is +9. That's a gap of over a goal per game, which is massive. Arsenal won't continue to win this many matches unless their underlying performances somehow dramatically change for the better.
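For anyone who wants to check the arithmetic, here is a minimal sketch of that per-game gap. The function name is made up for illustration, and the xG difference is entered as exactly 0.0, which is an assumption standing in for "right around zero."

```python
def overperformance_per_game(goal_diff: float, xg_diff: float, matches: int) -> float:
    """Goals per match by which actual goal difference exceeds xG difference."""
    if matches <= 0:
        raise ValueError("matches must be positive")
    return (goal_diff - xg_diff) / matches


# Figures from the text: +9 goal difference through eight matches.
# xg_diff=0.0 is an assumption standing in for "right around zero".
gap = overperformance_per_game(goal_diff=9, xg_diff=0.0, matches=8)
print(f"Overperformance: {gap:.3f} goals per game")
```

Nudging the assumed xG difference a little either side of zero still leaves the gap at roughly a goal per game.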
Now xG isn't perfect. In fact, the limitations of traditional xG are one reason why we built StatsBomb Data. Recent teams that have had major overperformances in xGD include Monaco 16-17, Juventus 17-18, and almost all iterations of Sean Dyche and Favre teams. However, at this point across many thousands of team seasons in the data set, we have some strong ideas of what levels of overperformance are and are NOT possible.
Scoring on 25% of your shots like Arsenal are doing currently just does not happen across a full season of matches.
The table lies. Especially in the early season. Football has a lot of randomness and a lot of variance, and eight games isn't remotely enough sample for all the "luck" to shake out.
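To get a feel for how much noise eight games can hold, here is a rough simulation sketch. It is a toy model, not how any real xG model works: goals for and against are drawn as independent Poisson counts, and the 1.4 goals per game on each side is a made-up rate chosen so that the "true" goal difference is zero. All names are hypothetical.

```python
import math
import random


def poisson_draw(rng: random.Random, lam: float) -> int:
    """Draw one Poisson-distributed count using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def share_at_least(threshold: int, matches: int, lam_for: float,
                   lam_against: float, runs: int, seed: int = 0) -> float:
    """Share of simulated runs whose goal difference reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        goal_diff = 0
        for _ in range(matches):
            goal_diff += poisson_draw(rng, lam_for) - poisson_draw(rng, lam_against)
        if goal_diff >= threshold:
            hits += 1
    return hits / runs


# Assumed rates: 1.4 scored and 1.4 conceded per game (a true goal difference of zero).
share = share_at_least(threshold=9, matches=8, lam_for=1.4,
                       lam_against=1.4, runs=20000)
print(f"Share of eight-game runs at +9 or better: {share:.2%}")
```

Raising `matches` to 38 while scaling the threshold to the same per-game gap shows how much harder it is for luck alone to hold up over a full season.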
Skeptical people can certainly ask why they should care, and it comes back to the fact that xG is a much better predictor of future performance than almost any other early season indicator like points, shot difference, goal difference, etc. One thing that is very obvious is that Arsenal's performance right now is unsustainable.
I don't want to beat a dead horse about this, and Goodman covered the bigger issues well in his piece last week, but if you are an Arsenal fan, I strongly suggest you savour the thrashing of Fulham this past weekend. Partly because there were some amazing goals and winning always feels good, but largely because there's almost no chance they will sustain anywhere near this level of results for the rest of the season.
And if they do, boy, am I sure to hear about it from fellow Gooners.
But What About Borussia Dortmund and Lucien Favre?
Here's where things get weird. Lucien Favre has a long history of high performance in the table with his teams spoofing how good xG models expect them to be. He's done this consistently enough at Gladbach, then Nice, and now Dortmund that I believe his style of play basically exists in all the holes of naive xG models.
However... while Dortmund's underlying numbers are comparatively better than Arsenal's, even a Favre team will regress from these current numbers in the future (especially on the attacking side). One point hugely in favour of Dortmund's title challenge though is how Bayern have plummeted on early season numbers. When Ancelotti was let go, Bayern's numbers looked massively dominant. Now? Merely very good. It's possible the German giants are, dare we say it, almost human? They may finally be vulnerable to a challenge from both Leipzig and Dortmund this season unlike at almost any point since the last Klopp title.
Would I bet on it? Eh... But at least it's possible, and to be honest, any time Bayern aren't 10 points ahead of the pack during a season feels like at least a minor victory for the Bundesliga and its fans.
I'm totally not trying to be a buzzkill to fans in either red or yellow this international week, but the noise on Twitter was such this weekend that I felt I needed to step in with a small dose of realism. xG is far from perfect, but when it comes to evaluating sustainable performance in football, it's still better than pretty much everything else humans have concocted thus far.