Seattle Sounders vs FC Dallas: A case study on using data to prepare to administer the eye test
Over the weekend the Seattle Sounders beat FC Dallas 4-3. It was a rollicking affair that went into extra time, with Jordan Morris completing his hat trick and scoring the winner in the 113th minute. Afterwards, Steve Zakuani said this about the match.
That wasn’t particularly in line with what the statistics of the match suggested. StatsBomb, for example, had the expected goals at 2.98 for Seattle to 1.59 for Dallas. Zakuani didn’t agree with the stats.
Given his clarification I don’t think there’s anything particularly wrong with what he’s saying about stats here. Football is a game of tiny sample sizes, and all good stats work is a matter of interpretation and context. Doing analytics right is a matter of improving the guesses in the guesswork. Beware of anybody selling absolute certainty about anything.
But it did get me thinking about what using stats in concert with your eyes really means on a granular level. What does it mean to reconcile an assessment of this match with stats that tell a dramatically different story? It can’t just be a matter of saying, “stats are good unless I disagree with what they suggest, then I’ll ignore them.” Usually when I write about a match or a team, I make sure to have watched them. It’s exceedingly rare for me to form a strong opinion based on numbers alone. Frequently, however, I’ll be watching things after I’ve already seen the stats for the match. That is, I’m filtering my eye test through what the stats have primed me for, letting the numbers set my expectations. Today, rather than watching the match and writing about it, I’m going to write about what that process looks like. Given the stats I have at my disposal, and Zakuani’s interpretation, one which was echoed by Matt Doyle, who knows more than just about anybody about the goings-on in MLS, let’s walk through the process of marrying that assessment to the numbers.
Okay, enough throat clearing. By xG, this match was extremely not close.
It was a dead even affair until the 74th minute at which point Seattle really pulled away. The mechanics of what happened seem pretty simple. Jordan Morris scored a massive chance from a corner.
That was followed quite quickly by two pretty good chances from open play. First Victor Rodriguez got this shot off.
Then Raúl Ruidíaz had this shot blocked from point blank range.
From that point on, Dallas had 11 shots, but not a single one, not even Bryan Acosta’s equalizer, was worth even 0.10 xG. Even though the game was tied until late, that span from the 73rd to the 78th minute is what sealed Seattle’s claim to having “played better” from an xG perspective.
So, when watching the match I’d want to keep an eye on those chances and see if they were, in fact, as strong as xG believes they were. I’d also want to look at Dallas’s flurry of attempts and see if maybe xG was missing something, and if perhaps Dallas’s chances were more threatening than the calculations appeared to show. In a single game, and especially when looking at a single chance, the error bars around football analytics’ most fundamental building block can be big. Still, that overall xG gap is huge, and I’d expect to walk away from watching the match thinking that even if xG wasn’t perfect, it more or less captured a real gap in the value of chances created by the two sides.
However, that’s only the beginning of the story. Next, I’d want to look at the teams’ post-shot xG. That is, rather than the measure of the chances created, I want to look at a measure of the quality of the shots that the teams actually took. This is a less predictive measure of a team’s future performance, but it will more closely resemble the actions that took place on the pitch. There are good reasons to prefer xG to post-shot xG when analyzing a team, but given that what we’re trying to do here is both big-picture analysis and an attempt to garner a small-picture understanding of what went on on the pitch (regardless of its predictive power for the future), it is helpful to look at both.
And here something wild jumps out. Seattle’s post-shot xG of 2.92 is fairly in line with their standard xG, but Dallas has a crazy different total. Their post-shot xG is a massive 3.14, way above what their standard xG looks like. On average, Dallas might have created a lot less than Seattle did, but they did the absolute most with it.
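The gap described here is simple arithmetic on the four totals quoted above. A minimal Python sketch of it follows; the dictionary layout and the name `finishing_delta` are my own illustrative choices, not anything StatsBomb publishes.

```python
# Match totals quoted in this article (StatsBomb figures).
totals = {
    "Seattle": {"xg": 2.98, "post_shot_xg": 2.92},
    "Dallas": {"xg": 1.59, "post_shot_xg": 3.14},
}


def finishing_delta(team):
    """Post-shot xG minus xG for one team.

    A positive value means the shots were struck better than the
    chances themselves implied; a negative value means worse.
    (Hypothetical helper, named here for illustration only.)
    """
    return round(totals[team]["post_shot_xg"] - totals[team]["xg"], 2)


for team in totals:
    print(f"{team}: {finishing_delta(team):+.2f}")
```

Seattle comes out at roughly -0.06 and Dallas at roughly +1.55, which is the whole story of this paragraph in two numbers.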
Matt Hedges’s 63rd-minute header pops out as a gorgeously headed ball from the kind of chance that is rarely scored.
Paxton Pomykal's effort just before the final whistle is similarly notable.
So, how do we reconcile those two numbers? Given the chances created we’d expect Seattle to win this match a lot, somewhere around 72% of the time. However, given what both teams did with those chances it’s reasonable to look at this match and think this easily could have fallen in the other 28%, and that Dallas, given how well they struck the actual chances they created, were actually unlucky not to get a little bit more from this match. The analysis when watching the match, then, is examining that gap. Sometimes a team just hits the ball really well for 90 (or 120) minutes and there’s not a lot to analyze, but other times there will be reasons that those specific chances appear to be better than xG understands. That’s a major question to answer when watching the match.
And this is only looking at shots. You can repeat the same kind of analysis for passing and possessions. We know, for example, that Dallas had more of the ball, completing 568 passes to Seattle’s 457 and completing them at a higher rate, 81% to 77%. So, even before watching the match, the question becomes, if it’s true that Seattle created the better opportunities, why is it true? Were they doing something specific on the defensive side of the ball to foil Dallas’s ability to turn possession into good shots? Or was the problem one of Dallas’s own making? Or neither? Maybe Dallas simply got unlucky, with a number of incisive, move-finishing passes just narrowly missing. Again, this is a specific area of the game to watch closely, determined by looking at numbers beforehand.
After looking through the numbers I’d approach watching the game and determining who played better the following way. Going in I’d expect that Seattle would be pretty pleased with their performance. The fact that they walked out of the match with a pretty strong xG edge suggests that they did the kinds of things in this match that are repeatable match to match and week to week. The one major caveat I’d have is that I’d want to watch Seattle’s defensive side of the ball closely to see if the divergence between Dallas’s xG and post-shot xG was due to mistakes Seattle’s defenders were making.
Dallas, on the other hand, is a trickier question. The first thing I’d be looking for is their finishing. I’m looking for reasons that their post-shot xG diverged so heavily from their normal xG. There may not be any; it may be they just got hot and it almost bailed them out in a match where they were decidedly the worse team. On the other hand, it’s certainly possible that a handful of the shots they got off were, in fact, deceptively good attempts. The fact that, on average, xG evens out doesn’t mean that you can’t gain fleeting edges in a match (or have other teams gain fleeting edges against you) thanks to either smart tactical planning or winning individual matchups, or any one of the hundreds of things that go on in a soccer match. Of course, that stuff evens out over the long run, but we’re not analyzing the long run.
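For readers wondering how an xG total turns into a win probability at all, here is one common back-of-the-envelope approach, sketched in Python. It is an assumption on my part, not the method behind the 72% figure: it treats each team’s goals as an independent Poisson variable with the match xG total as its mean. A shot-by-shot simulation would use the individual shot values instead, and this sketch also reports draws separately, so it should not be expected to reproduce 72% exactly. The function names are illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals when goals follow a Poisson distribution."""
    return exp(-mean) * mean**k / factorial(k)


def outcome_probabilities(mean_a, mean_b, max_goals=15):
    """Return (P(A wins), P(draw), P(B wins)) for independent Poisson scorelines.

    Scorelines above max_goals per team are ignored; for means in the
    range seen in football the probability left out is negligible.
    """
    win = draw = loss = 0.0
    for a in range(max_goals + 1):
        for b in range(max_goals + 1):
            p = poisson_pmf(a, mean_a) * poisson_pmf(b, mean_b)
            if a > b:
                win += p
            elif a == b:
                draw += p
            else:
                loss += p
    return win, draw, loss


# xG totals quoted earlier in the article: Seattle 2.98, Dallas 1.59.
seattle_win, draw, dallas_win = outcome_probabilities(2.98, 1.59)
print(f"Seattle {seattle_win:.0%}, draw {draw:.0%}, Dallas {dallas_win:.0%}")
```

Swapping in the post-shot totals (2.92 and 3.14) as the two means gives a rough sense of how different the match looks through that lens, under the same simplifying assumptions.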
Secondly, I’d want to watch Dallas’s possession in the final third. Determining whether they played well will rest heavily on figuring out why they managed to create so few shots that were actually highly valued by xG. Teasing out the answer to that question, and whether to blame Dallas, credit Seattle, or simply shrug and raise your hands to the soccer gods, will go a long way to determining exactly how highly to value Dallas’s performance.
I admit that I’m skeptical that when I watch this match I’ll walk away thinking Dallas played better. I think that by far the most likely conclusion to draw is that Dallas’s somewhat sterile possession and hot shooting combined with Seattle missing a handful of good but not great chances to give the impression of a match that was much closer than it was. That’s not a definite conclusion, though. It’s a hypothesis formed by data. Forming a strong hypothesis is important, but you also have to then go and test it. And that’s what watching the games is for.