StatsBomb Conference 2021: Speakers, Panels, Research Paper Winners, and more…

We’re into September, which means football is once again in full swing. But that also means the StatsBomb Conference is only about five weeks away, and it is going to be fantastic. Today I am going to fill in the details around the event, and also explain just a few reasons why many people think this is the best football analytics conference of the year.

The event details 

  • Date – Friday 8th October 2021
  • Location – Stamford Bridge, London
  • Time – 9.00 a.m. to 6.00 p.m.

This event will be in person at Stamford Bridge on the 8th of October. As we did in 2019, much of the conference will either be broadcast publicly, or the talks will be posted on YouTube in the weeks after the event. However, not all talks will be available either live or online, as some of our speakers only have permission to give their talks to the people present in the room at the time.

Basically, if you want to make sure you see all the talks, you’ll need to show up. And we know from last time that the food should be great, so you’ve got that going for you too.

What to expect from the StatsBomb Conference 2021? 

  • Your host – Ted Knutson
  • Key talks from StatsBomb experts – Nicole Kozlova, Ukrainian National Team player and StatsBomb Data Science Intern + one talk from our Data Science Team
  • Industry speakers (see list below) 
  • Networking opportunities – around 600 attendees from across the football ecosystem
  • Research paper competition

A stellar line-up of industry speakers 

  • Dr Ian Graham – Head of Research (Liverpool FC)
  • Daryl Morey – President of Basketball Ops – (Philadelphia 76ers)
  • Harry Moyal – Deputy General Manager (Olympique Lyonnais) 
  • Mladen Sormaz – Head of Football Analytics (Leicester City)

I continue to be amazed at the quality of people who agree to speak at our event. Ian, Harry, and Mladen are all giving their own talks, while Daryl will be in more of a fireside chat format. I’ll ask him about his beginnings in basketball analytics, his interest in uh… “soccer” and his famed ‘launch and squish’ preferred style of play.

Not yet on the speaker list is REDACTED, Head of REDACTED at a Champions League club, who will be announced very soon.

Directors of Football Panel

  • Victor Orta – Director of Football (Leeds United FC)
  • James Cryne – Director and Co-Owner (Barnsley FC)
  • Will Kuntz – SVP of Soccer Operations & Assistant GM (LAFC)

We’re also excited to have three key executives from some of the most forward-thinking clubs in world football, who will discuss their experiences in driving an analytics-based culture within their organisations and how the industry is changing.

Research Paper Competition

Following an influx of highly impressive submissions for the research paper competition, we are pleased to announce the following speakers and the title of the papers that they’ll be presenting on the Research Stage:

  • Devin Pleuler – Tempo: another expectation model?
  • Maaike Van Roy, et al – Optimally disrupting opponent build-ups
  • Javier M. Buldú – The quest for the right pass: quantifying player’s decision making
  • Hadi Sotudeh – Introduction of potential counter-attack
  • Soumyajit Bose and Manas Saraswat – Turning with the ball and decision-making under pressure
  • Samer Fatayri, Kirill Serykh, Egor Gumin – How to save efficiently: what drives the goalkeepers’ decisions?
  • Max Odenheimer – Splitting GSAA: finding the best shot-stopper for your team
  • Juan Camilo Campos – Determining the phases of the play using graph convolutional networks

COVID Info

What measures are we putting in place? The event will feel safe with reduced touchpoints and lots of ventilation. We are really excited that this is an in-person event and a chance to actually see people again.

  • All attendees must be double-jabbed or provide proof of a negative test taken at most 48 hours before the event
  • We will be providing StatsBomb branded masks upon entry – though the use of these inside the venue is up to the individual’s discretion
  • Detailed information around our Covid policy will be sent to attendees prior to the event*

Basically, the conference will be brilliant and – unlike most conferences in the football analytics space – ours is open to the public. We hope to see you there!

Tickets available here: The StatsBomb Conference Tickets, Fri 8 Oct 2021 at 09:00 | Eventbrite


 

*Stamford Bridge has their own Covid policy which we will be adhering to – the key points have been summarised above. If you’d like to read the full set of guidelines that we’ll be following before, during, and after the event, you can read them here: Stamford Bridge Covid Policy

2020 in Review… but Mostly 2021 in Preview

2020, annus horribilis. In the context of the world, this past year has been simply awful in nearly every conceivable way. In the context of StatsBomb, it’s mostly just been weird. What could have been a lost year ended up making our little data company stronger, but not in ways you’d notice unless we told you. Today I’m going to walk through this very strange year from a StatsBomb perspective and also clue you in on what is coming in 2021.

Wooo, 2020 is great!

On February 25th, we passed our sales goal for the first quarter… And then we didn’t sign another contract for 2.5 months. For a SaaS company with revenue growth baked into the cash flow, that’s a horrifying outcome. Two months of firing on all cylinders and adding new customers everywhere, and then nearly a full quarter of existential dread at a company level (and god knows what at a personal level) due to the COVID-19 pandemic.

Internally, we shifted to work from home as soon as we got back from Sloan in the second week of March. We also furloughed some of our staff later in the spring, while nearly all of StatsBomb North took salary deferrals until we could find our feet financially. We cut most of the marketing expenditure (largely content publication), tried to figure out what to collect when there were no football matches being played, and began discussing with investors how we could raise enough money to survive however long *waves hands vaguely* this was going to last.

Courses

2020 was supposed to be a big year for us at conferences and teaching courses in English and Spanish. It still ended up that way, but only with regard to courses, and only because we rapidly converted both existing courses (Intro to Analytics + Coaching and Analysing Set Pieces) and new stuff (Modern Scouting and Data Recruitment) into online versions.

The Introductory course gets rave reviews and has been taken by people all over the world at this point. It is not mathy or technical, and is appropriate for anyone who wants to learn a bit more about how the sport actually works. This includes coaches and professional footballers. Thanks to Pablo Rodriguez translating the entire thing, it is also now available en español.

Set Pieces (built by Euan Dewar and myself) had 100 people sign up for a live course taught by me in April, 70 of whom already worked in professional football, including half of the Premier League. I re-opened it for a month late in the summer due to high demand, but it won’t go live again until after Euros 2021 at the earliest, and maybe not at all. I believe that people who paid to learn set pieces from us deserve time to build their edges. Though teams haven’t made football weird yet, it’s finally trending that way.

The Set Pieces course is great, but the Recruitment course James Yorke put together might be our best work. It touches on nearly everything we have learned from six years of doing this professionally, and delivers new ideas and frameworks appropriate to both seasoned professionals and absolute beginners in the football recruitment space.

Ever since I founded it as a blog, StatsBomb has always had a mission to teach the world more about football. The courses are the best vehicle we currently have to do this, and we remain dedicated to this mission. We’ll look to translate Recruitment to Spanish and potentially all the courses into new languages in 2021.

May-June 2020

Back to the global pandemic… unlike so many other businesses in this year, we were lucky. The return of the German Bundesliga in mid-May followed by nearly all the other leagues in June signalled a return to (somewhat) normal from a sales perspective. Cash flow stabilised and then customer acquisition accelerated? So much so that in late summer, it was all we could do to keep up with signing contracts and onboarding new customers. This was a very good problem to have, especially relative to what looked possible in the March-April timeframe. Football data use was already accelerating across the industry, but not having scouts able to travel and be physically present in stadia added fuel to the fire. It’s a trend we only expect to continue in the coming years.

Development

From a tech perspective, this year was always going to be challenging. A lot of the work required to move the business from one that started collecting 9 competitions a year in 2018 to one that will collect 80+ in 21-22 was boring, technical work that had no direct visibility for customers. Cleaning up early tech debt consumed a ton of developer hours, as did adding a framework for basic match metadata that was completely owned and controlled by StatsBomb. New project work took up the rest of the time, and when you combine all of this with the difficulty of transitioning the team entirely to remote work at the developer AND DATA COLLECTOR levels, you end up with some tradeoffs. One of the obvious ones was fewer StatsBomb IQ updates than in recent years, which was mostly noticeable both to customers and consumers as less cool stuff appearing in their social media feeds and on the platform.

The good news is that we’re through the other side of a lot of that work and IQ now has more dedicated developers than ever before. CTO Thom Lawrence and most of the senior dev team have spent Q4 interviewing rafts of job applicants, and we’ve been hiring like mad men and women. It has been hugely encouraging to see the volume of talented people interested in working at StatsBomb.

Scaling from a small team of startup founders and early employees into a mid-sized dev team is often fraught with difficulty, but we’ve had two massive hiring rounds the last two years, and I continue to be surprised and impressed with the quality of people that join our company. I’ve said publicly before that hiring was always one of the things I worried about, but I worry a lot less than I used to, and credit for that is distributed across our entire company. We are a great place to work, and that is only true because of the people that work for us.

Growth

What does growth look like for StatsBomb during the pandemic year?

MRR stands for “monthly recurring revenue,” and it’s a better measure of growth for a company like StatsBomb, whose business model revolves around selling software and data as a service. The red line represents actual customer contracts, while the two other lines were projections done in January and then post-COVID in May. As you can see, we took a hit in the spring/summer like everyone else, but the recovery since that time has been dramatic. Despite a global pandemic hitting revenue for our entire customer base, we’ve managed to triple recurring revenue for a third year in a row. As mentioned above, for various reasons we released almost nothing new in 2020, which means our sales strength last year lies mostly in the data launched in 2018 (we have done some minor upgrades) and StatsBomb IQ, which has been iterated since our inception as a company in 2017. It’s worth digging into this a bit further, if only to cut through the noise from other companies and people on social media.
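To make the MRR figure concrete, here is a minimal sketch of how the metric and its year-on-year multiple could be computed. The `Contract` structure and every contract value below are invented purely for illustration; they are not StatsBomb figures, and a real calculation would also need to handle churn and mid-year start dates.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    customer: str        # hypothetical customer name
    annual_value: float  # hypothetical contract value per year


def mrr(contracts):
    """Monthly recurring revenue: annual contract values spread over 12 months."""
    return sum(c.annual_value for c in contracts) / 12


def growth_multiple(mrr_start, mrr_end):
    """How many times larger MRR is at the end of a period than at the start."""
    return mrr_end / mrr_start


# Illustrative numbers only, not real StatsBomb data.
start_of_year = [Contract("Club A", 24_000), Contract("Club B", 36_000)]
end_of_year = start_of_year + [
    Contract("Club C", 60_000),
    Contract("Federation D", 60_000),
]

mrr_start = mrr(start_of_year)  # 60,000 a year is 5,000 a month
mrr_end = mrr(end_of_year)      # 180,000 a year is 15,000 a month
print(growth_multiple(mrr_start, mrr_end))  # 3.0, i.e. "tripled"
```

The point of using a monthly figure is that it tracks the contracted run rate, so a strong sales month shows up immediately rather than being smeared across an annual total.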

We created our own data because we felt there was both a need and a desire for better quality data in football. It was necessary to produce better information and predictions around recruitment, opposition analysis, and to more closely match with coach expectations for how they look at the game.

 

With zero fuzzy nonsense about how artificial intelligence allegedly delivers better insights about the game, StatsBomb Data delivers better information across the full data set and at the individual chance level than any competitor in the space. Obviously I run the company, so you can expect me to say things like this, but detailed outside analysis agrees. Almost regardless of your modelling approach, better data yields better results. If anyone tries to tell you otherwise, they are wrong.

Better information is important in recruitment. It’s important in fantasy sports. It’s important in team analysis. And it’s important in gambling. It’s now clear in the football world that if you care at all about the quality of information you produce in these areas, you need to be on our data. And as you can see from our growth numbers, the market has responded to this.

However… football is a huge sport – we’re just scratching the surface for how big we can grow. From a customer perspective, we’re growing quickly almost everywhere now. We have paying customers at the top of the Champions League right down to English League Two. We have customers in every country in the big 5 and dotted around smaller leagues as well. We also work with the world’s number 1 ranked men’s national team in Belgium, and federation interest is increasing rapidly. But again, given all the countries in the world that play football, we are just getting started.

Note: I want to say a brief thank you to our investors as a whole, and especially Matthew Lubman and Cristian Cibrario. Luckily, they did not need to jump in and save us back in the spring when the entire world stopped playing football, but they were willing to do so, and that by itself deserves a lot of credit.

New Stuff in 2020

So I mentioned a lot of effort in 2020 went toward new projects, but projects that no one has seen thus far – it’s probably useful to fill in some holes on what they are.

  1. Significant 2020 effort went into building the infrastructure and collection software around Live data collection. This spring we will begin to deliver live data streams for a small set of leagues that will gradually expand over time. We had hoped to do that this past autumn, but the impact the pandemic had on hiring and training new collectors made it impossible to staff in a timely manner. Later in the year, we will deliver a Live IQ toolset designed to work with this new data and unlock insights and visualisation for media and gambling customers. We have already incorporated our best-in-class offline expected goals model into Live, and have one more significant upgrade we will deliver in the Spring that will make this product truly compelling alongside vis work designed for the media landscape.
  2. So much Data Science. We started partly as a data science company, and then turned into a data company, which, due to resource conflicts, meant that delivery of new insights based on data science suffered for a while. In December, we hired our fourth data scientist, and they all have so much lost time to make up for. Part of their focus will be taking a lot of the models developed behind the scenes and getting them into the IQ product and API, but most of the focus of this team will be breaking ground on new research. Expect to hear more about some of the recent projects in March.
  3. American Football. We’ve quietly been working on a second sport in 2020 and expect to deliver that product to the other football world in summer 2021. I don’t want to give away much detail until we are ready to put this new data in the hands of customers, but I will say that I’m really happy with what we’ve ended up with from a data perspective. And I should be, since I’ve spent about 3-4 man months of my own work time from 2020 focused on this product.
  4. Finally, you get the big one. Which is mostly REDACTED until the worldwide product launch in March 2021. Here is what I can tell you:

 

  • It is an entirely new data product for football/soccer.
  • It’s called StatsBomb 360.
  • To my knowledge, no other company currently delivers a product like this.
  • In fact, I’m not sure the industry even realised this was a product. But it is now.
  • Current StatsBomb customers will hear a lot more about 360 Data starting in mid-January.
  • We will deliver around 3000 matches of data for this new product in the 20-21 season. That will then scale to 38 competitions worth for 21-22. (For comparison, base SB Data will have 80+ for that season. Given we didn’t cover that many leagues until our third full season of data collection, it is a huge scale up to get this product running at that capacity.)
  • With zero hype, I believe this will fundamentally change how people analyse the game. I said the same at our initial data launch in 2018, and this new data is at least as significant as that, if not more so.

So that’s three new products in market in 2021, on the back of our third year in a row of massive MRR growth. We want to become the best sports data company. 2021 will go a long way toward realising that goal.

Conclusion

2020 has been hard. Despite what sometimes felt like the unbearable weight of the world around us, I am incredibly proud of what our team managed to accomplish this year. Even if the progress wasn’t obvious to the outside world, what we’ve done in 2020 has us delivering over 600 matches a week to current customers, and has set us up for a big year in 2021. Hopefully we will celebrate everyone’s success and a return to normalcy at the StatsBomb Conference in October 2021. Or online at the new product launch in March 2021? Actually yeah, let’s do that… check our social media for the launch day and time for StatsBomb 360. You are all officially invited.

Until then, all the best,

–Ted Knutson

CEO, StatsBomb

Ted@StatsBomb.com

 

 

 

 

PostScript: StatsBomb Internal Hackathon vis or illegal soccer rave?

Hackathon prototype or new concept around air hockey?

 

Arsenal: Season Preview 2020-21

A manager change. A global pandemic. A 10th place finish in expected goal difference. Another FA Cup trophy at the end of it. And then the release of seemingly the entire scouting department plus the firing of Head of Football Raul Sanllehi to boot. What to even make of the 19-20 season from Arsenal?

Recruitment last summer was “fine,” if overpriced. Wide forward had long been a need, and Nicolas Pepe filled it adequately, even though Arsenal probably overpaid for him by £25M. Kieran Tierney took ages to get healthy and settled, but once he started playing, he quickly became a fan favourite. 18-year-old William Saliba was purchased and then immediately sent back to Saint-Etienne to continue his education. And David Luiz was brought over from Chelsea as a rich man’s Shkodran Mustafi, in the hope that Unai Emery would never be forced to play both of them together on match day.

Combine those with the younger signings from a year before in Torreira, Bernd Leno, and Matteo Guendouzi, and it felt like Arsenal fans might have a reason to be optimistic.

Except… well.

My worry with the hire of Unai Emery was that Arsenal would sacrifice the funky attacking patterns that were a hallmark of Arsene Wenger’s era in exchange for stabilising the defence. The reality was that Emery never stabilised the defence while the attack did indeed become a hell of a lot less fun.

Along with frustrations over recruitment toward the end of the Wenger era, including a complete lack of inbound young attacking talent, one of the points I had hoped Arsenal would address with a new manager was to become more aggressive in the high press. The low block and occasional pressing style under Steve Bould (from the late Wenger years) was just effective enough for fourth, but no more than that. My hope was that Arsenal would dial up the pressure and move up the table as well.

Unfortunately under Emery, Arsenal not only forgot how to dominate the ball, they also forgot how to dominate… anything.

 

The lack of an elite defence (see Manchester United) combined with relative dross on the attacking end, meant Unai Emery was shown the door after a string of poor results in November. Speaking to one club insider this past summer, they stated that the writing was already on the wall for Emery after season 1, which made it even more strange that Sanllehi wanted to extend his contract. They also felt the summertime splash in the transfer market made little sense with a lame duck manager, which isn’t wholly unfair.

Reading between the lines, the state of Arsenal’s upper management in recent years can be described as “bumpy” at a minimum, and ranges to “complete flaming chaos with a side of potential corruption” for those who were both present and particularly descriptive of the situation.

Enter Mikel Arteta

Allegedly, Arteta was nearly hired to succeed Arsene Wenger before a change of heart saw Emery take the reins in summer 2018. A season and a half later, Arteta entered the club with his first chance at being a head coach. Before moving forward, however, it makes a little bit of sense to step back.

In spring of 2017, Arsenal did a very quiet set of interviews with top head coach candidates in preparation for Wenger leaving the club. On the potentials list were a variety of names, including Thomas Tuchel and Roger Schmidt. (I heard Marcelino was mooted as well, but I don’t know anything beyond rumour there.) Tuchel would later replace Emery at PSG and take them to the CL Final this summer, while Schmidt finished out his time in China before taking a break from football, and was recently hired to coach similarly-abbreviated but not-remotely-the-same-stature club PSV. I note this because at one time it seemed fairly clear Arsenal were looking for tactical styles that included a regular high press.

In addition to playing under Arsene Wenger, Mikel Arteta spent three seasons coaching and learning from Pep Guardiola, not only one of the greatest coaches in modern football, but also an advocate of the… you guessed it… high press playing style.

The problem for coaches taking over mid-season is that it’s really hard to find the training time to teach defensive structure changes. This is especially true when the team they take over are also involved in Europa League play and make deep runs in domestic cups – there just isn’t enough time on the training pitch to completely revamp the principles. So it wasn’t entirely surprising that Arteta’s Arsenal team played similarly to Emeryball, with perhaps a bit better tactical understanding.

This is especially true when you consider that most of the post-pandemic run-in consisted of games where the results didn’t matter, at least for Arsenal. Combine a potential lack of motivation in the league run-in (and a focus on the FA Cup) with four red cards in the 21 matches after Arteta took over, and you get a sense that the headline averages don’t tell the whole story.

What was slightly surprising was that the team started to put up honest-to-god results against Big 6 teams for the first time in… well, seemingly forever. Arsenal beat Liverpool in the league, then Man City and Chelsea in the FA Cup before defeating Liverpool on penalties in the Community Shield. This was new. This was different. Even if the process wasn’t exactly sterling – Arsenal were still losing the expected goals battles – the results were… good?

But are they sustainable? Ahhhh, there is the rub.

How do Arsenal get better in 2020-21?

A) Recruitment

Arsenal’s squad have been an army of misfit toys for a while now. They have talent, but the elite players seem to play the same positions (Aubameyang and Lacazette), and there are almost no peak age players (24-27) in the squad. From a squad building perspective, it has been messy for years.

Possibly the biggest need in the squad this summer was retooling the centreback rotation. If Guardiola’s style is anything to go by, Arteta will require ball-playing CBs with pace. William Saliba is extremely young, but the potential answer to part of that equation. The addition of Gabriel Magalhaes from Lille looks to be another potential answer. However, given the ages, my guess is that these two will rarely play together as a pairing this year.

Which leaves you with the rest of the centrebacks. Mustafi looked mostly competent post-pandemic right up until he reverted to his traditional Arsenal form of calamitous mistakes in the last few matches. If calamity is the baseline, the brief purple patch has to be considered an outlier. Relying on him is not a thing most coaches would do by choice. Sokratis has been deemed surplus to requirements, and given his age when he arrived from Dortmund, was always a stopgap measure. Pablo Mari looks to be a man mountain, but injury in the opening minutes of the season’s return meant we have basically nothing further to evaluate him on aside from being large and left-footed. It’s a start, at least?

Rob Holding is likely to be a loanee this season, lacking both the pace and the passing to be a top tier CB, though he’s a serviceable Premier League one and was bought for a pittance from Bolton.

Calum Chambers is… still under contract.

That leaves us with the variance of David Luiz. Luiz’s tenure at Arsenal has been plagued largely by mistakes and red cards, which is why it came as a surprise that he signed a contract extension early in the summer and looks set for another season at the club. Fans were left muttering, “Kia taketh, but when doth he give?” Willian on a free was presumably not the hoped for answer.

While Gabriel and Saliba are probably the future, the present looks somewhat uncertain.

Equally uncertain is the midfield composition. Convincing Dani Ceballos to return was vital – he was Arsenal’s most important midfielder after the pandemic return, and offers versatility as well as top quality production. Xhaka has the passing but lacks the legs to be one of the elite players at his position, and then you hit… well, we don’t know yet.

Torreira is likely to be sold, Guendouzi as well, and Arsenal wish Ozil would wriggle himself into a move anywhere but here. Which leaves the cupboard fairly threadbare when it comes to central midfielders. Outgoings could fund a high level incoming like Aouar (which would be a fairly impressive recruitment coup), but even adding a player of that calibre means quite a few minutes for Joe Willock to gain experience. Maitland-Niles looks like RB1 if Bellerin leaves to pursue trophies and fashion in Paris, which leaves Bukayo Saka as a pacey flex-8 and almost no one else.

If Arsenal do sign Aouar, they’ll hope for a return of his 20-year-old season under Bruno Genesio.

 

My thoughts on Thomas Partey I will revisit in a postscript.

The Kids

Signing Saka to a long-term extension was extremely important for both the Gunners and their fanbase. The academy production line has gone from almost nothing five years previous to churning out attacking talent that is only a small step below the best academies in the world. Saka broke through and regularly looked like one of the better players on the pitch playing in the Premier League at age 18. Losing that type of talent would have been heartbreaking, even if Arsenal don’t quite know his best position yet.

Nketiah (21) isn’t quite ready to lead the line, but he’s probably only a season or two off and also needs lots of minutes to finish his development. Reiss Nelson (only 20) is more of a question mark, and this is especially true with the signing of Willian. My guess is Arteta felt he wasn’t going to be good enough to be the second option wide right, which means Nelson probably goes into the Sales/Loans pile to generate enough funds to fill out the squad, but I could be wrong.

Before his injury, Gabriel Martinelli looked like a wonderkid, and it is hoped he’ll continue to develop into an elite attacker even if he probably won’t be back in action until the winter.

Arsenal’s attacking youth are a bit too young and a bit raw, but are already good enough to merit either decent minutes in the league and cup competitions, or to generate enough sales revenue to fund additional transfers.

B) Defensive Style

“No playmaker in the world can be as good as good counter-pressing.” – Jurgen Klopp

One of the ways Arsenal can get better as a team is by playing a better defensive style. High pressure is related to both fewer expected goals against and to generating additional expected goals in attack. Arsenal as a whole lack some creativity in attack, and I’m not sure they are going to be able to fix that in the transfer market this window.

However, in the absence of elite creative players, gegenpressing and learning to transition more and better can still help generate additional, almost free goals. The test here will be whether Arsenal have the personnel to do it, and whether Arteta can get them playing it well in time for the new season.

Even if they aren’t elite at it to start (and neither Klopp’s LFC nor Pochettino’s Spurs were), committing to this style and improving at it over the course of the season is a useful goal unto itself. And potentially necessary to help generate more goals from a team that doesn’t look like it will add creative firepower until potentially January at the earliest.

C) Set Pieces

These went from a mild success under Emery to a moderate disaster under Arteta both in attack and defence. Andreas Georgson has been brought in to help improve this phase of the game and I wish him all the best. Even if he makes it so I stop wincing every time Arsenal face a defensive corner, I will consider this a success.

D) Behaviour With a Lead

This one is subtle, but it was a massive problem throughout the course of last season. Arsenal had the fourth worst expected goals difference in the league when they had a lead.

All of the top 6 teams had a positive expected goal difference with the lead. The best teams look to extend leads, not protect them. Protecting leads is a recipe for 14 draws in 38 league matches and another likely midtable finish.

To put this another way – when playing with a one-goal lead, Arsenal became relegation candidates. They had 36% of the expected goals, 32% of the total shots, but still managed 47% of the goals. Luck? Talent? Bit of both probably, with a strong worry that it won’t be repeatable should they try it again.
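For anyone who wants to run this kind of game-state split on their own data, here is a rough sketch of the share calculation behind those percentages. The totals below are placeholder values picked only so that the output matches the figures quoted above; they are not Arsenal’s actual numbers, and the dictionary layout is an assumption rather than any StatsBomb format.

```python
def share(team_value, opponent_value):
    """Team's share of a two-sided total, as a fraction between 0 and 1."""
    total = team_value + opponent_value
    if total == 0:
        raise ValueError("no events recorded in this game state")
    return team_value / total


# Hypothetical totals accumulated only while the team held a one-goal lead.
# Placeholder values chosen to reproduce the percentages in the text.
leading = {
    "xg": (9.0, 16.0),   # (team, opponents)
    "shots": (80, 170),
    "goals": (8, 9),
}

for metric, (team, opp) in leading.items():
    print(f"{metric}: {share(team, opp):.0%}")
# xg: 36%
# shots: 32%
# goals: 47%
```

The gap between the goals share and the xG share is the part to worry about: when results run that far ahead of the underlying chances, the usual expectation is that they drift back toward the xG figure over a larger sample.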

This is typically a tactical choice, and one that needs to change if Arsenal are going to challenge for anything useful this season.

Conclusion

An optimist would look at the FA Cup trophy and numerous victories against the Top 6 as a positive sign. They will also take hope in the idea that Arteta should put in a similar defensive system to what Pep runs at City, even if they are cautious that Arsenal may not have the players to execute that style particularly well for at least another year.

A pessimist will look at the average numbers on both sides of the ball under Emery and Arteta, and conclude that Arsenal have not been Actually Good since 2016-17 and there are no obvious signs of getting better. They will also look at the mess at the top of the club and mixed bag of recruitment as further proof that Arsenal are not ready to pull out of their slide into mediocrity just yet.

Myself, I lean toward optimism. I think the results suggest a better process is on the way, and that’s backed up by my eye test when I watch this team execute tactically now versus what it was like under Unai Emery.

A top 4 challenge is probably a little less than a coinflip, especially with the firepower Chelsea have brought in this summer, but if better tactical choices are made, a return to the top 6 is fairly likely.

–Ted Knutson

@mixedknuts

Post Script

I moved this down here because it requires some time and some nuance to explain and I didn’t want to clutter up the preview with additional nonsense.

If you prefer the podcast version of this argument, you can find it here: https://soundcloud.com/statsbomb-pod/statsbomb-podcast-july-1-q-a

I said this back in June and I stand by it. I also followed up that tweet with additional reasoning.

First, “free” isn’t free. In many cases free simply means taking 30-50% of the transfer fee a team would normally pay and adding it to the player’s wage packet. So instead of Partey making £200K a week, he now makes 300 or 350K a week. Given his age – he would be starting his contract at 27 – this means Arsenal would be taking a HUGE risk if Partey didn’t work out.

When it comes to squadbuilding, taking big risks on older players with giant wages and potentially low resale value is a terrible idea. Occasionally it works out – Aubameyang at Arsenal is one rare example and someone I would have been happy to buy at £50M but much less happy with at £70M – but mostly it doesn’t and it means you are stuck with a player that has declining output and who may become effectively unmoveable.

(There are a million zillion examples of this, but the one that always strikes me is how often Manchester City did this until a couple of years ago. Fernandinho worked out alright, but a ton of those other guys did not (Navas, Negredo, Bony, Nolito), and the failure to keep younger fullbacks around basically cost City a title in Pep’s first season at the club.)

Beyond the age, and the wages, and the squadbuilding principles, I just don’t see good data-based reasons to be excited about Partey.

If you’re splashing the cash, you really want scouting and data to both be excited about a player. That’s not output I want to spunk £50M and big wages on. Or big wages and a 5-year contract.

He was amazing against Barcelona! Great, then why wasn’t he amazing against all the other teams in La Liga last season?

Atletico are weird tactically, and his stats are not reflective of how he will play elsewhere. Again, fine… but we can ONLY judge his output from Atletico because that’s the only place he’s played since 2015. This increases the risk that the transfer goes wrong.

He’s an elite defensive midfielder! His output looks nothing like an elite defensive midfielder.

He’s NOT a defensive midfielder, he’s a box-to-box midfielder who is great on the ball. Okay, but he also doesn’t score goals or create goals for his teammates, so what are you actually paying for?

Transfer shopping is all about correctly evaluating and mitigating risk. If you are going to spend big, there needs to be very little risk something doesn’t work out, and hopefully some upside involved as well. Liverpool don’t make mistakes on transfers. That’s why they are where they are now, despite finishing 8th as recently as 15-16. If you buy young guys and they don’t work out, at least you can sell them on at a small loss and roll the dice again (Lucas Torreira). But if you buy/sign an older player on big wages and they don’t work out, you are dead. You either have to eat a ton of their wages to move them on, or they stick around until the end of their contract with declining production.

My perspective on Arsenal and transfers has been the same for years now. I think they should not be building to try and finish 4th and make the Champions League.

I think they should be building a squad to try and finish first, even if that’s one or two years down the line. And the way you do that is not via signing 27 and 28yos… you do it via taking some risk on 21-24 year olds, hoping your analysis is good, and letting them mature into elite players while improving the style of play.

The Invincibles Project and Classics Data Pack 1

As those of you who follow me on social media are aware, earlier this year we started working on The Invincibles Project. The idea behind this was to collect all of the data from this historic season to be able to look at it through a modern lens. I had initially pitched this as a follow-up project after the Messi Data Biography as something different, and another way of unlocking football’s history.

As an Arsenal fan, I found the whole thing exciting. Prime Thierry Henry! Doing things like this:

 

The majesty of Robert Pires. Taking bodies!

 

 

Dennis Bergkamp! Patrick Vieira! Jose Antonio Reyes! Kolo kolo Toure! Sol Campbell! Mad Jens!

*Highbury roars*

OMG SO EXCITING.

Cashley.

*crickets chirping*

Also as an Arsenal fan, I know that other Arsenal fans could use a little joy in their lives and this seemed like the only way we were getting anything fun out of the Gunners in 2019-20.

We started collecting this with an eye to releasing it side by side with the data set from a different red team, should they manage to finish their season undefeated. Sorry Liverpool fans, due to circumstances beyond our control, that data release slipped through our fingers. You’ll have to settle for merely a league title and one of the largest title winning margins in history.

The Problem

In order to collect data, we need to have video. It was fortunate for us that Lionel Messi has played his entire career for Barcelona, because that is one of the few teams in the world that has historic video available on the internet from pre-2010 without needing to jump through a million hoops. That doesn’t mean that getting all of the video to reconstruct Messi’s club career was easy – far from it. It was merely doable.

Arsenal? The only undefeated season in Premier League history? You would think this would be at least as simple as sourcing 15 seasons of Messi, right?

It was not.

We managed to get about half the 2003-04 season from the usual sources of football video history. And then we hit a wall. Our man in Spain and historic video expert Pablo Rodriguez then went to work, checking with various and sundry collectors that he knows who have large archives of historic, important football video. Through these wonderful people and the standard exchange of goods and services we were able to get to 32 matches of video. And then we hit another wall.

Why? Well as Andrew Mangan of Arseblog reminded me, not all matches during that time period were broadcast to TV. In the modern day, every Premier League match is broadcast to air in multiple countries, which makes it easy to grab that video and store it away on a giant hard drive. Back then? A number of 3PM matches on Saturdays were simply never broadcast. (At least to our knowledge.) Which means that the collectors would not have that video unless they somehow tapped into different sources.

We checked with Arsenal. I’ve been lucky enough to meet people that work for the club over the years, and we figured maybe they would let us have access to the video to collaborate on the data release and some cool stuff with club media. And they totally would have…

Except they didn’t have the video either.

Someone who worked for Prozone back in the day suggested that the opponents might have those videos, as they would have been delivered by courier as part of their service. But that ran into a variety of snags, including the fact that football clubs change personnel on this end with remarkable regularity, and having the archive, being able to access it, and even knowing who to talk to was insurmountable for us.

The other problem here is the transition from analog to digital. Pretty much all archives back then were tape archives that would later need to be digitised so the match would be preserved for history. Rob Bateman of Opta tells the tale of trying to collect old Premier League matches from the 90s and being surrounded by crumbling video tape from the league’s first decade. These Arsenal matches came right at the tail end of that period, and my understanding is that the PL has started to archive its history as much as possible, but it’s still very much a work in progress.

Finally you hit the problem of a license fee. We got in touch with the archive service with a willingness to pay a fee to obtain the final six matches needed to complete the project. We were quoted a figure to license the video for the entire Arsenal season that frankly didn’t make any sense to me, and certainly eclipsed my budget for a public service project.

I wanted to get everyone a data gift to bring people some joy during the pandemic, but I didn’t want to/could not pay the price of a car to make that happen.

The Premier League itself actually showed willingness to help us out, but as you can understand, they are rather busy with other priorities right now (like restarting the league during the middle of a viral pandemic) and suggested maybe we could revisit this when the world isn’t quite so mad? Which totally makes sense.

But I have an anniversary data release deadline, and thus here we are.

Incomplete Invincibles.

Classics Data Pack 1

To make up for my own disappointment in not being able to complete this project, I added some extra matches I thought might interest people, including non-Arsenal fans. So what you are getting today as a gift from StatsBomb is a hefty little slice of football history, wrapped in the above-named package. In addition to delivering 32 of 38 matches from the Arsenal 2003-04 Premier League season, we are also giving you UEFA Champions League Finals data from 2000-2019. The collection on those CL matches isn’t all finished, so they will trickle out to the repository gradually over the next week to complete the set.

Thank you to all of the fans out there who have supported StatsBomb over the years. Thank you to our customers who buy our products and give us feedback to make us better every day.

And thanks to Arsenal for a truly magnificent season and set of memories. It would be great if we could get some more of those sooner rather than later. Information on how to access the data is here: https://statsbomb.com/academy/

A complete primer (in English and Espanol) on how to work with the data via StatsBombR is here: https://statsbomb.com/2019/07/messi-data-release-part-1-working-with-statsbomb-data-in-r/

*EDIT: A new, updated version of the R Guide can be found here: https://statsbomb.com/wp-content/uploads/2021/11/Working-with-R.pdf
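
For anyone who would rather poke at the data in Python than R, here is a minimal sketch of a first pass over one match. It assumes you have already downloaded a match's events file and parsed it into a list of dicts. The field names used (`type.name`, `team.name`, `shot.statsbomb_xg`, `shot.outcome.name`) reflect my understanding of the event format, so check them against the data documentation before relying on them. The function name `summarise_shots` and the sample events are made up for illustration.

```python
from collections import defaultdict


def summarise_shots(events):
    """Count shots, goals and summed xG per team from a list of event dicts."""
    summary = defaultdict(lambda: {"shots": 0, "goals": 0, "xg": 0.0})
    for event in events:
        # Skip everything that is not a shot (passes, pressures, etc.).
        if event.get("type", {}).get("name") != "Shot":
            continue
        team = event["team"]["name"]
        shot = event["shot"]
        summary[team]["shots"] += 1
        summary[team]["xg"] += shot.get("statsbomb_xg", 0.0)
        if shot.get("outcome", {}).get("name") == "Goal":
            summary[team]["goals"] += 1
    return dict(summary)


# Hypothetical, hand-written events in the assumed shape (not real match data).
sample_events = [
    {"type": {"name": "Pass"}, "team": {"name": "Arsenal"}},
    {"type": {"name": "Shot"}, "team": {"name": "Arsenal"},
     "shot": {"statsbomb_xg": 0.25, "outcome": {"name": "Goal"}}},
    {"type": {"name": "Shot"}, "team": {"name": "Arsenal"},
     "shot": {"statsbomb_xg": 0.5, "outcome": {"name": "Saved"}}},
]
result = summarise_shots(sample_events)
```

With a real events file you would replace `sample_events` with the parsed JSON list for that match.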

The data comes with our standard non-commercial license that is usable for fan analysis and academic research. If you are a commercial entity that would like to use this data, get in touch with sales@statsbomb.com and we can have a conversation.

All the best,
–Ted Knutson
CEO, StatsBomb

*If we get video and I still run StatsBomb, we will finish this project.

We Want to Hire You

StatsBomb is currently experiencing explosive growth and has sailed through the startup phase straight into the scaling phase. 

In order to do that, we need great people.

Like you.

Why Should You Work At StatsBomb?

Because you love new challenges.

We are shaping the future of data in sport. That includes new technology, new visualisations, and completely new ways of thinking about the game. That is challenging work that changes on a regular basis, but if you are the type of person that loves figuring out new things, StatsBomb is a great place to be.

Stock Options.

Nearly every one of our employees receives stock options in their first year on the job. We believe strongly that our employees deserve a piece of the business they are helping to build, and our compensation plans reflect that.

Our revenues have more than doubled each of the last two years. If we can keep this up – and honestly, we are just getting started – it’s easy to see how this can turn into significant additional earnings over time.

Space to grow

One of the best things about working in a young startup that is growing is that new positions open up all the time. This means employees who excel have plenty of scope to move up in the organisation as the company grows.

Speaking of space, we are about to move into a brand new office before the end of the year, just outside the train station in Bath, fully equipped with a kitchen, free coffee, plus bicycle storage and showers.

Only 3 Days in the Office per week.

We hire highly motivated individuals, and in return we are able to offer huge work-life flexibility to our employees. Gathering the team together remains important, but most of our employees find quiet working days at home hugely valuable. Office hours are also somewhat flexible, removing much of the daily stress from commutes, school runs, etc that you get from more strict companies.

You want to work around incredibly talented people.

You can’t build a great company without a great team of people. We have that right now and already need more.  By joining our team, you get to work with some of the best people in the sports data field on a daily basis.  You also get to work with some of the biggest football clubs in the world.

Bath is gorgeous.

Bath is a UNESCO World Heritage city and one of the best cities in the UK for quality of life and work/life balance. It’s only 15 mins by train from Bristol, 30 minutes by train from Swindon, and an hour from Reading and Cardiff.

Job Openings

Our careers page is a work in progress, but expect to see new job postings frequently in the coming weeks. In addition to the 2 Junior Front End Developer roles and the Accounts Administrator role currently listed, we will also have full-time positions for

  • UI Designer
  • Digital Project Manager
  • Computer Vision Developer
  • Graphic Designer
  • Junior Data Scientist
  • Quantitative Football Analyst

If you have ever wanted to come work at StatsBomb, now is the time. Even before a job description, if you think your skill set fills these titles, please send a CV to careers@statsbomb.com. To fill these roles, you will need a valid UK work permit and to work in Bath three days a week.

Ted Knutson

CEO, StatsBomb

StatsBomb Elevates Their Industry-Leading Football Data Spec Yet Again

On May 9th, 2018 StatsBomb announced our new product, StatsBomb Data. Our football data features massive upgrades to the event data world including

  • Location of defenders and goalkeeper on every shot
  • Defensive pressures
  • Passing footedness
  • Pass Height
  • Ball Receptions

And so much more… On release, StatsBomb Data ended up with 60% more events per match than the competition. Our data is currently collected across 22 leagues and we plan to double the number of leagues we collect over the next 18 months. On the StatsBomb IQ side, we spent much of the last year unlocking the power of StatsBomb Data inside our analytics platform. Customers now have information about player and team defensive pressures where none existed before.

We also released an entire module focused on objective information that helps evaluate Goalkeepers, previously a problematic area of player analysis. Are you going to buy or sell a goalkeeper this summer? Then you really need to be on StatsBomb IQ. StatsBomb Data represents a paradigm shift in the football data industry.

Having been around this industry since 2013, I can tell you that you almost never see significant upgrades in event data specs, but we packaged a decade’s worth of innovations into our launch product. But that was what we did last year… What have we done for you lately? Not content to already have the best data in this space, we introduced new upgrades.

Shot Impact Height

You know those crosses that are too high, but the attacker goes for the shot anyway and it glances off the top of his head as it’s vaguely looped toward goal? Those look the same in the data as a standing header with perfect contact. They won’t look the same with StatsBomb Data. We have added a z-coordinate to the start of shots so you’ll be able to tell at what height the shooter made contact. By doing this, we get more useful information about each individual chance and another small variable that we think will improve expected goal model performance. Those of you out there whose jobs do not involve improving the performance of expected goals models are probably like, “Whatever! This is boooooring.” I feel your pain. So how about this?

Goalkeeper Ragdolls

Introducing GK position information on shots has paid huge dividends when it comes to evaluating individual GK performance and positioning. However, we looked at what we were collecting and found a way to improve the information provided about goalkeepers in a massive way.

 

 

These are officially termed ragdolls because they are based on the dolls you see in ragdoll physics demos, but throughout design and development we have affectionately nicknamed them skellingtons. They capture the GK position at the start of a shot and at the point of a save/potential save in a way no company ever has before. And we will capture this information on every shot in every league we collect, from the English Premier League all the way down to League Two. These new upgrades plus a couple of other minor ones rolled out at the same time will give our data set twice as much information per game as our competitors. There is no extra charge on the new stuff to StatsBomb Data customers.

These upgrades to the data spec will start rolling out as part of our normal data delivery in March, and will extend backwards through all of our historic data. I think it’s been clear from the start that we’re a bit different from the other data companies out there. Our mission is to find innovative new ways to analyse and visualise the game, and provide our customers an edge over the competition. Watch this space – we’re just getting started.

Ted Knutson
CEO, Co-Founder
StatsBomb
ted@statsbomb.com

Introducing Goalkeeper Radars

If you pay attention to our social media, you know that we recently released the new goalkeeper (GK) module on our analytics platform StatsBomb IQ. This past weekend, phase 2 of the module went live, and included in that release were an awful lot of things, not least of which were the long-awaited GK radars.

Today I’m going to discuss what we’ve done with the GK metrics, why they differ from what you might see elsewhere, and why this is something people in football really need to care about. (Note: For those of you who want to know more about the framework we have chosen to analyse GKs, please check out my intro piece here.)

StatsBomb Data is Different

I have been working with player data in football since 2013, but I never bothered to do much work with GK data. It’s not that I didn’t think GKs were important – obviously they are. The problem was that I felt the data we had access to didn’t add much insight into the job GKs actually do. Primary jobs for GKs consist of:

  1. Stopping shots
  2. Claiming crosses and high balls
  3. Distribution

When I was designing the data spec for our new data, I went around to most of the smart people I know in football and asked them how we could improve football data without widespread tracking data. We ended up with a long list of upgrades to what our competitors offer, but probably the most important element across everyone’s list was the position of the GK on every shot. And the reason for this was that a big part of the GK’s job is simply being in the right place to have the best chance of saving any particular shot.

Think of what you often hear in commentary when David de Gea is playing.

“It’s not really a save, the ball just hit him and bounced off.”

“Another shot right at him.”

“Great reflex save from de Gea, but again the ball was right at him.”

Being in the right position to make saves for a keeper is a huge skill, but you can’t measure that if you don’t have the data.

So we collected it, along with the position of all the defenders in the frame when a shot is taken, and we call them Freeze Frames.

 

 

(Credit for all the data science heavy lifting in the GK Module goes to Derrick Yam, who did great work on this one.)

Once we had enough shots, we were then able to investigate where GKs generally should be positioned on shots from any particular location in order to make a save and put that information into a model. We then use that model to evaluate each GK on each shot and produce two shot stopping metrics.

GSAA% – Goals Saved Above Average Percentage: How the Goalkeeper performed versus expectation. Calculated as: (PSxG – Goals)/Shots Faced

Positioning Error – How far from the optimal position for facing a shot the Goalkeeper is (on average).
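
As a rough illustration of the two shot stopping metrics, here is a minimal sketch. The GSAA% function follows the formula given above, with the result multiplied by 100 to express it as a percentage (my assumption about how the figure is displayed). The positioning error function assumes "how far" means straight-line distance between the keeper's actual spot and the model's optimal spot, averaged over shots. The function names and the example numbers are hypothetical.

```python
import math


def gsaa_pct(psxg, goals_conceded, shots_faced):
    """(PSxG - Goals) / Shots Faced, expressed as a percentage."""
    if shots_faced <= 0:
        raise ValueError("shots_faced must be positive")
    return (psxg - goals_conceded) / shots_faced * 100


def mean_positioning_error(actual_positions, optimal_positions):
    """Average straight-line distance between actual and optimal GK positions."""
    if not actual_positions or len(actual_positions) != len(optimal_positions):
        raise ValueError("need two non-empty position lists of equal length")
    distances = [
        math.hypot(ax - ox, ay - oy)
        for (ax, ay), (ox, oy) in zip(actual_positions, optimal_positions)
    ]
    return sum(distances) / len(distances)


# Hypothetical keeper: faced 100 shots worth 30 PSxG and conceded 25.
example_gsaa = gsaa_pct(psxg=30.0, goals_conceded=25, shots_faced=100)

# Hypothetical positions: spot on for one shot, 5 units off for the other.
example_error = mean_positioning_error([(1, 40), (2, 38)], [(1, 40), (5, 42)])
```

A positive GSAA% means the keeper conceded fewer goals than the post-shot model expected. For positioning error, lower is better.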

The next two metrics we produced focus on GK activity around the box.

CCAA% tries to answer how active are GKs at gathering claimables – high balls and crosses into the box that could be claimed.

The claimables model first defines the likelihood of a pass from and to a particular location being claimed and then evaluates GKs based on their activity. (This is made easier because StatsBomb Data also includes pass height as you wouldn’t generally expect GKs to claim ground passes.) Busy GKs that come off their line to claim lower xCL balls are graded higher than those who are consistently rooted to the goal line. The reason is that claims have some level of value in cutting out opposition chances, and GKs can be rewarded and penalised based on this activity.

(Note: There are a lot of additional technical details behind the scenes here that are only available to StatsBomb IQ customers right now.)
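
Since the full details are not public, the following is only a sketch of how a "claims above average" style number could be put together. It assumes each claimable ball comes with a model probability of being claimed, and that the metric is (actual claims minus expected claims) divided by claimables, by analogy with GSAA%. That formula, the function name `ccaa_pct`, and the sample values are all my assumptions, not the actual CCAA% calculation.

```python
def ccaa_pct(claimables):
    """Claims above expectation per claimable ball, as a percentage.

    claimables: list of (claim_probability, was_claimed) pairs, where
    claim_probability is the model's likelihood that the ball is claimed.
    """
    if not claimables:
        raise ValueError("need at least one claimable ball")
    expected_claims = sum(probability for probability, _ in claimables)
    actual_claims = sum(1 for _, was_claimed in claimables if was_claimed)
    return (actual_claims - expected_claims) / len(claimables) * 100


# Hypothetical keeper: four claimable balls, two of them claimed.
busy_keeper = ccaa_pct([(0.2, True), (0.1, False), (0.5, True), (0.2, False)])
```

Under this sketch, claiming a ball the model rates as unlikely to be claimed adds more to the total than claiming an easy one, which matches the idea that busy keepers who come for the harder balls grade higher.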

For GK Aggressive Distance we wanted to look at how active are GKs generally at moving off of their goal line to do football things? We investigate the distribution of the distance from goal for goalkeeper actions that are not passes, saves or claims. This includes clearances, interceptions, tackles and ball recoveries. This shows the presence a goalkeeper has further up the pitch and measures their defensive contribution in a manner more common to field players.
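
A minimal sketch of the filtering step described above: keep goalkeeper actions that are not passes, saves or claims, then measure how far from goal each one happened. It assumes a 120x80 pitch with the keeper's own goal centred at (0, 40), and action type labels such as "Clearance". Both are assumptions for illustration, as are the sample actions.

```python
import math
from statistics import mean, median

EXCLUDED_ACTIONS = {"Pass", "Save", "Claim"}
GOAL_CENTRE = (0.0, 40.0)  # assumed: own goal at x=0 on a 120x80 pitch


def aggressive_distances(actions):
    """Distances from goal for GK actions that are not passes, saves or claims.

    actions: list of (action_type, x, y) tuples.
    """
    return [
        math.hypot(x - GOAL_CENTRE[0], y - GOAL_CENTRE[1])
        for action_type, x, y in actions
        if action_type not in EXCLUDED_ACTIONS
    ]


# Hypothetical goalkeeper actions from one match.
sample_actions = [
    ("Pass", 10, 40),
    ("Clearance", 3, 44),
    ("Interception", 12, 35),
    ("Save", 1, 40),
    ("Ball Recovery", 8, 34),
]
distances = aggressive_distances(sample_actions)
typical_distance = median(distances)
average_distance = mean(distances)
```

The text talks about the distribution of these distances, so in practice you would look at more than one summary number. Mean and median are just the simplest place to start.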

Finally, you get to the distribution metrics. Admittedly, these are more stylistic profiles as opposed to telling you whether a player is strictly good or bad at a skill set, but we chose these because we liked the insight they deliver in this area. In real world analysis, we produce something like twenty different distribution metrics in this area to dig deeper.

Pass into Danger% – Percentage of Passes made where the recipient was under pressure or otherwise in Danger.

Positive Outcome Contribution – How frequently the player is involved in sequences that soon resolve with a Positive Outcome.

Combine all of those into a visual plot with the outside ring as a top 5% cutoff and the inside ring as a bottom 5% cutoff and you get this:

 

 

If you have watched these GKs quite a bit over the years, these really do feel “right” in terms of profiling their skill sets. De Gea is great at stopping shots, but doesn’t do that much with regard to coming off his line. Lloris is a solid shot stopper who remains very busy around his own penalty area.

What about Chelsea’s Kepa, who Derrick analysed early in the season as being largely average in most of our metrics?

 

 

And with our data, we now have detailed GK metrics for every league we collect, from the Premier League right down to League Two. Or MLS. Or Poland. Or your academy…
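
The ring cutoffs used for the radars (outside ring at the top 5%, inside ring at the bottom 5%) can be sketched as a simple percentile rescaling. This is an illustration of the idea rather than the production code: the function names are hypothetical, and the `higher_is_better` flag is my own addition to handle metrics such as Positioning Error where a smaller number is the good end.

```python
from statistics import quantiles


def radar_cutoffs(population_values):
    """Return the (5th percentile, 95th percentile) of a metric across all GKs."""
    # n=20 gives 19 cut points at 5%, 10%, ..., 95%.
    cuts = quantiles(population_values, n=20, method="inclusive")
    return cuts[0], cuts[-1]


def radar_position(value, low, high, higher_is_better=True):
    """Map a metric value to 0.0 (inside ring) .. 1.0 (outside ring)."""
    if high == low:
        return 0.5
    fraction = (value - low) / (high - low)
    fraction = min(1.0, max(0.0, fraction))  # clip anything beyond the cutoffs
    return fraction if higher_is_better else 1.0 - fraction


# Hypothetical population of metric values: 0, 1, 2, ..., 100.
low, high = radar_cutoffs(list(range(101)))
middling = radar_position(50, low, high)
off_the_chart = radar_position(200, low, high)
```

Clipping at the cutoffs is what stops one freakish outlier from squashing everybody else's radar into the middle.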

Goalkeeping is Unsolved

I hinted at this a little in my Barcelona presentation, but from talking to teams around the world, I get the impression very few understand goalkeeping from an analytic and training standpoint, and almost no one is closing the loop with regard to data driven coaching. I’ve been working with football data for nearly six years now, and it took us until now to build a framework we liked to evaluate GKs analytically. Because of this, there are just so many things we don’t know.

  • How do GKs age? What does the age curve look like?
  • Does shot stopping ability – which appears largely stable – increase, plateau, and decrease at certain times?
  • Are shot stopping and positional error negatively correlated to claim activity and defensive aggression?
  • How do GK skills transfer from lower quality leagues to higher ones?
  • How do they transfer across top leagues?
  • Our model thinks David de Gea saved Manchester United thirteen goals more than an average GK would have last season. Is that type of elite performance sustainable?

And that barely scratches the surface. Not knowing things in sport is dangerous. It throws a random factor into every decision you make that could be tremendously costly down the line. But ignorance becomes way more dangerous when it shifts from “no one really knows these things” to “we’re the only ones who don’t know these things.” If your opponents have better info, and you are the only sucker left on the block…

We designed StatsBomb Data to allow coaches and analysts to ask questions they never could before. And with StatsBomb IQ, we deliver powerful, easily understandable insights to answer those questions.

We’re not just here to stop teams from making mistakes, though data is super useful for that. We are here to deliver info that makes teams better in every area of the game. Recruitment, self-analysis, opposition scouting…

And now goalkeeping.

–Ted Knutson

ted@statsbomb.com

@mixedknuts

PostScript

For good or for ill, next month is the five-year anniversary of the first player radars I ever created. For those who want a design history and defense of the visualisation format, relevant links are below.

The first terrible introduction article.

Understanding Radars for Mugs and Muggles

Defending Radars CASSIS Presentation – RADAR WARS. (Also an excuse to poke fun at Luke Bornn and Daryl Morey)

New Radars on StatsBomb Data

Explaining xGChain Passing Networks

(Editor’s Note: This was originally published on the StatsBomb Services blog, but the URL was lost in a server move. We have re-published it here so it can be referenced in future work.)
Some of the work we need to do on the StatsBomb Services side involves teaching people how to use what we create. If it’s not practically applicable and/or can’t be taught, then it’s just a piece of art, not analytics.
Today I’m going to discuss passing networks, with a specific emphasis on the xGChain passing networks you’ll find on the StatsBomb IQ platform and also on our Twitter feed.

What is a Passing Network?

It’s the application of network theory and social network analysis to passing data in football. Each player is a node, and the passes between them are connections.
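
In code terms, the node-and-connection idea boils down to counting. Here is a minimal sketch that takes a list of completed passes as (passer, recipient) pairs and returns the set of nodes plus a count for each directed connection. The function name and the single-letter player names are placeholders.

```python
from collections import Counter


def build_passing_network(passes):
    """Build nodes and directed pass counts from (passer, recipient) pairs."""
    edges = Counter(passes)  # (passer, recipient) -> number of passes
    nodes = {player for pair in edges for player in pair}
    return nodes, edges


# Hypothetical passes between three players.
sample_passes = [("A", "B"), ("A", "B"), ("B", "C"), ("B", "A")]
nodes, edges = build_passing_network(sample_passes)
```

Here `edges[("A", "B")]` is the number of passes from A to B, which is the sort of count that ends up driving line thickness in the visualisation.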

The first time I saw them used in football was either a presentation by Pedro Marques of Man City at the first OptaPro Forum, or Devin Pleuler’s work at Central Winger on the MLS site.

 

 

We also used them at Brentford to do opposition analysis, specifically to find which players we might want to aggressively press whenever they get the ball, or looking at valuable connections between players we wanted to break.

The application is simple.

  1. Look at a bunch of recent matches for a club and you will often start to see patterns of play and interesting details you care about.
  2. Investigate a little further in the data to find usage information
  3. Go to the video and see what shakes out.

In many cases, analysts only have time to watch and analyse the last 3 matches of opposition on video. Using the passing networks gives them quick info in an easily digestible format that doesn’t cost them an extra 10-20 hours of video time.

Before we go any further though, I think it’s important to speak about the limitations of passing networks. These are a tool and meant to be part of an analytics suite to help you analyse games, but like any tool, you need to understand their weaknesses.

First, each node consists of the average location of a player’s touches. If they switch sides of the pitch regularly, their average will look central, even if they never touch the ball in that area. This is a limitation of the vis and why we ALWAYS use video to back stuff up. On the other hand, if you want to stay data-based, you could use things like heat maps, or even dot touch maps for every place a single player touched on the pitch to get more accuracy. This is a bit like using shot maps to supplement aggregate data in player radars to get a clearer picture.
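
The averaging problem is easy to see with a toy example. The sketch below assumes a 120x80 pitch and uses made-up touch locations for a wide player who spends half the match on each flank.

```python
from statistics import mean


def average_location(touches):
    """Average (x, y) of a player's touches, as used for a network node."""
    xs, ys = zip(*touches)
    return mean(xs), mean(ys)


# Hypothetical touches: two near one touchline (y close to 0),
# two near the other (y close to 80).
touches = [(90, 5), (95, 8), (92, 75), (93, 72)]
node = average_location(touches)  # (92.5, 40): dead centre across the pitch

# None of the actual touches were anywhere near the middle.
touched_centrally = any(30 <= y <= 50 for _, y in touches)  # False
```

The node sits in the centre of the pitch even though the player never touched the ball there, which is exactly why a heat map or touch map is the better tool for that question.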

The second limitation is that this info is an extrapolation of what actually happened. Did the fullback pass 15 times to the left wing, exactly along the path in the vis? No, of course not. That information is also easily visualized, but it’s just not contained here.

The third limitation is that these don’t actually explain that much by themselves. They take snapshots of actions through a match and combine them into a bigger picture. It’s like a movie where you only see 20 of 50 scenes without seeing the whole thing. Sometimes, you’ll end up with a clear idea of the plot. Other times, you are going to be really surprised when your friends start talking about the whole Verbal Kint/Keyser Söze thing. They are still useful, but this is another reason why – in practice – we almost always pair this analysis with video work to complete the picture.

Design Stuff

Right, so we have passing networks. Some people do them vertically. We do them horizontally.

 

 

Why?

For starters, most humans are accustomed to looking at football matches left to right. High angle tactical cam footage from behind the goal is quite useful if you can get it, but the vast majority of the audience views football in a left to right perspective.

The next thing you notice is that we stack ours on top of each other. This happened as a bit of a happy accident where I noticed a pressing team had a map very high up the pitch. I then put the map from their opponent underneath, and voila! we had a fairly clear view of territoriality in the touch maps.

If you take a step back, it seems fairly obvious, right? There are two teams on the pitch, and each of their actions impacts the other one, so visualize both together. However, actions between two teams aren’t always linked. The shot locations of one team don’t have any impact on the locations of the opponent. Passes do though, so at least in my opinion, pairing them as part of this vis makes sense.

We also have them both going the same direction, which seems to strike some people as odd. All I can tell you is I think the territory element is much clearer if they go in the same direction, but people are welcome to test their own implementations and judge for themselves.

What else do we have… ah yes, the big difference: colour.

 

 

With passing networks, there is a real danger of adding so much information that your vis basically becomes unusable. It’s an incredibly info-dense visualization to begin with, so adding more elements is likely to make understanding what you are trying to display harder instead of easier. I think Thom walked this tightrope perfectly, adding the extra xGChain layer of data while still leaving it interpretable, and to be honest, totally gorgeous.

That said, it may take looking at these a number of times before you become comfortable with what they are trying to display. The same caveat was true of radars and shot maps, and is another reason why analysis blends elements of art with data science.

The xGChain Layer

First you need to understand what the xGChain metric is, and to do that you should read Thom Lawrence’s intro piece here. So any time a player is involved in a pass in the possession, they get xGC credit, and then we sum up their involvement over the course of a match and colour their node based on that.
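
To make the summing step concrete, here is a minimal sketch. It assumes each possession has already been reduced to the list of players involved in its passes plus the xG of the shot it produced (zero if there was no shot), and that a player is credited once per possession however many times they touch the ball. Those are my simplifying assumptions. Thom's piece has the actual definition.

```python
from collections import defaultdict


def xg_chain_by_player(possessions):
    """Sum possession xG for every player involved in that possession.

    possessions: list of dicts with "players" (names involved in passes)
    and "xg" (xG generated by the possession).
    """
    totals = defaultdict(float)
    for possession in possessions:
        # set() so a player involved twice is only credited once per possession.
        for player in set(possession["players"]):
            totals[player] += possession["xg"]
    return dict(totals)


# Hypothetical possessions from one match.
sample_possessions = [
    {"players": ["A", "B", "A", "C"], "xg": 0.3},
    {"players": ["B", "C"], "xg": 0.1},
    {"players": ["A"], "xg": 0.0},
]
match_xg_chain = xg_chain_by_player(sample_possessions)
```

The per-player totals are then what gets turned into a node colour.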

Why?

Because this allows us to take the network vis beyond basic counting stats and start to examine the value of a player’s contribution to the match. Because the colour scales are tied to the 5%/95% cutoffs I started back with the radars, you also get an easy reference for whether a player’s attacking contribution was pretty great (RED), pretty poor (GREEN), or somewhere in between.

We also start to get a sense of how non-attacking players are contributing to valuable build-up play in a way that just makes sense (at least to me).

Quick Reference

  • Size of node = number of touches
  • Thickness of line = number of passes between two nodes
  • Colour of node = linear scale from green to red (.6-1.4 xGCh based on 5%/95% cutoffs)
  • Colour of line = the total xGChain of possessions featuring a pass from A->B (0-.5 values based on 5%/95% cutoffs)
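For anyone testing their own implementation, the three counted quantities in the list above could be aggregated along these lines. This is a sketch under stated assumptions: the tuple layout for passes and the flat list of touches are hypothetical inputs, edges are treated as directed (A->B separate from B->A), and a possession is counted once per edge even if the same pass combination occurs twice within it.

```python
from collections import Counter, defaultdict


def build_network(passes, touches):
    """Aggregate raw events into node sizes, edge widths and edge xGChain.

    passes:  hypothetical list of (possession_id, passer, receiver, possession_xg)
    touches: hypothetical list of player names, one entry per touch
    """
    node_size = Counter(touches)           # size of node = number of touches
    edge_count = Counter()                 # thickness of line = number of passes
    edge_possessions = defaultdict(dict)   # (passer, receiver) -> {possession_id: xg}

    for possession_id, passer, receiver, possession_xg in passes:
        edge = (passer, receiver)
        edge_count[edge] += 1
        # Keyed by possession so each possession contributes its xG once per edge.
        edge_possessions[edge][possession_id] = possession_xg

    # Colour of line = total xGChain of possessions featuring a pass A->B
    edge_xgchain = {
        edge: sum(by_possession.values())
        for edge, by_possession in edge_possessions.items()
    }
    return node_size, edge_count, edge_xgchain


passes = [
    (1, "A", "B", 0.25),
    (1, "A", "B", 0.25),
    (2, "A", "B", 0.5),
    (2, "B", "C", 0.5),
]
touches = ["A", "A", "B", "C", "A"]
sizes, widths, colours = build_network(passes, touches)
print(sizes, widths, colours)
```

The edge xGChain totals would then be mapped to a colour using the 0 to .5 cutoffs given for lines, in the same way as the node values.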

We Still Use Numbers

On Twitter, you will generally see just the visualization. This is mostly due to the limited, bite-size nature of the format. However, on the StatsBomb IQ app, Passing Networks also include all the individual and combination numbers you see below.

 

 

The combination of the vis and the numbers represents the whole of the analysis. The vis gives you basics, the numbers specifics, but both are still constrained by the limitations of this visualization format.

Examples

 

 

In this one you can see Liverpool pushed quite far forward, had massive amounts of possession, and created reasonable chances. Pretty much everyone is involved, but Coutinho and Lallana only put up good, not great, xGChain numbers for the match. On the Swansea side, Llorente is the only guy up high most of the time, while he and Wayne Routledge both put up big numbers for the game, and Swansea came away with a vital win.

 

 

Just a single plot this time, from Liverpool’s trip to Bournemouth earlier in the season, mostly to compare same-team performance. Here Firmino is posted out wide instead of central, and had comparatively little impact in creating big scoring chances for LFC that match. Normally he’s a fiery red circle, but for this match he’s an ineffective green. That’s another cool thing these plots allow: instead of focusing on the full match, you can isolate one player across a number of positions and games and see what it does to their performance.

 

 

I posted this one because both teams’ maps are pretty incredible. City’s front three have average touches nearly on the 18, and nearly everyone except Claudio Bravo is red or orange. Meanwhile Boro had almost none of the ball and created almost nothing as well. The match ended 1-1, with Boro scoring a very late equalizer. 90% of the time our simulations think City win that match.

 

 

It’s always fascinating to see what happens to these maps when two elite teams square off. This is from the 1-0 Dortmund home win earlier this season. Bayern dominated the touches, but Dortmund just edged them in xG, 1.40 to 1.24. Aubameyang was rampant the entire game, and every time Dortmund touched the ball, they felt dangerous while doing a pretty good job of stymying Bayern’s great attackers.

How Do You Use This Inside a Professional Football Club?

Typically I would take the passing networks from the next opposition’s last 10 matches and divide them into home and away games. Stick the numbers next to each of them for reference, and start to look for patterns.

Which players provide the engine for plan A when this team attacks?

Which players have the most valuable touches?

Does their fullback tend to get really high in possession and can we play behind them?

Which players should we look at for potential pressing triggers?

If we have a choice, which center back would we allow to play the ball forward?

Conclusion

This is already long, so I will wrap it up here. We view passing networks as an integral part of data-based football analysis. Provided you understand their limitations, they can provide a huge productivity boost to opposition and own team analysis. We also think the addition of our xGChain metric adds a layer of value to a visualization that previously only contained counting stats.

If you work in football and want to see what else the StatsBomb IQ platform has to offer, please get in touch.

–Ted Knutson

ted@statsbomb.com

@mixedknuts