In League Two this season, something strange happened... Two of the automatically promoted teams and two sides that reached the playoffs were StatsBomb customers.
We're into September, which means football is once again in full swing. But that also means the StatsBomb Conference is only about five weeks away, and it is going to be fantastic. Today I am going to fill in the details around the event, and also explain just a few reasons why many people think this is the best football analytics conference of the year.
The event details
- Date - Friday 8th October 2021
- Location - Stamford Bridge, London
- Time - 9.00 a.m. to 6.00 p.m.
This event will be in person at Stamford Bridge on the 8th of October. As we did in 2019, much of the conference will either be broadcast publicly, or the talks will be posted on YouTube in the weeks after the event. However, not all talks will be available either live or online, as some of our speakers only have permission to give their talks to the people present in the room at the time.
Basically, if you want to make sure you see all the talks, you’ll need to show up. And we know from last time that the food should be great, so you’ve got that going for you too.
What to expect from the StatsBomb Conference 2021?
- Your host - Ted Knutson
- Key talks from StatsBomb experts - Nicole Kozlova, Ukrainian National Team player and StatsBomb Data Science Intern + one talk from our Data Science Team
- Industry speakers (see list below)
- Networking opportunities - around 600 attendees from across the football ecosystem
- Research paper competition
A stellar line-up of industry speakers
- Dr Ian Graham - Head of Research (Liverpool FC)
- Daryl Morey - President of Basketball Ops (Philadelphia 76ers)
- Harry Moyal - Deputy General Manager (Olympique Lyonnais)
- Mladen Sormaz - Head of Football Analytics (Leicester City)
I continue to be amazed at the quality of people who agree to speak at our event. Ian, Harry, and Mladen are all giving their own talks, while Daryl will be in more of a fireside chat format. I’ll ask him about his beginnings in basketball analytics, his interest in uh… “soccer” and his famed 'launch and squish' preferred style of play.
Not yet on the speaker list is REDACTED, Head of REDACTED at a Champions League club, who will be announced very soon.
Directors of Football Panel
- Victor Orta - Director of Football (Leeds United FC)
- James Cryne - Director and Co-Owner (Barnsley FC)
- Will Kuntz - SVP of Soccer Operations & Assistant GM (LAFC)
We’re also excited to have three key executives from some of the most forward-thinking clubs in world football, who will discuss their experiences in driving an analytics-based culture within their organisations and how the industry is changing.
Research Paper Competition
Following an influx of highly impressive submissions for the research paper competition, we are pleased to announce the following speakers and the titles of the papers that they'll be presenting on the Research Stage:
- Devin Pleuler - Tempo: another expectation model?
- Maaike Van Roy, et al - Optimally disrupting opponent build-ups
- Javier M. Buldú - The quest for the right pass: quantifying player’s decision making
- Hadi Sotudeh - Introduction of potential counter-attack
- Soumyajit Bose and Manas Saraswat - Turning with the ball and decision-making under pressure
- Samer Fatayri, Kirill Serykh, Egor Gumin - How to save efficiently: what drives the goalkeepers’ decisions?
- Max Odenheimer - Splitting GSAA: finding the best shot-stopper for your team
- Juan Camilo Campos - Determining the phases of the play using graph convolutional networks
What measures are we putting in place?
The event will feel safe with reduced touchpoints and lots of ventilation. We are really excited that this is an in-person event and a chance to actually see people again.
- All attendees must be double-jabbed or provide proof of a negative test taken at most 48 hours before the event
- We will be providing StatsBomb branded masks upon entry - though the use of these inside the venue is up to the individual's discretion
- Detailed information around our Covid policy will be sent to attendees prior to the event*
Basically, the conference will be brilliant and - unlike most conferences in the football analytics space - ours is open to the public. We hope to see you there!
Tickets available here: The StatsBomb Conference Tickets, Fri 8 Oct 2021 at 09:00 | Eventbrite
*Stamford Bridge has their own Covid policy which we will be adhering to - the key points have been summarised above. If you'd like to read the full set of guidelines that we'll be following before, during, and after the event, you can read them here: Stamford Bridge Covid Policy
2020, annus horribilis. In the context of the world, this past year has been simply awful in nearly every conceivable way. In the context of StatsBomb, it’s mostly just been weird. What could have been a lost year ended up making our little data company stronger, but not in ways you’d notice unless we told you. Today I’m going to walk through this very strange year from a StatsBomb perspective and also clue you in on what is coming in 2021.
Wooo, 2020 is great!
On February 25th, we passed our sales goal for the first quarter… And then we didn’t sign another contract for 2.5 months. For a SAAS company with revenue growth baked into the cash flow, that's a horrifying outcome. Two months of firing on all cylinders and adding new customers everywhere, and then nearly a full quarter of existential dread at a company level (and god knows what at a personal level) due to the COVID-19 pandemic. Internally, we shifted to work from home as soon as we got back from Sloan in the second week of March. We also furloughed some of our staff later in the spring, while nearly all of StatsBomb North took salary deferrals until we could find our feet financially. We cut most of the marketing expenditure (largely content publication), tried to figure out what to collect when there were no football matches being played, and began discussing with investors how we could raise enough money to survive however long *waves hands vaguely* this was going to last.
2020 was supposed to be a big year for us at conferences and teaching courses in English and Spanish. It still ended up that way, but only with regard to courses, and only because we rapidly converted both existing courses (Intro to Analytics + Coaching and Analysing Set Pieces) and new stuff (Modern Scouting and Data Recruitment) into online versions.
The Introductory course gets rave reviews and has been taken by people all over the world at this point. It is not mathy or technical, and it is appropriate for anyone who wants to learn a bit more about how the sport actually works. This includes coaches and professional footballers. Thanks to Pablo Rodriguez translating the entire thing, it is also now available en español.
Set Pieces (built by Euan Dewar and myself) had 100 people sign up for a live course taught by me in April, 70 of whom already worked in professional football, including half of the Premier League. I re-opened it for a month late in the summer due to high demand, but it won’t go live again until after Euros 2021 at the earliest, and maybe not at all. I believe that people who paid to learn set pieces from us deserve time to build their edges. Though teams haven’t made football weird yet, it’s finally trending that way.
The Set Pieces course is great, but the Recruitment course James Yorke put together might be our best work. It touches on nearly everything we have learned from six years of doing this professionally, and delivers new ideas and frameworks appropriate to both seasoned professionals and absolute beginners in the football recruitment space. Ever since I founded it as a blog, StatsBomb has always had a mission to teach the world more about football. The courses are the best vehicle we currently have to do this, and we remain dedicated to this mission. We’ll look to translate Recruitment to Spanish and potentially all the courses into new languages in 2021.
Back to the global pandemic... unlike so many other businesses in this year, we were lucky. The return of the German Bundesliga in mid-May followed by nearly all the other leagues in June signalled a return to (somewhat) normal from a sales perspective. Cash flow stabilised and then customer acquisition accelerated? So much so that in late summer, it was all we could do to keep up with signing contracts and onboarding new customers. This was a very good problem to have, especially relative to what looked possible in the March-April timeframe. Football data use was already accelerating across the industry, but not having scouts able to travel and be physically present in stadia added fuel to the fire. It's a trend we only expect to continue in the coming years.
From a tech perspective, this year was always going to be challenging. A lot of the work required to move the business from one that started collecting 9 competitions a year in 2018 to one that will collect 80+ in 21-22 was boring, technical work that had no direct visibility for customers. Cleaning up early tech debt consumed a ton of developer hours, as did adding a framework for basic match metadata that was completely owned and controlled by StatsBomb. New project work took up the rest of the time, and when you combine all of this with the difficulty of transitioning the team entirely to remote work at the developer AND DATA COLLECTOR levels, you end up with some tradeoffs. One of the obvious ones was fewer StatsBomb IQ updates than in recent years, which was mostly noticeable both to customers and consumers as less cool stuff appearing in their social media feeds and on the platform.
The good news is that we’re through the other side of a lot of that work and IQ now has more dedicated developers than ever before. CTO Thom Lawrence and most of the senior dev team have spent Q4 interviewing rafts of job applicants, and we’ve been hiring like mad men and women.
It has been hugely encouraging to see the volume of talented people interested in working at StatsBomb. Scaling from a small team of startup founders and early employees into a mid-sized dev team is often fraught with difficulty, but we’ve had two massive hiring rounds the last two years, and I continue to be surprised and impressed with the quality of people that join our company. I’ve said publicly before that hiring was always one of the things I worried about, but I worry a lot less than I used to, and credit for that is distributed across our entire company. We are a great place to work, and that is only true because of the people that work for us.
What does growth look like for StatsBomb during the pandemic year?
MRR stands for "monthly recurring revenue," and it’s a better measure of growth for a company like StatsBomb, whose business model revolves around selling software and data as a service. The red line represents actual customer contracts, while the two other lines were projections done in January and then post-COVID in May. As you can see, we took a hit in the spring/summer like everyone else, but the recovery since that time has been dramatic. Despite a global pandemic hitting revenue for our entire customer base, we’ve managed to triple recurring revenue for a third year in a row. As mentioned above, for various reasons we released almost nothing new in 2020, which means our sales strength last year lies mostly in the data launched in 2018 (we have done some minor upgrades) and StatsBomb IQ, which has been iterated since our inception as a company in 2017. It’s worth digging into this a bit further, if only to cut through the noise from other companies and people on social media.
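For a sense of what tripling three years running compounds to, here is a minimal back-of-the-envelope sketch. It assumes "triple" means exactly 3x each year, and it uses an arbitrary starting index of 1.0 rather than any actual StatsBomb revenue figure. The function name `compound_growth` is made up for illustration.

```python
def compound_growth(start: float, multiplier: float, years: int) -> list[float]:
    """Return the value at the end of each year, growing `start` by `multiplier` annually."""
    values = []
    current = start
    for _ in range(years):
        current *= multiplier
        values.append(current)
    return values


# Hypothetical index: MRR = 1.0 before the first year of tripling (not a real figure).
mrr_index = compound_growth(start=1.0, multiplier=3.0, years=3)
print(mrr_index)  # [3.0, 9.0, 27.0]
```

Under those assumptions, three consecutive triplings leave recurring revenue at roughly 27 times where it started.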
We created our own data because we felt there was both a need and a desire for better quality data in football. It was necessary to produce better information and predictions around recruitment, opposition analysis, and to more closely match with coach expectations for how they look at the game.
With zero fuzzy nonsense about how artificial intelligence allegedly delivers better insights about the game, StatsBomb Data delivers better information across the full data set and at the individual chance level than every other competitor in the space. Obviously I run the company, so you can expect me to say things like this, but detailed outside analysis agrees. Almost regardless of your modelling approach, better data yields better results. If anyone tries to tell you otherwise, they are wrong. Better information is important in recruitment. It’s important in fantasy sports. It’s important in team analysis. And it’s important in gambling. It's now clear in the football world that if you care at all about the quality of information you produce in these areas, you need to be on our data. And as you can see from our growth numbers, the market has responded to this.
However… football is a huge sport - we’re just scratching the surface for how big we can grow. From a customer perspective, we’re growing quickly almost everywhere now. We have paying customers at the top of the Champions League right down to English League Two. We have customers in every country in the big 5 and dotted around smaller leagues as well. We also work with the world’s number 1 ranked men’s national team in Belgium, and federation interest is increasing rapidly. But again, given all the countries in the world that play football, we are just getting started.
Note: I want to say a brief thank you to our investors as a whole, and especially Matthew Lubman and Cristian Cibrario. Luckily, they did not need to jump in and save us back in the spring when the entire world stopped playing football, but they were willing to do so, and that by itself deserves a lot of credit.
New Stuff in 2020
So I mentioned a lot of effort in 2020 went toward new projects, but projects that no one has seen thus far - it’s probably useful to fill in some holes on what they are.
- Significant 2020 effort went into building the infrastructure and collection software around Live data collection. This spring we will begin to deliver live data streams for a small set of leagues that will gradually expand over time. We had hoped to do that this past autumn, but the impact the pandemic had on hiring and training new collectors made it impossible to staff in a timely manner. Later in the year, we will deliver a Live IQ toolset designed to work with this new data and unlock insights and visualisation for media and gambling customers. We have already incorporated our best-in-class offline expected goals model into Live, and have one more significant upgrade we will deliver in the Spring that will make this product truly compelling alongside vis work designed for the media landscape.
- So much Data Science. We started partly as a data science company, and then turned into a data company, which due to resource conflicts, meant that delivery of new insights based on data science suffered for a while. In December, we hired our fourth data scientist, and they all have so much lost time to make up for. Part of their focus will be taking a lot of the models developed behind the scenes and getting them into the IQ product and API, but most of the focus of this team will be breaking ground on new research. Expect to hear more about some of the recent projects in March.
- American Football. We’ve quietly been working on a second sport in 2020 and expect to deliver that product to the other football world in summer 2021. I don’t want to give away much detail until we are ready to put this new data in the hands of customers, but I will say that I’m really happy with what we’ve ended up with from a data perspective. And I should be, since I’ve spent about 3-4 man months of my own work time from 2020 focused on this product.
- Finally, you get the big one. Which is mostly REDACTED until the worldwide product launch in March 2021. Here is what I can tell you:
- It is an entirely new data product for football/soccer.
- It's called StatsBomb 360.
- To my knowledge, no other company currently delivers a product like this.
- In fact, I'm not sure the industry even realised this was a product. But it is now.
- Current StatsBomb customers will hear a lot more about 360 Data starting in mid-January.
- We will deliver around 3000 matches of data for this new product in the 20-21 season. That will then scale to 38 competitions worth for 21-22. (For comparison, base SB Data will have 80+ for that season. Given we didn’t cover that many leagues until our third full season of data collection, it is a huge scale up to get this product running at that capacity.)
- With zero hype, I believe this will fundamentally change how people analyse the game. I said the same at our initial data launch in 2018, and this new data is at least as significant as that, if not more so.
So that’s three new products in market in 2021, on the back of our third year in a row of massive MRR growth. We want to become the best sports data company. 2021 will go a long way toward realising that goal.
2020 has been hard. Despite what sometimes felt like the unbearable weight of the world around us, I am incredibly proud of what our team managed to accomplish this year. Even if the progress wasn’t obvious to the outside world, what we’ve done in 2020 has us delivering over 600 matches a week to current customers, and has set us up for a big year in 2021. Hopefully we will celebrate everyone’s success and a return to normalcy at the StatsBomb Conference in October 2021. Or online at the new product launch in March 2021? Actually yeah, let's do that... check our social media for the launch day and time for StatsBomb 360. You are all officially invited.
Until then, all the best,
PostScript: StatsBomb Internal Hackathon vis or illegal soccer rave?
Hackathon prototype or new concept around air hockey?
A manager change. A global pandemic. A 10th place finish in expected goal difference. Another FA Cup trophy at the end of it. And then the release of seemingly the entire scouting department plus the firing of Head of Football Raul Sanllehi to boot. What to even make of the 19-20 season from Arsenal?
Recruitment last summer was "fine," if overpriced. Wide forward had long been a need, and Nicolas Pepe filled it adequately, even though Arsenal probably overpaid for him by £25M. Kieran Tierney took ages to get healthy and settled, but once he started playing, he quickly became a fan favourite. 18-year-old William Saliba was purchased and then immediately sent back to Saint-Etienne to continue his education. And David Luiz was brought over from Chelsea as a rich man's Shkodran Mustafi, in the hope that Unai Emery would never be forced to play both of them together on match day.
Combine those with the younger signings from a year before in Torreira, Bernd Leno, and Matteo Guendouzi, and it felt like Arsenal fans might have a reason to be optimistic.
My worry with the hire of Unai Emery was that Arsenal would sacrifice the funky attacking patterns that were a hallmark of Arsene Wenger's era in exchange for stabilising the defence. The reality was that Emery never stabilised the defence while the attack did indeed become a hell of a lot less fun.
Along with frustrations over recruitment toward the end of the Wenger era, including a complete lack of inbound young attacking talent, one of the points I had hoped Arsenal would address with a new manager was to become more aggressive in the high press. The low block and occasional pressing style under Steve Bould (from the late Wenger years) was just effective enough for fourth, but no more than that. My hope was that Arsenal would dial up the pressure and move up the table as well.
Unfortunately under Emery, Arsenal not only forgot how to dominate the ball, they also forgot how to dominate... anything.
The lack of an elite defence (see Manchester United) combined with relative dross on the attacking end, meant Unai Emery was shown the door after a string of poor results in November. Speaking to one club insider this past summer, they stated that the writing was already on the wall for Emery after season 1, which made it even more strange that Sanllehi wanted to extend his contract. They also felt the summertime splash in the transfer market made little sense with a lame duck manager, which isn't wholly unfair.
Reading between the lines, the state of Arsenal's upper management in recent years can be described as "bumpy" at a minimum, and ranges to "complete flaming chaos with a side of potential corruption" for those who were both present and particularly descriptive of the situation.
Enter Mikel Arteta
Allegedly, Arteta was nearly hired to succeed Arsene Wenger before a change of heart saw Emery take the reins in summer 2018. A season and a half later, Arteta entered the club with his first chance at being a head coach. Before moving forward, however, it makes a little bit of sense to step back.
In spring of 2017, Arsenal did a very quiet set of interviews with top head coach candidates in preparation for Wenger leaving the club. On the potentials list were a variety of names, including Thomas Tuchel and Roger Schmidt. (I heard Marcelino was mooted as well, but I don't know anything beyond rumour there.) Tuchel would later replace Emery at PSG and take them to the CL Final this summer, while Schmidt finished out his time in China before taking a break from football, and was recently hired to coach similarly-abbreviated but not-remotely-the-same-stature club PSV. I note this because at one time it seemed fairly clear Arsenal were looking for tactical styles that included a regular high press.
In addition to playing under Arsene Wenger, Mikel Arteta spent three seasons coaching and learning from Pep Guardiola, not only one of the greatest coaches in modern football, but also an advocate of the... you guessed it... high press playing style.
The problem for coaches taking over mid-season is that it's really hard to find the training time to teach defensive structure changes. This is especially true when the team they take over are also involved in Europa League play and make deep runs in domestic cups - there just isn't enough time on the training pitch to completely revamp the principles. So it wasn't entirely surprising that Arteta's Arsenal team played similarly to Emeryball, with perhaps a bit better tactical understanding.
This is especially true when you consider that most of the post-pandemic run-in consisted of games where the results didn't matter, at least for Arsenal. Combine a potential lack of motivation in the league run-in (and a focus on the FA Cup) with four red cards in the 21 matches after Arteta took over, and you get a sense that the headline averages don't tell the whole story.
What was slightly surprising was that the team started to put up honest-to-god results against Big 6 teams for the first time in... well, seemingly forever. Arsenal beat Liverpool in the league, then Man City and Chelsea in the FA Cup before defeating Liverpool on penalties in the Community Shield. This was new. This was different. Even if the process wasn't exactly sterling - Arsenal were still losing the expected goals battles - the results were... good?
But are they sustainable? Ahhhh, there is the rub.
How do Arsenal get better in 2020-21?
A) Recruitment
Arsenal's squad have been an army of misfit toys for a while now. They have talent, but the elite players seem to play the same positions (Aubameyang and Lacazette), and there are almost no peak age players (24-27) in the squad. From a squad building perspective, it has been messy for years.
Possibly the biggest need in the squad this summer was retooling the centreback rotation. If Guardiola's style is anything to go by, Arteta will require ball-playing CBs with pace. William Saliba is extremely young, but the potential answer to part of that equation. The addition of Gabriel Magalhaes from Lille looks to be another potential answer. However, given the ages, my guess is that these two will rarely play together as a pairing this year.
Which leaves you with the rest of the centrebacks. Mustafi looked mostly competent post-pandemic right up until he reverted to his traditional Arsenal form of calamitous mistakes in the last few matches. If calamity is the baseline, the brief purple patch has to be considered an outlier. Relying on him is not a thing most coaches would do by choice. Sokratis has been deemed surplus to requirements, and given his age when he arrived from Dortmund was always a stopgap measure. Pablo Mari looks to be a man mountain, but injury in the opening minutes of the season's return meant we have basically nothing further to evaluate him on aside from being large and left-footed. It's a start, at least?
Rob Holding is likely to be a loanee this season, lacking both the pace and the passing to be a top tier CB, though he's a serviceable Premier League one and was bought for a pittance from Bolton.
Calum Chambers is... still under contract.
That leaves us with the variance of David Luiz. Luiz's tenure at Arsenal has been plagued largely by mistakes and red cards, which is why it came as a surprise that he signed a contract extension early in the summer and looks set for another season at the club. Fans were left muttering, "Kia taketh, but when doth he give?" Willian on a free was presumably not the hoped-for answer.
While Gabriel and Saliba are probably the future, the present looks somewhat uncertain.
Equally uncertain is the midfield composition. Convincing Dani Ceballos to return was vital - he was Arsenal's most important midfielder after the pandemic return, and offers versatility as well as top quality production. Xhaka has the passing but lacks the legs to be one of the elite players at his position, and then you hit... well, we don't know yet.
Torreira is likely to be sold, Guendouzi as well, and Arsenal wish Ozil would wriggle himself into a move anywhere but here. Which leaves the cupboard fairly threadbare when it comes to central midfielders. Outgoings could fund a high level incoming like Aouar (which would be a fairly impressive recruitment coup), but even adding a player of that calibre means quite a few minutes for Joe Willock to gain experience. Maitland-Niles looks like RB1 if Bellerin leaves to pursue trophies and fashion in Paris, which leaves Bukayo Saka as a pacey flex-8 and almost no one else.
If Arsenal do sign Aouar, they'll hope for a return of his 20-year-old season under Bruno Genesio.
My thoughts on Thomas Partey I will revisit in a postscript.
Signing Saka to a long-term extension was extremely important for both the Gunners and their fanbase. The academy production line has gone from almost nothing five years previous to churning out attacking talent that is only a small step below the best academies in the world. Saka broke through and regularly looked like one of the better players on the pitch playing in the Premier League at age 18. Losing that type of talent would have been heartbreaking, even if Arsenal don't quite know his best position yet.
Nketiah (21) isn't quite ready to lead the line, but he's probably only a season or two off and also needs lots of minutes to finish his development. Reiss Nelson (only 20) is more of a question mark, and this is especially true with the signing of Willian. My guess is Arteta felt he wasn't going to be good enough to be the second option wide right, which means Nelson probably goes into the Sales/Loans pile to generate enough funds to fill out the squad, but I could be wrong.
Before his injury, Gabriel Martinelli looked like a wonderkid, and it is hoped he'll continue to develop into an elite attacker even if he probably won't be back in action until the winter.
Arsenal's attacking youth are a bit too young and a bit raw, but are already good enough to merit either decent minutes in the league and cup competitions, or to generate enough sales revenue to fund additional transfers.
B) Defensive Style
"No playmaker in the world can be as good as good counter-pressing." - Jurgen Klopp
One of the ways Arsenal can get better as a team is by playing a better defensive style. High pressure is related to both fewer expected goals against and to generating additional expected goals in attack. Arsenal as a whole lack some creativity in attack, and I'm not sure they are going to be able to fix that in the transfer market this window.
However, in the absence of elite creative players, gegenpressing and learning to transition more and better can still help generate additional, almost free goals. The test here will be whether Arsenal have the personnel to do it, and whether Arteta can get them playing it well in time for the new season.
Even if they aren't elite at it to start (and neither Klopp's LFC nor Pochettino's Spurs were), committing to this style and improving at it over the course of the season is a useful goal unto itself. And potentially necessary to help generate more goals from a team that doesn't look like it will add creative firepower until potentially January at the earliest.
C) Set Pieces
These went from a mild success under Emery to a moderate disaster under Arteta both in attack and defence. Andreas Georgson has been brought in to help improve this phase of the game and I wish him all the best. Even if he makes it so I stop wincing every time Arsenal face a defensive corner, I will consider this a success.
D) Behaviour With a Lead
This one is subtle, but it was a massive problem throughout the course of last season. Arsenal had the fourth worst expected goals difference in the league when they had a lead.
All of the top 6 teams had a positive expected goal difference with the lead. The best teams look to extend leads, not protect them. Protecting leads is a recipe for 14 draws in 38 league matches and another likely midtable finish.
To put this another way - when playing with a one-goal lead, Arsenal became relegation candidates. They had 36% of the expected goals, 32% of the total shots, but still managed 47% of the goals. Luck? Talent? Bit of both probably, with a strong worry that it won't be repeatable should they try it again.
This is typically a tactical choice, and one that needs to change if Arsenal are going to challenge for anything useful this season.
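The game-state percentages above are simple shares: a team's total divided by the combined total for both sides while that score line held. A minimal sketch of the calculation follows. The raw totals are hypothetical, invented only so that the shares land on the 36% / 32% / 47% quoted above. They are not Arsenal's actual numbers, and the function and variable names are illustrative.

```python
def share(for_total: float, against_total: float) -> float:
    """Fraction of the combined total (team + opponent) that belongs to the team."""
    combined = for_total + against_total
    if combined == 0:
        raise ValueError("no events recorded in this game state")
    return for_total / combined


# Hypothetical (team, opponent) totals while leading by one goal.
# Invented to reproduce the article's percentages; not real match data.
leading_by_one = {
    "xg": (9.0, 16.0),
    "shots": (64, 136),
    "goals": (8, 9),
}

for metric, (team, opponent) in leading_by_one.items():
    print(f"{metric}: {share(team, opponent):.0%}")
```

When the goal share sits well above the xG share, as it does here, the finishing or goalkeeping has outrun the underlying chances, which is why repeatability is the worry.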
An optimist would look at the FA Cup trophy and numerous victories against the Top 6 as a positive sign. They will also take hope in the idea that Arteta should put in a similar defensive system to what Pep runs at City, even if they are cautious that Arsenal may not have the players to execute that style particularly well for at least another year.
A pessimist will look at the average numbers on both sides of the ball under Emery and Arteta, and conclude that Arsenal have not been Actually Good since 2016-17 and there are no obvious signs of getting better. They will also look at the mess at the top of the club and mixed bag of recruitment as further proof that Arsenal are not ready to pull out of their slide into mediocrity just yet.
Myself, I lean toward optimism. I think the results suggest a better process is on the way, and that's backed up by my eye test when I watch this team execute tactically now versus what it was like under Unai Emery.
A top 4 challenge is probably a little less than a coinflip, especially with the firepower Chelsea have brought in this summer, but if better tactical choices are made, a return to the top 6 is fairly likely.
I moved this down here because it requires some time and some nuance to explain and I didn't want to clutter up the preview with additional nonsense.
If you prefer the podcast version of this argument, you can find it here: https://soundcloud.com/statsbomb-pod/statsbomb-podcast-july-1-q-a
I said this back in June and I stand by it. I also followed up that tweet with additional reasoning.
First, "free" isn't free. In many cases free simply means taking 30-50% of the transfer fee a team would normally pay and adding it to the player's wage packet. So instead of Partey making £200K a week, he now makes 300 or 350K a week. Given his age - he would be starting his contract at 27 - this means Arsenal would be taking a HUGE risk if Partey didn't work out.
When it comes to squadbuilding, taking big risks on older players with giant wages and potentially low resale value is a terrible idea. Occasionally it works out - Aubameyang at Arsenal is one rare example and someone I would have been happy to buy at £50M but much less happy with at £70M - but mostly it doesn't and it means you are stuck with a player that has declining output and who may become effectively unmoveable.
(There are a million zillion examples of this, but the one that always strikes me is how often Manchester City did this until a couple of years ago. Fernandinho worked out alright, but a ton of those other guys did not (Navas, Negredo, Bony, Nolito), and the failure to keep younger fullbacks around basically cost City a title in Pep's first season at the club.)
Beyond the age, and the wages, and the squadbuilding principles, I just don't see good data-based reasons to be excited about Partey.
If you're splashing the cash, you really want scouting and data to both be excited about a player. That's not output I want to spunk £50M and big wages on. Or big wages and a 5-year contract.
He was amazing against Barcelona! Great, then why wasn't he amazing against all the other teams in La Liga last season?
Atletico are weird tactically, and his stats are not reflective of how he will play elsewhere. Again, fine... but we can ONLY judge his output from Atletico because that's the only place he's played since 2015. This increases the risk that the transfer goes wrong.
He's an elite defensive midfielder! His output looks nothing like an elite defensive midfielder.
He's NOT a defensive midfielder, he's a box-to-box midfielder who is great on the ball. Okay, but he also doesn't score goals or create goals for his teammates, so what are you actually paying for?
Transfer shopping is all about correctly evaluating and mitigating risk. If you are going to spend big, there needs to be very little risk something doesn't work out, and hopefully some upside involved as well. Liverpool don't make mistakes on transfers. That's why they are where they are now, despite finishing 8th as recently as 15-16. If you buy young guys and they don't work out, at least you can sell them on at a small loss and roll the dice again (Lucas Torreira). But if you buy/sign an older player on big wages and they don't work out, you are dead. You either have to eat a ton of their wages to move them on, or they stick around until the end of their contract with declining production.
My perspective on Arsenal and transfers has been the same for years now. I think they should not be building to try and finish 4th and make the Champions League.
I think they should be building a squad to try and finish first, even if that's one or two years down the line. And the way you do that is not via signing 27 and 28yos... you do it via taking some risk on 21-24 year olds, hoping your analysis is good, and letting them mature into elite players while improving the style of play.
We complete our data history of the European Cup with the all-Bundesliga final of 2013. After seeing off the Spanish giants in their semi-finals, Bayern Munich and Borussia Dortmund met at Wembley, each seeking to become the first German winner in over a decade. This is the sixth and final part of the series.
We’ve previously covered:
- 1960: Real Madrid 7 - 3 Eintracht Frankfurt
- 1972: Ajax 2 - 0 Inter Milan
- 1989: AC Milan 4 - 0 Steaua Bucharest
- 1995: Ajax 1 - 0 AC Milan
- 2009: Barcelona 2 - 0 Manchester United
Bayern were the favourites coming into the match, having run away with the Bundesliga and traversed a difficult route to the final that included a historic 7-0 aggregate thrashing of Barcelona in the final four; Dortmund had come ever so close to elimination against Málaga in the last eight before then seeing off Real Madrid to make it through to the final.
Bayern had been extremely unfortunate to lose out to Chelsea in the 2012 final and were seeking to make amends and send coach Jupp Heynckes off into retirement on a high with victory.
New Style, Vintage Results
Just as we seemed to have settled into a stylistic tussle between patient possession and deep block defending, along came the Germans to upset the apfelkarren. Suddenly, the attention of the footballing world shifted to the Bundesliga. Gegenpressing (later translated as counterpressing) firmly entered the football lexicon and there was much talk of the importance of transitional phases of play.
The meeting of two German sides at Wembley produced a high-paced encounter that was actually closer in style and output to the 1960 final than any of the others we’ve covered in this series. The shot count was nowhere near as high, but the 2013 final nevertheless sits second only to 1960 in terms of the expected goals (xG) total, although that was heavily tilted towards Bayern.
There was also some of the frantic, back-and-forth play of that early final on display. The average speed of attack was the fastest of all the finals we’ve covered, faster still than in 1960. Dortmund were especially swift to transition forward after gaining possession.
The average pace towards goal for teams in last year’s Champions League was 2.53 metres per second; Dortmund raced forward at a rate of 4.61 metres per second. Not that it led to a particularly dangerous set of shots. Jurgen Klopp’s team began on the front foot, getting off six efforts on goal before Bayern had even mustered one, and accumulated 12 over the course of the 90 minutes.
But even with Robert Lewandowski, scorer of all four goals in Dortmund’s 4-1 thrashing of Real Madrid in the first leg of their semi-final and impeccable in his use of his body to shield the ball and turn defenders, and an effervescent Marco Reus among their starters, they not only managed five fewer shots than Bayern, but the average quality of those shots was also far below that of their opponents. Despite a heavily aerial attack, there was very little fat on the Bayern shot map.
Remove Dortmund’s penalty from the equation and they created under one expected goal. It may have taken Bayern until the 89th minute to score their winner, when Arjen Robben skipped between two defenders and finished neatly, but it was clearly deserved. For Robben it meant success at last in a major final, after two failed attempts at the Champions League with Bayern and a World Cup final defeat with the Netherlands. Robben and Thomas Müller had been involved in much of their best play.
The pace with which the two teams attacked saw them regularly turn over possession. Even Bayern’s more patient buildup in deeper areas usually eventually resulted in a long ball forward from one of the two central defenders.
The final featured the lowest passing completion percentage of any since 1960, with just a 71% completion rate -- nearly four percentage points lower than the next lowest. Dortmund’s 65% rate was the first time since 1960 that a team had dipped below 70%.
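A completion rate like this is simply completed passes divided by attempted passes. Below is a rough Python sketch of how it could be computed per team from event records. It assumes each event is a dictionary shaped like StatsBomb's open event data, where a pass with no `outcome` entry counts as completed; that convention should be checked against the data documentation. The `pass_completion` function and the sample records are hypothetical, and the records are hand-made rather than taken from the final.

```python
from collections import defaultdict


def pass_completion(events):
    """Return {team: completion_rate} from StatsBomb-style event dicts."""
    attempted = defaultdict(int)
    completed = defaultdict(int)
    for event in events:
        if event.get("type", {}).get("name") != "Pass":
            continue  # only passes count towards the rate
        team = event["team"]["name"]
        attempted[team] += 1
        # Assumption: a pass with no "outcome" entry was completed.
        if "outcome" not in event.get("pass", {}):
            completed[team] += 1
    return {team: completed[team] / attempted[team] for team in attempted}


# Tiny hand-made sample, not real match data.
sample_events = [
    {"type": {"name": "Pass"}, "team": {"name": "A"}, "pass": {}},
    {"type": {"name": "Pass"}, "team": {"name": "A"},
     "pass": {"outcome": {"name": "Incomplete"}}},
    {"type": {"name": "Pass"}, "team": {"name": "B"}, "pass": {}},
    {"type": {"name": "Shot"}, "team": {"name": "B"},
     "shot": {"statsbomb_xg": 0.1}},
]

rates = pass_completion(sample_events)
for team, rate in rates.items():
    print(f"{team}: {rate:.0%}")
```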
Not only did it stand out in comparison to the other finals in this series but also within the context of contemporary finals. The completion rate was the lowest of all those contested in the 2010s.

And that’s the thing. For all that this was heralded as a new dawn in football, it didn’t start a revolution nor did it herald a new era of German dominance.
The national team won the following year’s World Cup, but did so with a more possession-dependent style of play. At club level, Spain came back strongly, with Barcelona and Real Madrid lifting the Champions League trophy in each of the subsequent five seasons -- four times in Madrid’s case.
Germany is yet to provide another finalist.
Such is the widespread availability of footage in the modern age that even before Bayern and Dortmund took to the pitch at Wembley, their ideas had already been acutely analysed and elements incorporated elsewhere.
They didn’t enjoy the same sort of extended advantage that a novel play style afforded Inter Milan in the 1960s or Ajax in the 1970s, for example. The totals for counterpressures and counterpressures in the respective attacking thirds in this match fell on or below the average points for those metrics during last season’s Champions League.
What was once unique quickly became commonplace.
Pep Guardiola’s arrival at Bayern in the summer of 2013 and some of his innovations, including narrowly positioned full-backs, also provided ready examples of how possession-based teams might seek to better protect themselves against rapid transitions. Add all that up and this final almost feels like a rapidly resolved glitch in the system.
Dangerous Bayern Corners
This Bayern side were a real force from set-pieces. Two of the goals in their semi-final rout of Barcelona had come from them, and they also created numerous chances from corners in the final. Seven shots from eight corners and pretty much an entire expected goal.
There wasn’t all that much sign of some of the more advanced routines we see these days, although a neat early free-kick scheme saw Thomas Müller drop off to receive a central pass and lay wide for a cross headed on goal by Mario Mandzukic.
The same player was unable to adjust his body sufficiently to successfully convert a near-post flick from a right-wing corner. But in Mandzukic, Müller and Javi Martínez, Bayern had three players very much capable of winning individual duels to get on the end of deliveries.
We hope you’ve enjoyed this series. Alongside our release of the Arsenal Invincibles data earlier this week, we also made our data from each of the last 20 Champions League finals freely available. If you fancy digging into some of the competition’s recent history, all the details for accessing the data can be found here.
And a complete primer (in English and Español) on how to work with the data via StatsBombR is here.
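For readers who would rather not use R, the event data can also be summarised with nothing more than the Python standard library. The sketch below totals shots and xG per team. It assumes a match's events are a JSON list in which every event names its team and each shot carries a `shot.statsbomb_xg` value. The `shot_summary` function is a hypothetical helper, and the sample records are invented stand-ins for a real events file.

```python
import json
from collections import defaultdict


def shot_summary(events):
    """Return {team: {"shots": count, "xg": total}} from event dicts."""
    summary = defaultdict(lambda: {"shots": 0, "xg": 0.0})
    for event in events:
        if event.get("type", {}).get("name") != "Shot":
            continue  # skip passes, carries and everything else
        team = event["team"]["name"]
        summary[team]["shots"] += 1
        summary[team]["xg"] += event.get("shot", {}).get("statsbomb_xg", 0.0)
    return dict(summary)


# Invented records standing in for one match's events file.
raw = json.dumps([
    {"type": {"name": "Shot"}, "team": {"name": "Home"},
     "shot": {"statsbomb_xg": 0.25}},
    {"type": {"name": "Pass"}, "team": {"name": "Home"}, "pass": {}},
    {"type": {"name": "Shot"}, "team": {"name": "Home"},
     "shot": {"statsbomb_xg": 0.10}},
    {"type": {"name": "Shot"}, "team": {"name": "Away"},
     "shot": {"statsbomb_xg": 0.05}},
])

events = json.loads(raw)
totals = shot_summary(events)
for team, stats in totals.items():
    print(f"{team}: {stats['shots']} shots, {stats['xg']:.2f} xG")
```

To run this on real data, swap the sample for the contents of a downloaded events file, for example via `json.load` on an open file handle.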
As those of you who follow me on social media are aware, earlier this year we started working on The Invincibles Project. The idea behind this was to collect all of the data from this historic season to be able to look at it through a modern lens. I had initially pitched this as a follow-up project after the Messi Data Biography as something different, and another way of unlocking football's history.
As an Arsenal fan, I found the whole thing exciting. Prime Thierry Henry! Doing things like this:
The majesty of Robert Pires. Taking bodies!
Dennis Bergkamp! Patrick Vieira! Jose Antonio Reyes! Kolo kolo Toure! Sol Campbell! Mad Jens!
OMG SO EXCITING.
Also as an Arsenal fan, I know that other Arsenal fans could use a little joy in their lives and this seemed like the only way we were getting anything fun out of the Gunners in 2019-20.
We started collecting this with an eye to releasing it side by side with the data set from a different red team, should they manage to finish their season undefeated. Sorry Liverpool fans, due to circumstances beyond our control, that data release slipped through our fingers. You'll have to settle for merely a league title and one of the largest title winning margins in history.
In order to collect data, we need to have video. It was fortunate for us that Lionel Messi has played his entire career for Barcelona, because that is one of the few teams in the world that has historic video available on the internet from pre-2010 without needing to jump through a million hoops. That doesn't mean that getting all of the video to reconstruct Messi's club career was easy - far from it. It was merely doable.
Arsenal? The only undefeated season in Premier League history? You would think this would be at least as simple as sourcing 15 seasons of Messi, right?
It was not.
We managed to get about half the 2003-04 season from the usual sources of football video history. And then we hit a wall. Our man in Spain and historic video expert Pablo Rodriguez then went to work, checking with various and sundry collectors that he knows who have large archives of historic, important football video. Through these wonderful people and the standard exchange of goods and services we were able to get to 32 matches of video. And then we hit another wall.
Why? Well, as Andrew Mangan of Arseblog reminded me, not all matches during that time period were broadcast on TV. In the modern day, every Premier League match is broadcast to air in multiple countries, which makes it easy to grab that video and store it away on a giant hard drive. Back then? A number of 3PM matches on Saturdays were simply never broadcast. (At least to our knowledge.) Which means that the collectors would not have that video unless they somehow tapped into different sources.
We checked with Arsenal. I've been lucky enough to meet people that work for the club over the years, and we figured maybe they would let us have access to the video to collaborate on the data release and some cool stuff with club media. And they totally would have...
Except they didn't have the video either.
Someone who worked for Prozone back in the day suggested that the opponents might have those videos, as they would have been delivered by courier as part of their service. But that ran into a variety of snags, including the fact that football clubs change personnel on this end with remarkable regularity, and having the archive, being able to access it, and even knowing who to talk to was insurmountable for us.
The other problem here is the transition from analog to digital. Pretty much all archives back then were tape archives that would later need to be digitised so the match would be preserved for history. Rob Bateman of Opta tells the tale of trying to collect old Premier League matches from the 90s and being surrounded by crumbling video tape from the league's first decade. These Arsenal matches came right at the tail end of that period, and my understanding is that the PL has started to archive its history as much as possible, but it's still very much a work in progress.
Finally you hit the problem of a license fee. We got in touch with the archive service with a willingness to pay a fee to obtain the final six matches needed to complete the project. We were quoted a figure to license the video for the entire Arsenal season that frankly didn't make any sense to me, and certainly eclipsed my budget for a public service project.
I wanted to get everyone a data gift to bring people some joy during the pandemic, but I didn't want to/could not pay the price of a car to make that happen.
The Premier League itself actually showed willingness to help us out, but as you can understand, they are rather busy with other priorities right now (like restarting the league during the middle of a viral pandemic) and suggested maybe we could revisit this when the world isn't quite so mad? Which totally makes sense.
But I have an anniversary data release deadline, and thus here we are.
Classics Data Pack 1
To make up for my own disappointment in not being able to complete this project, I added some extra matches I thought might interest people, including non-Arsenal fans. So what you are getting today as a gift from StatsBomb is a hefty little slice of football history, wrapped in the above-named package. In addition to delivering 32 of 38 matches from the Arsenal 2003-04 Premier League season, we are also giving you UEFA Champions League Finals data from 2000-2019. The collection on those CL matches isn't all finished, so they will trickle out to the repository gradually over the next week to complete the set.
Thank you to all of the fans out there who have supported StatsBomb over the years. Thank you to our customers who buy our products and give us feedback to make us better every day.
And thanks to Arsenal for a truly magnificent season and set of memories. It would be great if we could get some more of those sooner rather than later. Information on how to access the data is here.
The data comes with our standard non-commercial license that is usable for fan analysis and academic research. If you are a commercial entity that would like to use this data, get in touch with firstname.lastname@example.org and we can have a conversation.
All the best,
*If we get video and I still run StatsBomb, we will finish this project.