2020 in Review... but Mostly 2021 in Preview

2020, annus horribilis. In the context of the world, this past year has been simply awful in nearly every conceivable way. In the context of StatsBomb, it’s mostly just been weird. What could have been a lost year ended up making our little data company stronger, but not in ways you’d notice unless we told you. Today I’m going to walk through this very strange year from a StatsBomb perspective and also clue you in on what is coming in 2021.

Wooo, 2020 is great!

On February 25th, we passed our sales goal for the first quarter… And then we didn’t sign another contract for 2.5 months. For a SAAS company with revenue growth baked into the cash flow, that's a horrifying outcome. Two months of firing on all cylinders and adding new customers everywhere, and then nearly a full quarter of existential dread at a company level (and god knows what at a personal level) due to the COVID-19 pandemic. Internally, we shifted to work from home as soon as we got back from Sloan in the second week of March. We also furloughed some of our staff later in the spring, while nearly all of StatsBomb North took salary deferrals until we could find our feet financially. We cut most of the marketing expenditure (largely content publication), tried to figure out what to collect when there were no football matches being played, and began discussing with investors how we could raise enough money to survive however long *waves hands vaguely* this was going to last.

Courses

2020 was supposed to be a big year for us at conferences and teaching courses in English and Spanish. It still ended up that way, but only with regard to courses, and only because we rapidly converted both existing courses (Intro to Analytics + Coaching and Analysing Set Pieces) and new stuff (Modern Scouting and Data Recruitment) into online versions.

The Introductory course gets rave reviews and has been taken by people all over the world at this point. It is not mathy or technical, and is appropriate for anyone who wants to learn a bit more about how the sport actually works. This includes coaches and professional footballers. Thanks to Pablo Rodriguez translating the entire thing, it is also now available en español. Set Pieces (built by Euan Dewar and myself) had 100 people sign up for a live course taught by me in April, 70 of whom already worked in professional football including half of the Premier League. I re-opened it for a month late in the summer due to high demand, but it won’t go live again until after Euros 2021 at the earliest, and maybe not at all. I believe that people who paid to learn set pieces from us deserve time to build their edges. Though teams haven’t made football weird yet, it’s finally trending that way.

The Set Pieces course is great, but the Recruitment course James Yorke put together might be our best work. It touches on nearly everything we have learned from six years of doing this professionally, and delivers new ideas and frameworks appropriate to both seasoned professionals and absolute beginners in the football recruitment space.

Ever since I founded it as a blog, StatsBomb has always had a mission to teach the world more about football. The courses are the best vehicle we currently have to do this, and we remain dedicated to this mission. We’ll look to translate Recruitment to Spanish and potentially all the courses into new languages in 2021.

May-June 2020

Back to the global pandemic... unlike so many other businesses this year, we were lucky. The return of the German Bundesliga in mid-May followed by nearly all the other leagues in June signalled a return to (somewhat) normal from a sales perspective. Cash flow stabilised and then customer acquisition accelerated. So much so that in late summer, it was all we could do to keep up with signing contracts and onboarding new customers. This was a very good problem to have, especially relative to what looked possible in the March-April timeframe. Football data use was already accelerating across the industry, but not having scouts able to travel and be physically present in stadia added fuel to the fire. It's a trend we only expect to continue in the coming years.

Development

From a tech perspective, this year was always going to be challenging. A lot of the work required to move the business from one that started collecting 9 competitions a year in 2018 to one that will collect 80+ in 21-22 was boring, technical work that had no direct visibility for customers. Cleaning up early tech debt consumed a ton of developer hours, as did adding a framework for basic match metadata that was completely owned and controlled by StatsBomb. New project work took up the rest of the time, and when you combine all of this with the difficulty of transitioning the team entirely to remote work at the developer AND DATA COLLECTOR levels, you end up with some tradeoffs. One of the obvious ones was fewer StatsBomb IQ updates than in recent years, which was mostly noticeable both to customers and consumers as less cool stuff appearing in their social media feeds and on the platform.

The good news is that we’re through the other side of a lot of that work and IQ now has more dedicated developers than ever before. CTO Thom Lawrence and most of the senior dev team have spent Q4 interviewing rafts of job applicants, and we’ve been hiring like mad men and women. It has been hugely encouraging to see the volume of talented people interested in working at StatsBomb.

Scaling from a small team of startup founders and early employees into a mid-sized dev team is often fraught with difficulty, but we’ve had two massive hiring rounds the last two years, and I continue to be surprised and impressed with the quality of people that join our company. I’ve said publicly before that hiring was always one of the things I worried about, but I worry a lot less than I used to, and credit for that is distributed across our entire company. We are a great place to work, and that is only true because of the people that work for us.

Growth

What does growth look like for StatsBomb during the pandemic year?

MRR stands for "monthly recurring revenue," and it’s a better measure of growth for a company like StatsBomb, whose business model revolves around selling software and data as a service. The red line represents actual customer contracts, while the two other lines were projections done in January and then post-COVID in May. As you can see, we took a hit in the spring/summer like everyone else, but the recovery since that time has been dramatic. Despite a global pandemic hitting revenue for our entire customer base, we’ve managed to triple recurring revenue for a third year in a row. As mentioned above, for various reasons we released almost nothing new in 2020, which means our sales strength last year lies mostly in the data launched in 2018 (we have done some minor upgrades) and StatsBomb IQ, which has been iterated since our inception as a company in 2017. It’s worth digging into this a bit further, if only to cut through the noise from other companies and people on social media.

We created our own data because we felt there was both a need and a desire for better quality data in football. It was necessary to produce better information and predictions around recruitment, opposition analysis, and to more closely match with coach expectations for how they look at the game.

With zero fuzzy nonsense about how artificial intelligence allegedly delivers better insights about the game, StatsBomb Data delivers better information across the full data set and at the individual chance level than any competitor in the space. Obviously I run the company, so you can expect me to say things like this, but detailed outside analysis agrees. Almost regardless of your modelling approach, better data yields better results. If anyone tries to tell you otherwise, they are wrong.

Better information is important in recruitment. It’s important in fantasy sports. It’s important in team analysis. And it’s important in gambling. It's now clear in the football world that if you care at all about the quality of information you produce in these areas, you need to be on our data. And as you can see from our growth numbers, the market has responded to this.

However… football is a huge sport - we’re just scratching the surface for how big we can grow. From a customer perspective, we’re growing quickly almost everywhere now. We have paying customers at the top of the Champions League right down to English League Two. We have customers in every country in the big 5 and dotted around smaller leagues as well. We also work with the world’s number 1 ranked men’s national team in Belgium, and federation interest is increasing rapidly. But again, given all the countries in the world that play football, we are just getting started.

Note: I want to say a brief thank you to our investors as a whole, and especially Matthew Lubman and Cristian Cibrario. Luckily, they did not need to jump in and save us back in the spring when the entire world stopped playing football, but they were willing to do so, and that by itself deserves a lot of credit.

New Stuff in 2020

So I mentioned a lot of effort in 2020 went toward new projects, but projects that no one has seen thus far - it’s probably useful to fill in some holes on what they are.

  1. Significant 2020 effort went into building the infrastructure and collection software around Live data collection. This spring we will begin to deliver live data streams for a small set of leagues that will gradually expand over time. We had hoped to do that this past autumn, but the impact the pandemic had on hiring and training new collectors made it impossible to staff in a timely manner. Later in the year, we will deliver a Live IQ toolset designed to work with this new data and unlock insights and visualisation for media and gambling customers. We have already incorporated our best-in-class offline expected goals model into Live, and have one more significant upgrade we will deliver in the Spring that will make this product truly compelling alongside vis work designed for the media landscape.
  2. So much Data Science. We started partly as a data science company, and then turned into a data company, which due to resource conflicts, meant that delivery of new insights based on data science suffered for a while. In December, we hired our fourth data scientist, and they all have so much lost time to make up for. Part of their focus will be taking a lot of the models developed behind the scenes and getting them into the IQ product and API, but most of the focus of this team will be breaking ground on new research. Expect to hear more about some of the recent projects in March.
  3. American Football. We’ve quietly been working on a second sport in 2020 and expect to deliver that product to the other football world in summer 2021. I don’t want to give away much detail until we are ready to put this new data in the hands of customers, but I will say that I’m really happy with what we’ve ended up with from a data perspective. And I should be, since I’ve spent about 3-4 man months of my own work time from 2020 focused on this product.
  4. Finally, you get the big one. Which is mostly REDACTED until the worldwide product launch in March 2021. Here is what I can tell you:

  • It is an entirely new data product for football/soccer.
  • It's called StatsBomb 360.
  • To my knowledge, no other company currently delivers a product like this.
  • In fact, I'm not sure the industry even realised this was a product. But it is now.
  • Current StatsBomb customers will hear a lot more about 360 Data starting in mid-January.
  • We will deliver around 3000 matches of data for this new product in the 20-21 season. That will then scale to 38 competitions' worth for 21-22. (For comparison, base SB Data will have 80+ for that season. Given we didn’t cover that many leagues until our third full season of data collection, it is a huge scale up to get this product running at that capacity.)
  • With zero hype, I believe this will fundamentally change how people analyse the game. I said the same at our initial data launch in 2018, and this new data is at least as significant as that, if not more so.

So that’s three new products in market in 2021, on the back of our third year in a row of massive MRR growth. We want to become the best sports data company. 2021 will go a long way toward realising that goal.

Conclusion

2020 has been hard. Despite what sometimes felt like the unbearable weight of the world around us, I am incredibly proud of what our team managed to accomplish this year. Even if the progress wasn’t obvious to the outside world, what we’ve done in 2020 has us delivering over 600 matches a week to current customers, and has set us up for a big year in 2021. Hopefully we will celebrate everyone’s success and a return to normalcy at the StatsBomb Conference in October 2021. Or online at the new product launch in March 2021? Actually yeah, let's do that... check our social media for the launch day and time for StatsBomb 360. You are all officially invited.

Until then, all the best,

--Ted Knutson

CEO, StatsBomb

Ted@StatsBomb.com

PostScript: StatsBomb Internal Hackathon vis or illegal soccer rave?

Hackathon prototype or new concept around air hockey?

Projecting Mexico's 2022 World Cup Squad

What will Mexico’s squad look like at the 2022 World Cup in Qatar? The qualifying process is yet to get underway in the CONCACAF region, but we can probably say with some degree of confidence that Mexico will make it. So we sat down with ESPN columnist and expert on all things Mexican football Tom Marshall to discuss the players likely to be on the plane in two years’ time.

Nick Dorrington (ND), Content Manager at StatsBomb: Hi Tom. Thanks for giving up your time to join us. Before we get to the players, how have Mexico generally lined up under Gerardo Martino in terms of formation and play style?

Tom Marshall (TM): Hello Nick. Thanks for the invite! So far, it’s largely been a 4-3-3 with one holding midfielder, although Martino experimented with a back three against Algeria in October and used more of a 4-2-3-1 in the second half against Japan in November. In terms of playing style, Mexico have employed the traditional hallmarks of a Martino team: usually a high press, trying to play the game in the opposition’s half, dominating possession while trying to be as vertical as possible, full-backs pushing high up, playing out from the back and trying to win the ball back as quickly as possible on the attack-defense transitions.

ND: So between the sticks, Guillermo Ochoa seems the obvious pick as first choice. Who else do you think will join him in the squad? Hugo González and Jonathan Orozco were the other two goalkeepers at the Gold Cup last year, so are they the most likely candidates?

TM: Ochoa is definitely the first choice right now. He has said he believes that González will be his eventual replacement and that’s probably the projection as of now. Alfredo Talavera has been in form this season for Pumas and should be in the conversation, although he is now 38 years old.

ND: As you say, Talavera has had an excellent season, outperforming expectation in relation to our Post-Shot xG model.

But he’ll be 40 by the time the World Cup comes around, and in 2019-20 his numbers were less impressive.

TM: As for Jonathan Orozco, he’s arguably the most suited to Martino’s style, but had a really poor season at Club Tijuana.

ND: What are the skills that Martino’s style requires of a goalkeeper?

TM: Martino is looking for a goalkeeper who is adept at playing out from the back to start attacks. It feels like right now he doesn’t have a perfect fit. Ochoa is a very good shot-stopper and high-level performer in big games, but not the most adventurous with the ball at his feet.

ND: Orozco does look to have had a poor season. By our numbers, only Gil Alcalá had a worse tournament among Liga MX goalkeepers in terms of shot stopping. But he really does stand out with the ball at his feet, particularly in terms of his calmness under pressure. Orozco was actually the only goalkeeper during the Torneo Guard1anes 2020 to improve on his overall passing completion rate when pressured. Ochoa went longer when pressed; Orozco went shorter.

(Yellow = successful pass; Red = incomplete pass)

But even with a skillset that seems well-suited to Martino’s approach, you don’t necessarily think that Orozco will make the squad?

TM: It’s a tough one. I don’t think it is guaranteed based on Orozco not featuring for Mexico in 2020. Of the younger generation, I particularly like Luis Malagón (Necaxa) due to his comfort playing under pressure, while Carlos Acevedo (Santos Laguna) will likely get a shot and has been impressive this past season.

ND: Let’s move on to the defence. At full-back, Luis Rodríguez and Jesús Gallardo were the starters at last year’s Gold Cup. But Rodríguez will be going on for 32 by the time the World Cup comes around. Do you think his starting berth could come under threat? Who is most likely to edge him aside?

TM: Rodríguez is a good quality Liga MX player, but his starting role is by no means guaranteed for the national team. This is one of the more concerning positions for Martino. Jorge Sánchez is physical, has the stamina to get up and down as Martino likes, but tends to switch off too often defensively. Alan Mozo at Pumas has shown some potential, although I’m personally a bigger fan of Pachuca’s Kevin Álvarez, even if he hasn’t played too many Liga MX games. The other interesting option is Julian Araujo at LA Galaxy. He’s done well in MLS, but the caveat is that he hasn’t yet been persuaded to choose Mexico over the United States.

ND: Álvarez has also had some minutes at left-back, which could be a useful attribute. He seems to offer a good mix of strong defensive output and decent ball progression, particularly off the dribble.

Do you think Martino will take four dedicated full-backs or will one of the backups be a player, like Álvarez, capable of filling different roles?

TM: There are definitely a lot of options if he wants to cut a full-back. I like Érick Aguirre’s chances of making the squad for this exact reason. He can play in either full-back position, a number of roles in central midfield and even on either wing. Sánchez can and has played on the right and the left. And there’s the option of Jesús Corona as a wing-back as he’s played at times for Porto.

ND: In the centre of defence, Carlos Salcedo and Néstor Araujo look dead certs to make the squad. Who else do you think Martino will take?

TM: I’d assume Monterrey’s César Montes will be a certainty by then. He looks like he’ll be challenging for the start in the lead-up to Qatar 2022.

ND: Montes profiles like a solid, stand-off central defender with a decent passing range. Is that a fair assessment?

TM: I think that’s fair. It’s his physical attributes that also stand out. There aren’t too many Mexican center-backs with his size and strength. European clubs are certainly watching and the hope is that he’ll be playing in one of the top leagues by the World Cup.

ND: Who are the other options at centre-back? Will Héctor Moreno still be in the picture?

TM: Edson Álvarez is an option, while the two younger centre-backs that have stood out are Johan Vásquez and Gilberto Sepúlveda.

As for Moreno, he’s got a good chance of being in the squad because he is experienced and gets what Martino is looking for, especially in possession. The question is how much his level will drop between now and the World Cup given he’s an aging defender playing in Qatar.

ND: Moving forward, and presuming the 4-3-3 shape persists, who looks the likely starter at the base of the midfield? Am I right in thinking this has been somewhat of a problem position for Martino?

TM: Álvarez without any doubt. The truth is there aren’t many options. In an ideal world, Martino would have someone who could distribute the ball and be technically better than Álvarez, but I wouldn’t say it is a problem position necessarily. Álvarez is a leader, protects the defense well and already has World Cup experience. The issue for me is if he isn’t available.

ND: Is his relative lack of playing time at Ajax a concern? He’s been decent when he’s got on the pitch, but hasn’t done so a great deal since moving there from América in 2019.

TM: Yes, Álvarez has a battle on his hands at present. He’s perhaps too defence-minded for a side that dominates games in their domestic league. That said, I’d be confident that if he left Ajax, he’d be able to resurrect his career in a different European league. I’d love to see him in the Premier League and think he’d be successful at a mid-table team.

ND: Who are the other options for that defensive midfield position?

TM: Perhaps Luis Romo is next in line, but he’s arguably better further forward. Hector Herrera may be dropped back into a role he’s played before, although he doesn’t necessarily have the defensive capabilities. It's worth throwing out the name of 20-year-old Erik Lira, although he’s only really broken into the first team at Pumas this season.

ND: There seems to be a good crop of players in their early-to-mid twenties competing for the two interior positions ahead of the holding midfielder. We know that Mexico aren’t afraid of including older players in their World Cup squads -- Rafael Márquez in 2018; Cuauhtémoc Blanco in 2010 -- but will there still be a place for Andrés Guardado at 36? It could easily just have been a system thing, but there was a pretty clear drop off in his output at Real Betis from 2018-19 to last season.

TM: It’s going to take something big for Guardado not to be in the squad. He’s the leader of the team, the captain and commands respect. The experience he’s got at World Cups and playing in Europe should mean he is at least useful off the bench. Martino will be hoping that one of Carlos Rodríguez, Sebastián Córdova, Orbelín Pineda, Víctor Guzmán, Marcel Ruiz, Eugenio Pizzuto or Érick Gutiérrez will step up and earn that starting spot. As mentioned in the question, there are plenty of potential options and no shortage of talent, but those names are either still developing or have been too inconsistent to be able to confidently predict they’ll be in the XI at the World Cup.

ND: Is there a certain profile of player you think Martino will be looking for to fill those interior positions?

TM: The interior midfielders are prominent in Mexico’s press and Martino also likes technically gifted players there, often ones which would be number 10s on other teams.

ND: If we look at some of those names you mentioned, Pineda and Rodríguez look like talented players but both provide almost nothing defensively. Ruiz is a more conservative player, with solid defensive output and a safer passing profile. Córdova and Guzmán look to have a better balance between defensive and offensive output, particularly Guzmán.

He also played on a Pachuca side who were one of the most active pressing teams in opposition territory during the Torneo Guard1anes 2020. Gutiérrez will also be playing on an aggressive pressing side under Roger Schmidt at PSV Eindhoven as and when he returns from injury. Now we’ll move on to the forward line, where there would seem to be a clear first-choice front three?

TM: Definitely. There is a lot of excitement in Mexico about Raúl Jiménez (fingers crossed everything goes well with his recovery), Jesús Corona and Hirving Lozano.

ND: They seem to have a pretty good mix of attributes. Lozano is a high-volume shooter, Corona creates more shots than he takes, while Jiménez combines solid shot volume with a good all-round game.

How has that worked out when they’ve played together for the national team?

TM: Not well! The first time Mexico lost 4-0 to Argentina and on the second occasion they were down 1-0 to South Korea when Lozano was taken off. Mexico went on to win 3-2. Martino criticized the trio for being static in that game against Argentina, indicating that they were too predictable. There was more fluidity though against South Korea and it feels more a case of them needing to play more regularly together to get used to each other's movements than the right mix not being present.

ND: Who do you think will fill the other two or three forward positions in the squad?

TM: Martino is a huge fan of the way Rodolfo Pizarro finds space and uses the ball between the lines. In fact, it’s not out of the realm of possibility that Martino could field Corona as a wing-back to allow Pizarro a starting spot. If Mexico use a 4-2-3-1, all four could also play.

ND: Pizarro is certainly someone who receives the ball all across the attacking midfield line.

He doesn’t really stand out as a creative passer, but he can knit things together and make inroads off the dribble.

TM: Let’s not forget Diego Lainez. He may not be starting much for Real Betis, but he’s still highly regarded as a potential difference-maker.

ND: Is there an obvious backup to Jiménez as the central reference point?

TM: José Juan Macías should develop into that role by Qatar 2022. He’s got the ambition and professionalism to be an important player for Mexico, although he’s a different type of center forward than Jiménez.

ND: It will be interesting to see if he can reach that level. He looked very good in a small sample size when he returned to Chivas from his loan at León in January, but he didn’t have a great Torneo Guard1anes 2020. In fact, the young forward who really stood out was Cruz Azul’s Santiago Giménez. While he is more penalty area centric, Giménez also has a pretty similar statistical profile to Jiménez.

If he can add some other elements to his game, is he in with a shot of making the squad?

TM: 100 percent. He’s certainly part of a group of players who could be in the mix, including Rogelio Funes Mori, who is in the process of becoming a Mexican citizen, Javier Hernández, Alan Pulido and Henry Martín.

ND: The forward positions are often the ones in which younger players can suddenly emerge in the year or two leading up to the World Cup. Are there any guys not yet in contention who might make a leap forward in the intervening couple of years?

TM: Perhaps Marcelo Flores, who has impressed for Arsenal’s Under-18s. LA Galaxy’s Efraín Álvarez certainly has the talent to explode between now and then.

ND: One last question: with the squad that we’ve talked through here, can Mexico finally progress beyond the round of 16 at Qatar 2022?

TM: Yes! In all seriousness, Mexico will have a solid squad and I think will go into the tournament as one of the teams to watch in terms of being aggressive and entertaining. Martino’s team won’t take a step back, even against the top teams. El Tri will need a little bit of luck and going into Qatar 2022 as a seeded team, which isn’t impossible with the new rankings system, would be a significant bonus.

ND: That’s great. Thanks again for providing us with your time and insight Tom.

TM: No problem at all. Thanks for having me.

Introducing Similar Team Search

Over the past few weeks we’ve highlighted the 'Similar Player Search' function in StatsBomb IQ and presented some use cases for the tool. Today, we’re going to look at a related feature we’ve recently added to our industry-leading analysis platform: 'Similar Team Search'.

The functionality is almost identical to that of the player search. A team and radar type (Attacking, Defending or Custom) is selected and the algorithm then produces a list of teams with similar statistical profiles, ranked on a scale of 0-100, with 100 being an exact match. A selection of filters can then be applied to tailor the results as necessary.
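For readers who want a feel for how a ranking like this could be produced, here is a minimal sketch in Python. It is not the algorithm used in StatsBomb IQ, which isn't described here. It assumes one plausible approach: standardise each radar metric across all teams, measure the Euclidean distance between profiles, and map that distance onto a 0-100 scale where 100 means identical. The team names, metric names and numbers are all invented for illustration.

```python
import math


def zscores(values):
    """Standardise a list of numbers to mean 0 and standard deviation 1."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance) or 1.0  # avoid dividing by zero for constant metrics
    return [(v - mean) / std for v in values]


def similar_teams(profiles, target, top_n=5):
    """Rank every other team by similarity to `target` on a 0-100 scale.

    `profiles` maps team name -> {metric name: value}. The scoring rule
    (100 / (1 + distance)) is an assumption chosen so that an identical
    profile scores exactly 100 and scores fall as profiles diverge.
    """
    teams = list(profiles)
    metrics = list(profiles[target])

    # Standardise each metric across teams so no single metric dominates.
    scaled = {team: [] for team in teams}
    for metric in metrics:
        column = zscores([profiles[team][metric] for team in teams])
        for team, z in zip(teams, column):
            scaled[team].append(z)

    def score(team):
        distance = math.dist(scaled[target], scaled[team])
        return round(100.0 / (1.0 + distance), 1)

    ranked = sorted(
        ((team, score(team)) for team in teams if team != target),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:top_n]


# Invented example data: a "custom radar" with three hypothetical metrics.
profiles = {
    "Team A": {"possession": 0.62, "directness": 0.30, "high_press": 0.55},
    "Team B": {"possession": 0.60, "directness": 0.32, "high_press": 0.52},
    "Team C": {"possession": 0.45, "directness": 0.60, "high_press": 0.58},
    "Team D": {"possession": 0.40, "directness": 0.65, "high_press": 0.30},
}

for team, similarity in similar_teams(profiles, "Team A"):
    print(team, similarity)
```

In a sketch like this, the choice of radar type amounts to choosing which metrics go into each profile, and filters would simply restrict the set of teams before ranking.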

This is a powerful tool that takes in data from the wide range of global competitions we collect (80+ from 2021), and one that has a number of potential use cases for clubs, including:

  • Head Coach Recruitment: which teams employ a similar style of play to ours? Who are their head coaches?
  • Tactical Inspiration: team x press in a similar way to us but seem to be more effective at suppressing shots. What are they doing differently?
  • Player Recruitment: team x has a comparable approach to ours. Do they have any players who could be of interest to us?

Let’s look at a few examples using custom radars that seek to primarily identify stylistic rather than performance-related similarities.

We will start off with an attacking radar that attempts to capture team ball progression and chance creation style. Which teams come out as being similar to Manchester City, by some distance the Premier League's top scorers last season?

That looks like a promising list of high possession, territorially dominant teams. Slightly further down you’ll find sides like PSV Eindhoven, Crvena Zvezda and Flamengo -- the latter of whom were coached by Pep Guardiola’s former assistant Domènec Torrent prior to his dismissal last month.

Now let’s turn our attention to the defensive side of things, more concretely to the manner in which teams press and counter-press. Our template in this case is Marcelo Bielsa’s Leeds side in their promotion campaign from the Championship. They were a team who pressed aggressively all over the pitch and made it difficult for opponents to advance into dangerous areas.

So which teams produce defensive outputs similar to those of that Leeds team?

The two closest matches are Liga MX teams Monterrey and Atlas. Two sides from Argentina and Mirandés of the Spanish second division fill out the top five. Could this be a good starting point for an eventual search for Bielsa's replacement?

Finally, let’s look at a global radar that seeks to capture where and how teams defend and how and at what pace they transition from there to attack.

Getafe seem like a good example here, given that they employ an aggressive high press but combine that not with a high percentage of possession, as is often the case with pressing teams, but with a low possession share and quick and direct attacks. If José Bordalás were to leave tomorrow, who would be capable of replicating the approach that has helped lead them to three consecutive top-eight finishes since returning to La Liga in 2017?

Well... Diego Dabove of Argentinos Juniors would seem to be a pretty good candidate. Not only are his side the closest statistical match to Getafe, but there would also be no language barrier to overcome.

Other close matches include Slovácko of the Czech First League and Valérien Ismaël’s Barnsley.

That was a quick run through of the new 'Similar Team Search' function in StatsBomb IQ. With a wide variety of metrics to choose from, it is a highly customisable tool, and we are certain our customers will utilise its functionality in ways we haven't yet conceived.


If you are a club, media or gambling entity and want to know more about what StatsBomb can do for you, please get in touch.

We are currently offering an extended 14 day free trial of our analysis platform StatsBomb IQ to potential clients.