Messi Data Release Part 2, 2008/09 - 2011/12

Last week we unleashed the first part of our Messi dataset, covering the big little man's early days from 2004/05 - 2007/08. That first release offered a unique look into the burgeoning years of our intrepid protagonist. His formative days have never really been given this sort of treatment before, so the value was obvious and people seemed to have fun with it.

We published our own analysis on the site, featuring all sorts of data nuggets and cameos from famous faces. The whole point, though, is to get the data out into the wider world so you can have a go with it. It's early days for this project but already folks are starting to get their feet wet. This project isn't just intended for experienced analysts; we want it to be a gateway for everyone into the world of playing with data (which is why we produced R Primers in English and Spanish). Here's an assorted selection of what folks on Twitter have shared so far, with apologies to anyone who feels left out:

Now though, we bring in part two. Given Messi's inconsistent playing time during those initial years, the real meat of his career was yet to come. With this new dataset you'll really get the chance to sink your teeth into some beefy stuff.

Our second release covers 2008/09 - 2011/12. That'll be the entirety of Pep Guardiola's time as Barcelona manager and, my word, what heady days they were. Trophies galore, heaps of goals and football essentially turned on its head (for better or worse, depending on your perspective) via the medium of a million passes. This, of course, is also when Messi himself went truly supernova. At 21 years old he had moved from 'best young talent in the world' to 'this is absurd, how is he even doing this, Jesucristo'.

But don't take my word for it. All of that data is now yours to play with at your own leisure (it actually quietly went up last night! Hello to any eagle-eyed folks - Ethan - who spotted that). If you've yet to use our data then head over to our resource centre, sign the user agreement and jump in. If you're already set up then you don't need to do anything; the new seasons are all there. Enjoy finding your own ways to demonstrate Messi's ridiculousness.

Again, if you're a bit tentative with getting started then have a gander at our guides for using the data in R. Hopefully they'll offer a gentle nudge in the right direction.

GUIDE IN ENGLISH HERE -> Using StatsBomb Data In R - English

GUIDE IN SPANISH HERE -> Using StatsBomb Data In R - Spanish

However you approach it you should have a blast digging into the Pep-era data. Next week this tour takes us to the 2012/13 - 2015/16 seasons, the era of 'Messidependencia', the dawn of a new preposterous attacking trio and all sorts of records being broken.

Until then, be well and have great days.

Introducing the Lionel Messi Data Biography

Scene: StatsBomb Strategy Meeting, Autumn 2018

“Should we release a new men’s data set soon? People seem to really enjoy the World Cup.”

“Probably? We’re definitely going to release the Women’s World Cup next summer, and we’ll put it out daily.”

“Oo, I like that.”

“Okay, but back to the men’s side… what could we release that would matter?”

“We could do a season of Premier League data. That would certainly get eyeballs.”

“Nah, Manchester City already did that in 12-13.”

“Really? Man, where did that data go? No one even knows that.”

“It’s not a terrible idea, but maybe we can do better.”

“What about a season of La Liga? I feel like that market has been underserved and deserves some love.”

“Not bad. Maybe you do 17-18 so you get both Ronaldo and Messi in it.”

“That seems fine, but I’m still not excited.”

“I have this idea for some older matches. The Manchester United treble turns 20 this spring. It would be really interesting to collect some of that run and do the analysis of those games in a modern light.”

“Oo, I love that. Let’s do it.”

“Yeah, but it’s still not enough data for a public release. People want something they can sink their teeth into.”

“What if we release the last two seasons of Cristiano Ronaldo and Lionel Messi and really compare from a data perspective? They are clearly the two best players of all time.”

“Ronaldo, yuck.”

“One or two seasons of Lionel Messi isn’t cool.”

“You know what would be cool? ALL of Lionel Messi.”


“Oh shit... that’s never been done. Messi started in like 2005 - I don’t think data companies produced x/y data that early.”

“Can we even get the video on that?!?”

“Let’s find out!”

And that is how the Messi Data Biography began. Getting the video was an enormous pain in my ass. None of the usual video platforms have video anywhere near that far back. We then talked to friends, clubs, and former media rights holders for months trying to track down all these matches. I pulled every string I could think of and we were still only able to get to about 90% completion, hitting a hard wall with the last 10%. Right around the point where we were going to buy DVDs off eBay in the hope that we could fill in as much as possible, Pablo Rodriguez found a super-fan video archivist, and this source filled in all the missing matches. You all owe Pablo many, many drinks for his service. I’ve basically been floating on a cloud ever since.

So what is the Messi Data Biography? Quite simply, it is a data archive of every match Lionel Messi has played in La Liga since his career began in 2004-05.

We collected all of this data with our own time, energy, and crossed eyeballs (the old video is really poor quality) over the last few months as a kind of passion project. At this point, every single member of the StatsBomb and Arqam team has contributed, and I can only thank them for all of the hard work getting us to this point.

The MDB is built on the top-tier StatsBomb Data spec for the 18-19 season, so despite the fact that these matches occurred as far back as 04-05, the data is the same incredibly rich event data our Champions League customers use right now.

It was expensive. It was painful. It is… fucking brilliant.

Messi's first senior goal? We've got that. Messi's entire Pep career? That too. The body count from all of the opponents Messi nutmegged in his career? Also in the data!

And we will be releasing all of this data TO THE PUBLIC over the next four weeks.

To recap: Free data. For the entire La Liga career. Of the greatest footballer ever.

The schedule from July 15-August 9 (a.k.a. 'Messi Month') looks like this:

Monday - Analysis from each set of seasons will be published on our site and by our media partners.

Tuesday - That same data will be released to the public for non-commercial use. The first Tuesday, we will also publish our own R primer written by Euan Dewar to help people who are new to the data and R get started.

Wednesday and every other day after - You get to analyse, visualise, and simply play with the data yourselves.

This is our gift to football. We hope you enjoy.

--Ted Knutson
CEO, StatsBomb

P.S. I know there will be soooooo many questions people have. I may put out an FAQ next week to answer the bulk of the important ones. (Like will you release all of Messi's CL data, etc etc etc.) For now, just enjoy your weekend!

P.P.S. I said I wouldn't leak until July 15th. I didn't. This isn't a leak. This is an ANNOUNCEMENT. It's like, a totally different thing.

NOTE: If you wish to use any data from the Messi Data Biography for commercial purposes, please send an email to