Ever since StatsBomb formed as a data company in 2018, we’ve been committed to supporting the development of analysts in this community, who were previously held back by a scarcity of publicly available football event data to analyse and experiment with. Well, it’s that time again. More free data.
The uptake of the free data has remained strong ever since the initiative’s launch and has helped many analysts develop and demonstrate their skills to the point where they’re now being hired into clubs at an alarming rate. It cannot be stressed enough that clubs are actively looking for real-world evidence of your skills, and are willing to hire you if you’re good enough. There are now countless examples of this pathway being trodden: young analysts getting their hands on data, experimenting and doing cool shit with it, before professional football comes along and hoovers them up. And it’s happening faster now than ever before. If that’s a career path that interests you, then you’re in the right place.
We’ve made a lot of data available in the last three years and have been delighted by the creativity shown with it, and by the clear development of analysts who have taken it upon themselves to learn to code with this resource. But we know you can only drill the same datasets so many times before seeing the same player names and results come up gets a bit stale, which can chip away at motivation. You need fresh data.
This release, the entire 2020/21 FA Women’s Super League season, is the latest to come out. We’ve now made over 3 million events available across 1000+ matches through the various datasets we’ve released, which, if you needed reminding, include:
- FA Women’s Super League (18/19, 19/20, 20/21)
- The Lionel Messi Data Biography (04/05 – 19/20)
- Arsenal Invincibles Season (03/04)
- UEFA Men’s Champions League Finals (various, 99/00 – 18/19)
- FIFA Men’s 2018 World Cup
- FIFA Women’s 2019 World Cup
- NWSL 2018
To reiterate, this is the same quality and depth of event data that is used by hundreds of professional football clubs worldwide every day. Learning to use, manipulate, and understand the data will make you that much more employable to these clubs.
That’s the first bit of the news. The second should also be an exciting one for aspiring data analysts.
We know that learning to use data, particularly embarking on learning a programming language, is a daunting prospect at the very beginning. To ease the steep learning curve, in 2019 we released our beginner’s guide to the R programming language: “Accessing And Working With StatsBomb Data In R”, penned by one of the finest data chefs around, Euan Dewar. The guide received overwhelmingly positive feedback and is a genuine leg-up into getting started with this stuff.
But the original guide is over two years old now, so it was high time we gave it a new lick of paint. This new version includes ALL the extra R content Euan has released since, so it’s now all in one place.
In the guide, you’ll learn the basics of manipulating data and then generating basic data vis such as this…
Before advancing onto more sophisticated stuff, such as this…
That’s 50+ slides of example code, advice, and guidance on understanding and getting started in this world. It’s a beast of a document and took a lot of time to put together, so please remember to thank Euan for his efforts and generosity.
Lastly, it’s important to note that if you intend to publish your work on social media, which we greatly encourage you to do if you want your skills to be noticed, then please do remember to abide by our user agreement and credit StatsBomb as your data source when doing so. Our User Guide and Logos can be found in our media pack.
To access the data, go through our GitHub here
To access the R Guide 2.0, click here
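If you’d like a feel for what working with event data involves before opening the guide, here’s a minimal sketch. It’s written in Python rather than R purely so it runs without any extra packages, and it is not taken from Euan’s guide. The sample events are made up, and the field names (`type.name`, `player.name`, `shot.statsbomb_xg`) are our assumption about how the event JSON is laid out, so check them against the files you actually download.

```python
import json
from collections import defaultdict

# Made-up sample events. The field layout (type.name, player.name,
# shot.statsbomb_xg) is an ASSUMPTION about the event JSON format;
# the players and values are invented for illustration.
SAMPLE_EVENTS_JSON = """
[
  {"type": {"name": "Pass"}, "player": {"name": "Player A"}},
  {"type": {"name": "Shot"}, "player": {"name": "Player A"},
   "shot": {"statsbomb_xg": 0.10}},
  {"type": {"name": "Shot"}, "player": {"name": "Player B"},
   "shot": {"statsbomb_xg": 0.25}},
  {"type": {"name": "Shot"}, "player": {"name": "Player A"},
   "shot": {"statsbomb_xg": 0.30}},
  {"type": {"name": "Half End"}}
]
"""


def shots_and_xg_by_player(events):
    """Count shots and sum expected goals (xG) for each player."""
    totals = defaultdict(lambda: {"shots": 0, "xg": 0.0})
    for event in events:
        # Skip everything that is not a shot (passes, half ends, ...).
        if event.get("type", {}).get("name") != "Shot":
            continue
        player = event.get("player", {}).get("name", "Unknown")
        totals[player]["shots"] += 1
        totals[player]["xg"] += event.get("shot", {}).get("statsbomb_xg", 0.0)
    return dict(totals)


events = json.loads(SAMPLE_EVENTS_JSON)
summary = shots_and_xg_by_player(events)

for player, row in sorted(summary.items()):
    print(f"{player}: {row['shots']} shots, {row['xg']:.2f} xG")
```

To try it on real data, you would swap the sample string for the contents of one match’s event file from the repository.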
If you’re an aspiring analyst, these resources are here to assist you. You could be the latest analyst to get hoovered up into professional football.
Best of luck,
The StatsBomb Team