There’s so much more data available for football matches now than there ever has been. StatsBomb and Metrica Sports have some fantastic initiatives making tracking and event data freely available. But there’s also lots of other information more widely available on the web at the likes of https://www.transfermarkt.co.uk/ and, more recently, fbref.com. These won’t be as detailed match to match, but they offer a wider overview of what’s happening on and off the pitch.
Accessing information from the web can be time-consuming unless it is automated, which is made very easy by some powerful Python packages and widely available tutorials, like FCPython. There are limits to what you can and should access automatically, since we don’t want to overload websites with requests. More information on scraping can be found here.
There are many public functions for scraping different sites, but my advice would be to try to write your own so that you actually understand what you’re getting. I’ll still add my own take on a web scraping function that works for player/squad tables on fbref, so here’s a link to the GitHub page with the functions I used and some examples: https://github.com/ciaran-grant/fbref_data
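To give a feel for what "making your own" involves, here is a minimal sketch that turns an HTML table into a list of player records using only the standard library. It is not the code from the repository above: `TableParser`, `parse_table` and `SAMPLE_HTML` are hypothetical names, the sample table and its numbers are made up, and real pages are messier than this. It deliberately parses an inline string rather than making any requests.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, row by row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample standing in for a downloaded player table.
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Min</th><th>KP</th></tr>
  <tr><td>Player A</td><td>1800</td><td>60</td></tr>
  <tr><td>Player B</td><td>900</td><td>20</td></tr>
</table>
"""

players = parse_table(SAMPLE_HTML)
```

Every value comes back as a string, so converting columns to numbers (and checking which columns you actually got) is still up to you, which is rather the point of writing it yourself.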
There are lots of types of passes available on fbref, and I really appreciate having all this data available to explore. Here I am taking a look at progressive passes, with a view to seeing who’s leading the way this year and whether there are any styles we can infer.
Among the progressive passing metrics available, the ones that I’ve focused on are below:
- Total Progressive Distance
- Number of Progressive Passes
- Passes into the Final Third
- Passes into the Penalty Area
- Key Passes
- Expected Assists
These have been chosen to try to get a cross between both quantity and quality of ball progression.
As each player has played a different number of minutes, I have used per 90 minutes to compare players. It may be more suitable to use number of minutes in possession for offensive passing such as this, and also number of minutes out of possession for defensive measures. But per 90 minutes goes most of the way there.
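As a quick illustration of the per 90 adjustment, here is a minimal sketch. The function name `per_90`, the field names and the numbers are all made up for the example, not taken from fbref.

```python
def per_90(total, minutes):
    """Scale a season total to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total * 90 / minutes


# Hypothetical season totals for two players.
players = [
    {"player": "Player A", "minutes": 1800, "key_passes": 60},
    {"player": "Player B", "minutes": 900, "key_passes": 20},
]

for p in players:
    p["key_passes_p90"] = per_90(p["key_passes"], p["minutes"])
    print(p["player"], p["key_passes_p90"])
```

Player A has three times as many key passes in total, but twice the minutes, so the per 90 gap is smaller than the raw totals suggest. In practice you would probably also want a minimum-minutes cut-off, since a player with very few minutes can post an extreme rate.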
Each metric has a significantly different range of values. For example, Total Progressive Distance per 90 minutes will be in the 100s or 1000s, whilst xA per 90 minutes will usually be between 0 and 0.5. An extra unit of progressive distance is far less impressive than an extra 1 xA. To compare across statistics, I’ve normalised each one relative to its best performer, which forces every comparison to be relative to a player’s peers and their productivity per 90 minutes.
This can make it hard to compare between different groups, but as long as you’re aware of what everything is relative to, that problem should be minimised.
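One simple way to read "relative to the best performer" is to divide each player's per 90 value by the highest value in the group, so the leader scores 1.0 and everyone else is a fraction of that. The sketch below assumes that interpretation; the function name, metric names and numbers are hypothetical.

```python
def normalise_to_best(values):
    """Divide every value by the group's maximum, so the best performer is 1.0."""
    best = max(values)
    if best <= 0:
        raise ValueError("the best value must be positive to normalise against")
    return [v / best for v in values]


# Hypothetical per 90 values for the same three players on two metrics.
per90 = {
    "prog_distance_p90": [450.0, 300.0, 150.0],
    "xa_p90": [0.10, 0.40, 0.20],
}

normalised = {metric: normalise_to_best(vals) for metric, vals in per90.items()}

for metric, vals in normalised.items():
    print(metric, [round(v, 2) for v in vals])
```

After this step both metrics sit on the same 0 to 1 scale, which is what lets them share one chart. It also shows why the group matters: change who is in the list and the maximum changes, so every normalised value changes with it.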
On to the fun stuff!
Both are among the top across almost all statistics in both ball progression and actually creating quality chances.
There seem to be two styles that most players fit into; the above two are aliens, so they don’t count. There are chance creators:
Angel Di Maria leads everyone in passes into the penalty area and xA, with lots of key passes too. These types of players seem to be great at using the ball in and around the box, converting possession into chances.
There are also deep progressors:
This is among all players, regardless of position, and David Alaba appears as the best passer into the final third, whilst also posting high volume and distance in progressive passes. These are deeper players who can move the ball from the halfway line into the final third for your chance creators to thrive.
I identified these styles using intuition alone. I think the next steps will be to test my theory by applying some clustering or PCA to these players to try to identify more styles.
Since everyone is secretly wondering, here is a selection of some of the best U23 performers I have found. In no particular order we have Christopher Nkunku, Martin Odegaard, Jadon Sancho and Trent Alexander-Arnold. These are normalised relative to other U23 players:
For perspective on just how good these guys are, here they are relative to everyone. They are some of the best players in the world already, which is pretty scary.