#28 – Quantifying Non-Shot Chances – The Last Man Analytics

Why do they matter?

Goals, shots and expected goals are only counted if shots are actually taken.

Problem

The problem is that in some chances a goal is likely but the player doesn’t shoot. They might choose to pass, or hesitate, for example.

How do we measure or quantify these Non-Shot chances?

Existing Solutions

Expected threat, proposed by Karun Singh (https://twitter.com/karun1710) and outlined here: https://karun.in/blog/expected-threat.html, is a great way of thinking about this. It attributes value to the passes and plays that come before goals and assists.

The secondary Ozil assist that penetrates the defence is the one that actually creates the goal scoring opportunity, even though Kolasinac is the one credited with the assist and Aubameyang actually scores the goal.

Expected threat can attribute a proportion of the goal contribution to Ozil’s pass as well as Kolasinac’s assist which is great.

Even if there isn’t a shot at the end of the play, expected threat assigns a value to every location on the pitch, so you can still see how valuable each position reached was.

Specifically, you can quantify the threat of each action by working out the difference in expected threat at the start and end of the action. If it increases, then you’re more likely to score in the next few actions than previously which is good.
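As a minimal sketch of that calculation, assume a made-up 3x4 grid of expected threat values. The numbers and the `action_threat` helper are hypothetical and are not taken from Karun Singh’s model.

```python
# Hypothetical expected threat surface, attacking left to right.
# Rows run across the pitch, columns run towards the opponent's goal.
# The values are invented for illustration only.
XT_GRID = [
    [0.01, 0.02, 0.05, 0.10],
    [0.01, 0.03, 0.08, 0.30],
    [0.01, 0.02, 0.05, 0.10],
]


def action_threat(xt_grid, start_cell, end_cell):
    """Threat added by an action: xT at the end cell minus xT at the start cell."""
    start_row, start_col = start_cell
    end_row, end_col = end_cell
    return xt_grid[end_row][end_col] - xt_grid[start_row][start_col]


# A pass from midfield into the central zone in front of goal adds threat.
through_ball = action_threat(XT_GRID, (1, 1), (1, 3))
# A pass back towards the team's own half removes threat.
back_pass = action_threat(XT_GRID, (1, 2), (1, 0))

print(f"through ball: {through_ball:+.2f}")
print(f"back pass:    {back_pass:+.2f}")
```

A positive value means the action moved the ball somewhere more dangerous, a negative value means it moved somewhere less dangerous.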

How this solves the problem

If a player has a great chance to shoot but doesn’t, there is no recording of this under traditional metrics.

Using expected threat, the quantified threat at that location could be attributed to that possession. So even though there was no shot, there is still some value attributed to getting in a good position to score.

Across all possessions in a match, each team will get into threatening positions and sometimes shoot, sometimes not.

If they shoot, great: we can count the shot and assign it an expected goals value. If they don’t, we can assign that possession the highest expected threat value achieved during it.

This way, all possessions are worth something and reflect the intuition that a goal could happen on any given possession but sometimes things just don’t fall exactly right.
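The rule can be sketched in a few lines. The function name, its arguments and the example numbers are hypothetical, and treating an empty possession as worth 0.0 is an assumption.

```python
def possession_value(threat_by_moment, shot_xg=None):
    """Value of one possession.

    threat_by_moment: threat values observed during the possession
    shot_xg: xG of the shot that ended the possession, or None if no shot
    """
    if shot_xg is not None:
        # The possession ended in a shot, so use the shot's xG.
        return shot_xg
    # No shot: use the most threatening moment (0.0 for an empty possession).
    return max(threat_by_moment, default=0.0)


# Invented threat values for a single possession.
with_shot = possession_value([0.02, 0.05, 0.12], shot_xg=0.09)
without_shot = possession_value([0.02, 0.05, 0.12])

print(with_shot, without_shot)
```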

A Simplistic Approach

I would like to use expected threat values, but I haven’t created my own expected threat model yet.

Instead, I have created a simple expected goals model and will use it as an approximation of the concept that expected threat develops more rigorously.

Expected threat is calculated iteratively. The first iteration values each position by what a player can do from there immediately, and each subsequent iteration extends that value to account for one more action.

An expected threat model with a single iteration can be considered an expected goals model, so expected goals approximates expected threat. It assumes that the only option from any area of the pitch is to shoot and values each location by the likelihood of scoring from there, rather than also considering moving or passing to other locations.
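To make the link concrete, here is a rough sketch of the iteration on a toy pitch with three zones, following the general form xT = (shoot probability x goal probability) + (move probability x expected xT of the destination). All of the probabilities, the transition matrix and the function name are invented for illustration.

```python
def expected_threat(shoot_prob, goal_prob, move_prob, transition, n_iterations):
    """Iterate the expected threat update on a small set of zones.

    shoot_prob[i]:    probability a player in zone i shoots
    goal_prob[i]:     probability a shot from zone i scores (an xG value)
    move_prob[i]:     probability a player in zone i moves the ball instead
    transition[i][j]: probability a move from zone i ends in zone j
    """
    n_zones = len(goal_prob)
    xt = [0.0] * n_zones  # before any iteration, no zone has value
    for _ in range(n_iterations):
        # The comprehension reads the previous xt and builds the next one.
        xt = [
            shoot_prob[i] * goal_prob[i]
            + move_prob[i] * sum(transition[i][j] * xt[j] for j in range(n_zones))
            for i in range(n_zones)
        ]
    return xt


# Invented inputs: zone 0 is far from goal, zone 2 is close to goal.
goal_prob = [0.01, 0.05, 0.30]
shoot_prob = [0.0, 0.1, 0.5]
move_prob = [1.0, 0.9, 0.5]
transition = [
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]

one_iteration = expected_threat(shoot_prob, goal_prob, move_prob, transition, 1)
five_iterations = expected_threat(shoot_prob, goal_prob, move_prob, transition, 5)

# The simplification used in this post: always shoot, never move.
shoot_only = expected_threat([1.0] * 3, goal_prob, [0.0] * 3, transition, 5)

print(one_iteration, five_iterations, shoot_only)
```

With the shoot probability fixed at 1 and the move probability at 0, the result is just the xG surface however many iterations are run, which is the approximation used in the rest of this post.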

What I’ve done:

1. Used StatsBomb event data with shot freeze frames to train a logistic regression model that predicts the probability of scoring from the distance to goal, the angle of goal available, the number of defenders blocking the goal and the distance to the closest defender.

https://github.com/ciaran-grant/Non-Shot-xG

Much of the work uses the Friends of Tracking (https://github.com/Friends-of-Tracking-Data-FoTD/LaurieOnTracking) tutorials as a base and builds from there.

import numpy as np

def calculate_xG(shot):
    ''' calculate_xG (shot)
    Calculates the Expected Goals based on a model trained using StatsBomb event and freeze frame data.
    Input is a row of a dataframe with columns for distance, angle, distance_nearest_defender, number_blocking_defenders
    '''
    # Intercept of the fitted logistic regression
    intercept = 1.0519
    # Linear predictor: intercept + sum of (coefficient * variable value)
    bsum = (intercept
            + 0.1080 * shot['distance']
            - 1.6109 * shot['angle']
            - 0.1242 * shot['distance_nearest_defender']
            + 0.3260 * shot['number_blocking_defenders'])
    # Probability of a goal is 1 / (1 + exp(bsum)); with this sign convention
    # a larger bsum means a lower chance of scoring
    xG = 1 / (1 + np.exp(bsum))
    return xG

The above model is by no means the best (or even close). But using StatsBomb’s freeze frame event data for shots lets you build a model that can use defender location information, which is great for this concept.
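For a sense of what the four inputs look like, here is an illustrative sketch that derives them from a shooter location and a list of defender locations. The coordinate system (metres, goal centred at the origin, pitch in the positive x direction), the definition of a blocking defender as one standing inside the triangle between the shooter and the two posts, and the name `shot_features` are all assumptions made for this example. They are not necessarily how the repository computes them, and the units may differ from those used to fit the coefficients above.

```python
import math

GOAL_WIDTH = 7.32  # metres between the posts
POST_LEFT = (0.0, GOAL_WIDTH / 2)
POST_RIGHT = (0.0, -GOAL_WIDTH / 2)


def _cross(a, b, p):
    """Z component of the cross product (b - a) x (p - a)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def _in_triangle(p, a, b, c):
    """True if p lies inside or on the edge of triangle a, b, c."""
    signs = [_cross(a, b, p), _cross(b, c, p), _cross(c, a, p)]
    return not (any(s < 0 for s in signs) and any(s > 0 for s in signs))


def shot_features(shooter, defenders):
    """Build the four model inputs from (x, y) positions in metres."""
    # Vectors from the shooter to each post.
    v1 = (POST_LEFT[0] - shooter[0], POST_LEFT[1] - shooter[1])
    v2 = (POST_RIGHT[0] - shooter[0], POST_RIGHT[1] - shooter[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return {
        "distance": math.hypot(shooter[0], shooter[1]),
        # Angle (radians) that the goal mouth subtends at the shooter.
        "angle": math.atan2(abs(cross), dot),
        "distance_nearest_defender": min(
            (math.dist(shooter, d) for d in defenders), default=math.inf
        ),
        "number_blocking_defenders": sum(
            _in_triangle(d, shooter, POST_LEFT, POST_RIGHT) for d in defenders
        ),
    }


# Invented example: a shot from the penalty spot with three defenders nearby.
features = shot_features((11.0, 0.0), [(5.0, 0.0), (11.0, 5.0), (8.0, 10.0)])
print(features)
```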

2. Used Metrica’s sample tracking data (match 2) to calculate the expected goals value for every frame of the game, creating the necessary features from the locations of the ball and the nearest players on the pitch for both the home and away teams.

3. It’s only physically possible for an attacking player to shoot if they have the ball, so I created a new non-shot xG for those frames where the ball is within a certain distance of an attacking player.

Note: this approximates an attacking player being in control of the ball, but it also picks up frames where the ball flies past a player from a cross, or where an attacker walks past a goalkeeper who is holding the ball.
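A sketch of that distance check for a single frame is below. The 1.0 metre `control_radius`, the function name and the choice to return 0.0 when no attacker is close enough are assumptions for illustration, since the threshold used is not stated here.

```python
import math


def non_shot_xg(ball_xy, attacker_xys, frame_xg, control_radius=1.0):
    """Keep the frame's xG only if an attacker is close enough to the ball.

    ball_xy:        (x, y) position of the ball
    attacker_xys:   (x, y) positions of the attacking team's players
    frame_xg:       xG computed for this frame from the ball location
    control_radius: assumed distance within which a player could shoot
    """
    in_control = any(
        math.dist(ball_xy, player_xy) <= control_radius
        for player_xy in attacker_xys
    )
    return frame_xg if in_control else 0.0


# Invented positions: one attacker is half a metre from the ball.
close = non_shot_xg((90.0, 34.0), [(90.5, 34.0), (70.0, 20.0)], frame_xg=0.25)
# Same frame xG, but the nearest attacker is several metres away.
far = non_shot_xg((90.0, 34.0), [(85.0, 30.0), (70.0, 20.0)], frame_xg=0.25)

print(close, far)
```

Because this is only a proximity test, it has the limitation described in the note above: a ball passing near a player counts the same as a ball under control.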

x-axis: Time (s), y-axis: Expected Goals.
Home team scores a goal. Expected goals (red) and non-shot expected goals (green) are tracked on the time series. When a shot is taken, the two are the same.

4. Metrica define possessions as per their documentation. For each possession, if it ended in a shot, then take the xG of that shot, otherwise take the highest xG_available to approximate the non-shot quality of the possession.

x-axis: Time (s), y-axis: Expected Goals.
Home team gets into the opponent’s penalty area, but misplaces the cross and turns over possession.

Non-shot expected goals are NOT designed to suggest that the above red player should have shot rather than trying to cross for a better opportunity to shoot. They are useful to track the quality of possessions where there is no alternative. The above possession doesn’t count in any traditional metrics because no shots were taken, but perhaps should be tracked somewhere.

5. Totalled the traditional match statistics, along with shot xG and the new non-shot xG, in the table below.

	Home	Away
Goals	3	2
Shots	13	11
Expected Goals	2.04	2.02
Non-Shot Expected Goals	5.31	3.46
Passes	543	421
Possessions	147	136

Post match box score for Metrica Sample Game 2

Discussion

As you can see, the non-shot xG is higher than the standard xG for both teams. By construction it can never be lower, since it includes the xG from shots and, in addition, the quality of the possessions without shots. One measure of how ‘wasteful’ a team was would be the difference between the two.

A much higher non-shot xG than xG would suggest the team got into quality shooting positions frequently but didn’t take advantage and take the shots.

A more similar non-shot xG and xG would suggest the team made the most of their shooting positions by taking the shot.
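Applied to the box score above, the gap is 3.27 for the home team and 1.44 for the away team. The short snippet below reproduces that arithmetic; the dictionary layout and the name `unconverted` are just for illustration.

```python
# Values copied from the post-match box score for Metrica Sample Game 2.
box_score = {
    "Home": {"xG": 2.04, "non_shot_xG": 5.31},
    "Away": {"xG": 2.02, "non_shot_xG": 3.46},
}

# Difference between non-shot xG and shot xG, rounded to two decimals.
unconverted = {
    team: round(stats["non_shot_xG"] - stats["xG"], 2)
    for team, stats in box_score.items()
}

for team, gap in unconverted.items():
    print(f"{team}: {gap:.2f}")
```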

Again, using the simplistic xG model here is an approximation for something more sophisticated like expected threat. But I think it nicely highlights the distinction and what’s missing when only relying on goals, shots and expected goals to review how a team performed in a match solely from the box score.

There are some notebooks and code alongside this that I’ve put on GitHub. Do check them out if you’d like! Feel free to ask questions or comment over at @TLMAnalytics.

Published by thelastmananalytics
