In previous posts I have mentioned xG and xA, but not everyone knows what they are. xG is expected goals and xA is expected assists, but what do they mean?
In basic terms, xG measures the quality of a shot as the probability of it going in (a value between 0 and 1). A simple xG model can be built from the location of the shot alone. A more sophisticated model takes a lot more factors into account. These can include the type of shot (whether it came from the foot or the head), whether the player went round the goalkeeper, whether the pass that led to the shot was a cutback or a through ball, the speed of the attack that led to the shot (as a shot coming from a counter attack has a better chance of going in) and more. A lot of different factors go into deciding how good a chance actually was and producing the shot’s xG.
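To make the "location only" idea concrete, here is a minimal sketch in Python. It assumes the pitch is split into three named zones and that a zone's xG is simply the share of past shots from that zone that were scored. The zone names and the shot counts are invented for illustration and are not real data; a real model would use far more shots and far finer locations.

```python
from collections import defaultdict

# Hypothetical shot history as (zone, scored) pairs. Invented numbers, not real data.
HISTORICAL_SHOTS = (
    [("six_yard_box", True)] * 2 + [("six_yard_box", False)] * 2
    + [("penalty_area", True)] * 1 + [("penalty_area", False)] * 4
    + [("outside_box", True)] * 1 + [("outside_box", False)] * 19
)


def fit_zone_xg(shots):
    """Return {zone: goals / shots}, the scoring rate of past shots in each zone."""
    goals = defaultdict(int)
    attempts = defaultdict(int)
    for zone, scored in shots:
        attempts[zone] += 1
        goals[zone] += int(scored)
    return {zone: goals[zone] / attempts[zone] for zone in attempts}


ZONE_XG = fit_zone_xg(HISTORICAL_SHOTS)

# The xG of a new shot is looked up from the zone it was taken in.
for zone, xg in ZONE_XG.items():
    print(f"{zone}: xG {xg:.2f}")
```

A more sophisticated model would replace the zone lookup with something that also takes in shot type, the kind of pass and the speed of the attack, but the output is the same kind of number: a probability between 0 and 1 for each shot.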
Straight away, the advantage of xG is clear. If you just look at a player’s number of shots, there’s no way to know whether they’ve been shooting from 40 yards or getting on the end of good chances in the box. You can’t tell if they’re Mauro Icardi or André Schürrle.
If you have xG data, you can tell a lot more about a player, which is especially useful for scouts. If a scout sees that a player has had a lot of shots, they might want to investigate the player further. However, if the shots are all from 30 yards out, it could mean a few things. It could show that the player lacks match intelligence or lacks the skill to get closer to goal, and either would make a scout more sceptical about the player.
The same goes for chances created, which is where xA comes in. For example, say a midfielder plays 4 long balls into the box, and the striker wins all 4 and gets a headed attempt on goal from each. The midfielder will be credited with 4 key passes. Now say another midfielder plays 4 through balls, and his striker gets on the end of them and gets a shot off after each through ball. He will also be credited with 4 key passes. If the shots were taken from the same areas, the midfielder who played the through balls will have the higher xA. This is because it is more difficult to score from a header, so the headed chances have a lower xG. The chances created by the midfielder who played the through balls were better, which is why he has the higher xA. A scout looking at this could tell who the better creator is.
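The two midfielders can be compared with a few lines of Python. This is only a sketch: it assumes xA is taken as the xG of the shot that each key pass sets up (one common convention, though not every data provider uses it), and the two xG values are made up to stand in for "a header" and "a shot with the foot from the same spot".

```python
# Hypothetical per-shot xG values, invented for illustration.
HEADER_XG = 0.08  # headed attempt from a long ball
FOOT_XG = 0.15    # shot with the foot from a through ball, same area


def expected_assists(shot_xgs):
    """xA for a passer: the sum of the xG of every shot their key passes set up."""
    return sum(shot_xgs)


long_ball_midfielder = [HEADER_XG] * 4     # 4 key passes, all ending in headers
through_ball_midfielder = [FOOT_XG] * 4    # 4 key passes, all ending in shots with the foot

for name, shots in (
    ("long balls", long_ball_midfielder),
    ("through balls", through_ball_midfielder),
):
    print(f"{name}: {len(shots)} key passes, xA {expected_assists(shots):.2f}")
```

Both players show the same number of key passes, so that column alone can't separate them. The xA column can.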
Some people oppose xG as they believe there is too much data in the game and that it takes away from the enjoyment of football. However, xG is intuitive, and as football fans we already do it. Say your team creates a lot of chances and the opposition’s goalkeeper saves everything, then the opposition score a goal from 35 yards. When you talk about that game with your friends, you’ll say you were unlucky to lose. That’s xG in a very basic way. You’re rating how good your opportunities were in comparison to the opposition’s and using that to determine who deserved to win. That, in essence, is xG.
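That pub conversation can be written down as a sum. The sketch below adds up the xG of each side's shots for a made-up match like the one described; every number in it is hypothetical.

```python
def match_xg(shot_xgs):
    """Total xG for one team in one match: the sum of the xG of each of its shots."""
    return sum(shot_xgs)


# Hypothetical match, invented for illustration.
your_team_shots = [0.35, 0.40, 0.25, 0.30, 0.20]  # five good chances, all saved
opposition_shots = [0.03]                         # one shot from 35 yards, scored

your_goals, opposition_goals = 0, 1

print(f"Score: {your_goals}-{opposition_goals}")
print(f"xG:    {match_xg(your_team_shots):.2f}-{match_xg(opposition_shots):.2f}")
```

The scoreline says you lost. The xG line says what you told your friends: on the chances created, you deserved more.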
But why is xG important? Well, as weird as it sounds, xG is actually far better at telling us how good a player or a team is than actual goals scored and goals conceded. Occasionally a team or a player will have a purple patch in which they finish every chance they have, or a goalkeeper will save everything, like David De Gea in 2017/18. xG shows whether a team can consistently perform well and create high-quality chances. Over the course of a season, goals and xG tend to balance out, but not always within a single game, as anything can happen in 90 minutes. That’s why xG can be used to portray the true sequence of events rather than just the outcome.
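A rough way to see why things balance out over a season but not over a game is to treat every shot as an independent coin flip weighted by its xG. That independence is a simplifying assumption of mine, not something an xG model guarantees, and the shot numbers below are invented. Under that assumption, the expected goals are the sum of the xG values and the spread around that total can be worked out directly.

```python
import math


def goals_mean_and_sd(shot_xgs):
    """Mean and standard deviation of goals, treating shots as independent coin flips."""
    mean = sum(shot_xgs)
    variance = sum(p * (1 - p) for p in shot_xgs)
    return mean, math.sqrt(variance)


# Hypothetical team: 10 shots per match at 0.1 xG each, over a 38-game season.
one_match = [0.1] * 10
one_season = one_match * 38

for label, shots in (("match", one_match), ("season", one_season)):
    mean, sd = goals_mean_and_sd(shots)
    print(f"{label}: expected {mean:.1f} goals, spread {sd:.2f} ({sd / mean:.0%} of the total)")
```

In a single match the spread is about as large as the expected total itself, so a team can easily score none or three from the same chances. Over a season the spread is a much smaller share of the total, which is why goals and xG tend to line up in the long run.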
Of course, xG isn’t perfect; it’s just the best tool we have to determine how good a player or team is. But what are some of the problems? Well, every shot from as far back as the data goes is used. For example, every past shot from the left corner of the box is used to determine the probability that a future shot from that same position goes in. The problem with that is that it doesn’t distinguish between good and bad finishers, or the long shot specialists that we know exist, such as Coutinho. The model is blind to who is taking the shot. That said, there are very few players who consistently outperform their xG, which suggests that finishing ability isn’t as important as first thought. For example, Cristiano Ronaldo is renowned as a clinical finisher, yet since 2014/15 he has only overperformed his xG once. Another clinical finisher, Robert Lewandowski, runs pretty much level with his xG apart from 2018/19. This suggests that the xG model is very accurate. It also changes what should be valued in a striker, as Lewandowski and Ronaldo are both elite strikers and yet they run pretty much level with xG. This shows we shouldn’t value finishing as much as the ability to get on the end of chances.
Another problem with xG is that it doesn’t take into consideration the positioning of defenders, as there isn’t the data for how many players are between the goal and the player taking the shot. This is why models have things like through balls and counter attacks built in, as a rough way of telling where the defence is and how much pressure the shooter is under. This problem is why teams under managers like Dyche or Favre, who like to block a lot of shots, consistently outperform xG defensively. Their teams get a lot of bodies in front of the ball, and xG can’t tell that those shots would be a lot harder to finish even if they weren’t blocked.