• This week I have been working on incorporating player values into my algorithms that price up football games so I thought I could write a small piece with some related charts.

    Thanks to http://www.transfermarkt.com for their excellent website!

    Which teams in Europe’s big 5 leagues are over or underperforming their squad value?

    I’m using each team’s performance in their national league with a league-by-league adjustment based on each league’s performance in the Champions League this season (this adjustment is England 0, Spain -0.55, Italy -0.75, France -0.75, Germany -0.95.  I’m not totally confident about these adjustments but they won’t be a mile off).
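
    As a minimal sketch of how the league adjustment could be applied, the snippet below simply adds the quoted figure to a team's domestic performance level.  The additive form, the goals-per-game unit and the two domestic levels are my assumptions for illustration; only the five adjustment values come from the text above.

```python
# League adjustments quoted in the post (assumed to be in goals per game,
# relative to England).
LEAGUE_ADJUSTMENT = {
    "England": 0.0,
    "Spain": -0.55,
    "Italy": -0.75,
    "France": -0.75,
    "Germany": -0.95,
}


def adjusted_performance(domestic_level, league):
    """Shift a domestic performance level onto a common cross-league scale.

    Assumes the adjustment is purely additive, which the post does not confirm.
    """
    return domestic_level + LEAGUE_ADJUSTMENT[league]


# Hypothetical domestic levels, for illustration only.
german_side = adjusted_performance(2.0, "Germany")
spanish_side = adjusted_performance(1.5, "Spain")
gap = german_side - spanish_side  # cross-league gap after adjustment
```

    With these made-up inputs a 0.5 domestic gap shrinks to about 0.1 once the leagues are put on the same scale.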

    I’m also ‘correcting’ player value for age as younger players are generally not as good as their value implies while older players are better than their value implies.

    Figure 1

    The wealth of the Premier League is on show as 16 of the 20 teams this season have had a lineup valued at over €200 million while Spain, Italy, France and Germany have 4, 6, 2 and 3 respectively.

    ‘Performance level’ can only be used to compare teams, as it has no standalone meaning after the league strength adjustment (e.g. Bayern are just under 0.5 goals better per game than Barcelona, but a performance level of 0 on this graph doesn’t mean average in any way).

    It will be interesting to check back on some of these numbers at a later date to see if the position of teams relative to the line has changed and whether key players at overperforming clubs are more valuable (whether they are still at that club or not).

    Figure 2

    Lille have had starting line ups average less than €10m per player but have been performing very nicely in Ligue 1.  Hakon Haraldsson and Matia Fernandez-Pardo are 2 young players who may be ones to watch.

    Villarreal and Lens are also around the same area of the graph – Alberto Moleiro of Villarreal and Mamadou Sangare and Samson Baidoo of Lens may be more young players partly responsible for the current overperformance of their clubs.

    Liverpool are big underperformers; this will be in part due to the highly valued Mo Salah not getting close to matching his very high-performance levels of last season.  Real Madrid are playing ok (even in some of the games they have not won) but their position on the graph is a function of their very high squad value and their performance value not being at the level of a few others.  German teams feature quite heavily in the underperformers list partly because of German performances in Europe that imply the league is not especially strong right now (not totally confident on this).

    Premier League Squads

    I have lined up each Premier League squad by value to compare them.

    Figure 3

    I find a good way to make this comparison more readable is to look at the rank of the value of each team’s highest value player, 2nd highest value player etc.

    Figure 4

    Arsenal’s top few stars are not quite at the value of Liverpool’s and Manchester City’s, but their squad looks very deep and solid – they have the most valuable 5th ‘best’ player, the most valuable 6th ‘best’ player etc.

    Brentford are solid performers so far this year given their lack of high value key players. 

    Chelsea surprised me a bit, I thought their squad was stronger on paper than this.  I suppose they have a lot of players out on loan.  Everton may have one of the weaker benches (relative to starting 11) while Leeds looks like the opposite.

    Tottenham are quite big underperformers compared to other teams given their squad (it’s quite deep) while Sunderland are the opposite.  Evaluating player value is a difficult thing to do, so it may be that some teams have mis-valued players rather than being coached better.

    I hope to have more material using transfermarkt player values in the future so stay tuned for that.

    I hope you have good holidays wherever you are (if you have them!) and please subscribe using the white box on the main page if you’d like to be updated whenever I post.

    Thanks for reading!

  • Arsenal

    They really need to get fit because the Manchester City of old has re-emerged.

    Aston Villa

    5 games into this season I remember calculating that it was about a 1 in 500 shot for Villa to be the same strength as last season and yet have performed like they did (or worse) for those first 5 games.  Sounds unlikely right – that’s a ‘p value’ of 0.002 for the hypothesis ‘are Villa worse this season?’.  However, I believe that 1 in 500 is only really 1 in 500 in this sense if we singled out Aston Villa, started measuring from that point on and then found it was a 1 in 500 chance for luck to explain their performances.  As this discussion of a freakish opening few games could have been about many different teams, 1 in 500 starts to become less significant. 
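
    The point about singling out one team can be made concrete with a rough sketch.  It assumes all 20 Premier League teams are independent and each has the same 1 in 500 chance of such a start, which is a simplification of mine rather than anything measured.

```python
def chance_of_any_outlier(p_single, n_teams):
    """Probability that at least one of n independent teams has a start this extreme."""
    return 1.0 - (1.0 - p_single) ** n_teams


# 1 in 500 for a single pre-selected team, 20 teams in the league.
p_any = chance_of_any_outlier(1 / 500, 20)
print(f"{p_any:.3f}")
```

    Under those assumptions the chance that some team somewhere in the league produces a ‘1 in 500’ start is roughly 4%, and that is before counting freak starts in the other direction or over other window lengths.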

    Villa are still receiving attention for being bad to this day as it seems their raw xG and xG derived expected points totals are still not favourable.  https://x.com/xGPhilosophy is currently saying they should be 15th based on their xG performances.

    xG is a fairly noisy metric and the midfield of the league is so tight that I suspect the gap between 7th and 15th isn’t very much.  Expected points from xG may be even noisier than xG difference but that’s a topic for another time.  As they have a similar squad to last season their xG numbers arguably deserve some regression to last season, so I think purveyors of xG expecting a Villa collapse are going to be wrong with this one.  Using my own ratings (shots/passes/goals) they have now recovered from that strange start to sit as an above average league team. 

    Bournemouth

    They lost most of their defence over the summer so there was a possibility they would struggle.  This hasn’t come to fruition and they’re another strong mid table team to add to the mix. 

    Brentford

    They lost Mbeumo, Wissa and Norgaard over summer – 3 fantastic performers who played nearly every minute last season.  Things looked a little bleak for them early on but new players like Ouattara/Henderson/Kayode seem to have settled in now and they’re a rock-solid Premier League side once again. 

    Brighton

    I was thinking the other day that a previous version of Brighton could have been well placed to take advantage of other teams’ struggles.  Unfortunately, this year Brighton are looking exceedingly mid table.   Chelsea and Liverpool have had troubles but in terms of a top 4 spot there are few stand out candidates to displace them.  

    Burnley

    With no signs of their ability last season in the Championship to stop opponents scoring shots/big chances Burnley face an uphill battle with their relegation chance sitting right around 90%. 

    Chelsea

    With Palmer back they’re going to be pretty decent but still comfortably below the level of a Premier League title contender.

    Crystal Palace

    The FA Cup champions are still good but maybe not quite as good as I’ve seen some xG numbers imply.

    Everton

    Another solid mid table side

    Fulham

    It would be great for them if Antonee Robinson can get back to full fitness.  Regardless though – still a solid side.

    Leeds

    Leeds have received a lot of attention from me this season as I’ve strongly rated their chances of staying up.  I currently have the relegation betting markets as being the most wrong of any futures market.  I have West Ham at around 67% to go down and Leeds at just 15%, while betting markets rate it as West Ham 45%, Leeds 28%. 

    For the previous 2 Championship seasons many analysts would have been surprised that Leeds did not cruise to the title.  They may have some issue with style over substance?

    They have been much better home than away which is also something to keep an eye on.  I wrote about this topic recently and found about 90% of the variability of half season home/away performances is down to chance (with the remaining 10% being something real).  For Leeds this would mean they may be ~0.05 goals better per game at home than away (which is worth a few ticks in decimal odds, e.g. 1.97 vs. 1.94)

    Liverpool

    We were saying their performances were pretty ordinary when they were getting last minute winners in the last 5 games and yet they’ve been consistently even more ordinary since then.   

    Manchester City

    Wow, this team is really starting to roll – Arsenal are going to be under so much pressure.  Arsenal and City should be joint favs for the Champions League for me.

    Manchester United

    They’re merely ok and that’s without Europe.  As teams start getting knocked out of Europe in the second half of the season they will lose that slight advantage.

    Newcastle

    Again, I think they’re a decent above average side but Newcastle fans may have been dreaming bigger.

    Nottingham Forest

    I’ve flip-flopped a few times on where Forest stand.  I think others may be in a similar position – are they on their way up to the top 10 again?  Or are they fighting relegation?

    Sunderland

    Sunderland are bottom half in performance level but looking comfortably better than at least 3 teams and similar to a bunch of others is still a strong start to a new PL era for them.

    Tottenham

    Some dreadful home performances have captured headlines so far for Spurs who were one of the harder teams to predict coming into the season.  Their performances have varied quite wildly (2nd highest variability in the league), but average it out and they aren’t meeting league average standards so far.

    West Ham

    I think they’re poor and in big trouble (see what I said about Leeds above).  One small point of contention may be how variable their performance level has been game to game (it’s the most of any team).  The worst performance of any team this season may be their home match vs. Brentford but they also have 4 above average performances against Man United, Newcastle, Forest and Everton. 

    Wolverhampton Wanderers

    I don’t have much to add to the discourse on Wolves. I suppose even if they continue to lose they will find some motivation in trying to avoid being called the worst PL team ever (points wise).

    If you enjoyed these thoughts please consider subscribing for free using the box on the top left of the home page!

  • This week I have done some analysis looking at the performance of teams home and away.  I want to test whether some grounds really are fortresses and some teams do better than expected when playing away.

    This is a separate conversation to home advantage which is an ever present and significant factor in every professional football match not played at a neutral ground.  Today we are considering whether some teams may enjoy more or less home advantage than overall averages.

    In order to be a good analyst on a sport like football one needs an excellent perception of what is signal and what is noise. If teams played half their matches with white stickers on their boots and half their matches with black stickers we would observe many teams experiencing completely different form with each type of sticker. In this instance you should be able to see all ‘sticker based’ form would just be noise. Home and away games are similarly split 50/50 so should we consider home and away form in the same way? In my opinion an analyst who assumes everything is noise until solid proof to the contrary arrives is a better analyst than one who finds patterns and structure everywhere.

    Let’s see if we can find any evidence of teams maintaining strong home or away form.

    Analysis

    In order to test this, I am going to use goal difference and xG.

    The xG is my own shots & passes metric, but for simplicity I’ll refer to it as xG.

    I am going to use 290 team seasons (the last 3 years of the big 5 leagues in Europe) for this analysis today. Thanks to whoscored.com and Opta for the data.

    For each team I need to calculate the following:

    1. Goal difference/xG at home for the 1st half of the season
    2. Goal difference/xG at home for the 2nd half of the season
    3. Goal difference/xG away for the 1st half of the season
    4. Goal difference/xG away for the 2nd half of the season
    5. Expected goal difference at home based on average home advantage and opponents for the 1st half of the season
    6. Expected goal difference away based on average away disadvantage and opponents for the 1st half of the season
    7. Expected goal difference at home based on average home advantage and opponents for the 2nd half of the season
    8. Expected goal difference away based on average away disadvantage and opponents for the 2nd half of the season

    Difference between home/away performance for the 1st half of the season is given by (1-5) – (3-6) while home/away relative performance for the 2nd half of the season is given by (2-7) – (4-8).  If there is a positive correlation between these 2 results it means some of that home or away form skew is being retained and hence not the result of luck.

    For example, say for the first half of the season, a league average team A has an xGD of +0.5 per game at home and -0.4 xGD per game away.  Let’s also say they’ve faced an easier schedule at home (expected to perform at +0.1xG per game) than away (-0.1xG per game) and home advantage across the league is +0.3 xG per game.  Using the method this gives a home/away skew of (0.5 – (0.3 + 0.1)) – (-0.4 – (-0.3 – 0.1)) = +0.1xG per game.  Team A are therefore performing 0.1xG per game better at home for the first half of the season than away.  We could then do the same calculation using figures from the 2nd half of the season to calculate the team’s home/away skew for the 2nd half. 
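
    The team A example can be written out as a short sketch.  The function and variable names are mine; the numbers are the ones from the example above.

```python
def home_away_skew(home_actual, home_expected, away_actual, away_expected):
    """(home over-performance) minus (away over-performance), in xG per game."""
    return (home_actual - home_expected) - (away_actual - away_expected)


home_advantage = 0.3   # league-wide home advantage, xG per game
home_schedule = 0.1    # schedule effect for team A's home games
away_schedule = -0.1   # schedule effect for team A's away games

skew = home_away_skew(
    home_actual=0.5,
    home_expected=home_advantage + home_schedule,
    away_actual=-0.4,
    away_expected=-home_advantage + away_schedule,
)
print(round(skew, 3))
```

    The same function would be called again with the 2nd half figures (items 2, 4, 7 and 8) to get the second skew for the correlation.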

    Results

    Rather than show a lot of messy plots I made a table showing the slope of the relationship between 1st half and 2nd half home/away skew as a percentage.  The percentage shows how much of the home/away skew is retained for each league and each year.

    Table 1

    League by league we see quite random results (remember we are correlating home/away skew from the 1st half of the season to the 2nd half).  One season we see teams maintaining their superior home or away form through the season (positive percentages) then for another season superior home or away form apparently predicts the reverse.  This wouldn’t make any sense so must be the result of noise.

    Looking towards the bottom of the table starts to paint the picture more clearly.  I have averaged the results by league and across all 3 years. We get a slope of .119 across all the leagues which means about 12% of the home/away performance xG skew for teams in the first half of a season is retained by teams during the second half of the season.  Using the team A example from the previous part, 0.1xG at home means we could expect them to overperform at home in the 2nd half of the season by just 0.012xG per game!
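
    A hedged sketch of the two steps involved: fitting the slope between the half-season skews, then using it to project.  I am assuming an ordinary least-squares slope and a projection with no intercept; the post does not say exactly how the slope in the table was fitted.

```python
def retention_slope(first_half, second_half):
    """Least-squares slope of 2nd-half skew regressed on 1st-half skew."""
    n = len(first_half)
    mean_x = sum(first_half) / n
    mean_y = sum(second_half) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(first_half, second_half))
    var = sum((x - mean_x) ** 2 for x in first_half)
    return cov / var


def projected_skew(observed_skew, slope):
    """Skew expected to carry over, assuming the regression passes through zero."""
    return observed_skew * slope


# Team A from the earlier example, using the 0.119 slope quoted above.
team_a_projection = projected_skew(0.1, 0.119)
print(round(team_a_projection, 4))
```

    `retention_slope` would take one skew value per team season for each half, so 290 pairs for the data set used here.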

    Goals however just appear to be too infrequent and random to find any correlation between half seasons, even though I have used 290 team seasons. 

    Looking between whole seasons gives some satisfying results as ~8.5% of home over/under performance is retained when using the xG metric while ~3.5% of home over/under performance is retained when just looking at goal difference.

    Conclusion

    My takeaway from this is while home/away form is largely the result of randomness, there could be rare instances that considering a team’s home/away form could have an impact. If we have a prior belief that a stadium may offer its home team extra home advantage that may be more valuable than any noisy data we have.

    Using just goals was useless over half a season but when comparing whole seasons we saw a small correlation. If your team seems to be getting better results home or away that trend is likely just the result of randomness! 

    I believe the metric I’ve used for ‘xG’ in this article is a lot less noisy than raw xG models so I would have strong reservations about putting any consideration into home/away skew when using xG unless we are talking about home/away overperformance over a period of 1-2 years or more.

    To finish with I’ll leave this table of home over/under performance for every team. The table is split into 3 parts for display purposes. Does the position of any teams match up to your expectation?   Remember about 90% of the deviation of xG per game is likely random chance over 1 season (maybe less over more seasons) and teams at the extreme ends of the table are likely to have experienced some extra variance home or away. I may revisit the topic with more data, if you have any questions or ideas about this I would love to hear them.

    If you enjoyed this piece today you can subscribe by inputting your email address in the white box on the home page syzygyanalytics.co.uk! Thanks for reading!

  • Today I am recording some points projections for 7 different leagues using up to 4 different methods.  All 7 leagues are around 30% complete so let’s see at the end of the season which method predicts the remaining 70% the best! 

    *To be notified of future posts please consider subscribing by inputting your email address in the box on the top left of the home page*

    The 4 different rating systems I am using are as follows:

    My ratings

    Developed by me over the last 10 years these ratings are purely using on pitch actions (passes/shots/goals) from this season.  I don’t think the 10-15 game samples for all the leagues I am looking at today are sufficient to put full stock in what they say (for example a regression to last season’s performances would add some value) but they are still very useful for comparison.

    MIR (Market implied odds)

    A market implied/inferred odds algorithm I created recently using betting odds to calculate the performance level of each team.  I’m using the closing odds (odds at kick-off) for each team for their last 4 or 5 matches with an additional factor of their performance level in their latest match (as the betting odds cannot have included this yet).

    Opta

    Opta points projections from https://theanalyst.com.  I think their projections use a broader Elo based system so I don’t expect their projections to keep up with the predictive performance of other systems in this specific situation.

    Spreadex

    I’m using the mid-point of points spreads for each team at this spread betting website.  They don’t have any spreads for a few of the leagues in this article.

    Let’s get started!

    Points Projections

    Leeds look plenty strong enough to survive although you must wonder if their shooting data is better than their real level and the Opta and Spreadex predictions end up the more accurate ones.  Sunderland are still rated fairly poorly despite performances. 

    My ratings are pretty close to the MIR.  I have PSG rated a bit lower as they have not been dominating all their games, but their lineups have been fairly inconsistent and they are potentially such a dominant team that you wonder if they are slightly cruising at times.  Additionally, they could deserve a boost for their performance levels last season, so all in all I am not confident about their final tally being closer to 75 than 80.

    Injuries and lineups are a complicating factor across all leagues.  For my ratings I could ideally have included a factor that compares predicted future lineup strength to the lineup strength each team has had for the season so far (I don’t have that player level analysis quite ready to go yet).  I have quite a few teams performing differently so far to market opinion here.

    Although billed by many as one of the most competitive leagues around Europe Inter really look like the team everyone is trying to catch.  Napoli are unlikely to repeat last season’s success.  Fiorentina have looked really poor on the pitch so let’s see if they can recover.

    I rate Hoffenheim’s performances strongly so far, let’s see if they can keep it up.

    Opta are showing 29.0 pts for Sheffield Wednesday which I think includes their 12pt deduction (normally the Opta predictions have a smaller difference between the top and bottom team).  The other projections ignore this deduction so I’ve removed the deduction from the Opta projection.  Ipswich have good underlying numbers so I think they could be well placed to get promoted back to the Premier League.

    Quite a few differences of opinion in League 1 here.  I don’t rate Luton’s or Bolton’s performances so far nearly as high as the general opinion of these teams.  Blackpool have struggled and are being slow to improve.  This will be an interesting prediction to look back on at the end of the season.

    I’m not seeing title contending level performances from Chesterfield?  Barnet haven’t been getting results so far, but they should be considered as one of the favourites for the title. 

    Interesting angles for analysis at season’s end

    • Which projections predicted the future performance of each team the best?
    • Combining my rating and the market implied rating could create a more accurate prediction – what weighting for each would have maximised the predictive ability here?
    • At The TransferFlow Justin Worrall/Ted Knutson (https://www.thetransferflow.com/p/outrights-longshot-bias) wrote an interesting piece on favourites being overvalued by models.  My current grasp of this is, as performance levels for teams can wander in both directions through the season, on average this hurts the favourites’ chances more than it will help (the chances of the favourites are hurt more by their performance dropping than they are helped by their performance improving.  They don’t need it to improve to have a strong chance of winning the league).  Can we see evidence of this across these projections, and how can we adjust for it in the future? 
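
    For the weighting question in the second bullet, one possible approach (my assumption, not something settled in this post) is the least-squares weight on a simple two-way blend.  The three-team inputs below are made up purely to show the call and are not real projections or results.

```python
def best_blend_weight(own, market, actual):
    """Weight w on `own` that minimises squared error of w*own + (1 - w)*market."""
    num = sum((o - m) * (a - m) for o, m, a in zip(own, market, actual))
    den = sum((o - m) ** 2 for o, m in zip(own, market))
    return num / den


# Made-up points totals for illustration only.
own = [70.0, 55.0, 40.0]       # hypothetical projections from my ratings
market = [66.0, 57.0, 44.0]    # hypothetical market implied projections
actual = [67.0, 56.5, 43.0]    # hypothetical end-of-season points
w = best_blend_weight(own, market, actual)
print(w)
```

    The weight is not clamped to the 0 to 1 range, so a value outside it would suggest one system adds nothing beyond the other on that sample.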

    With thanks to

    Joseph Buchdahl and https://www.football-data.co.uk/ for betting odds, Opta and Spreadex for their projections and Whoscored for passing/shooting data.

  • This week I’m just looking at some more data on game states and xG from understat.com.

    10 League Analysis

    I want to further investigate the question of whether goals are converted more readily than xG would imply in different game states.  Adding up xG data from understat.com and using data from whoscored.com (Opta), I made the following table:

    Table 1

    • I am looking at performance for teams that play in leading game states (same story for trailing teams, you just need to flip the signs)
    • The third data column is the performance we expect from leading teams because leading teams will on average be stronger teams.  It’s a somewhat involved calculation I attempted to explain in this post: https://syzygyanalytics.co.uk/2025/08/01/soccer-team-rating-iii-game-states-i/

    The red numbers on the right show leading teams underperform expectation.  I believe this is likely because of loss aversion (the losing team can find more motivation) and explains the increased prevalence of draws in football compared to what standard Poisson distributions would predict. 

    I am somewhat surprised to see that goals have been converted at a slightly lower rate for teams that lead than would be expected (leading teams are underperforming their xG numbers shown by a +0.27 GD but a +0.37 xG difference).  This is because it seemed possible to me that chances while teams lead are likely to be better chances than the Understat model calculated (e.g. less defensive coverage). 

    This difference is nearly all down to the Italian league but we are looking at nearly 3,800 matches so the fact teams underperform their xG here could imply loss aversion is even playing a role in finishing ability. 

    Game state adjustment

    My current perspective for a game state adjustment is as follows:

    1. Teams that lead suffer a degradation in performance (likely due to loss aversion).
    2. If teams perform worse when they lead it will continue to happen in the future as well.
    3. To correct for game state, we want to look whether a team has led/trailed for a significantly different amount of time than would be expected for a team of that level.
    4. For example, consider a strong team that has led for seven 90s (and tied for eight 90s) in the first 15 matches when we would only expect them to have led for a net three 90s.  Using the averaged -0.18xG per game figure from the table above, the xG difference for this team has been damaged by the leading states by around 1.26 total xG (0.08 xG per game). 
    5. However, in the next 15 games on average we expect it to be damaged by 0.54 total xG (0.035 xG per game).  The game state adjustment is then the difference between these figures – our game state adjustment will add 0.045xG per game onto this team’s performance. 
    6. xG is one of the least game-state affected metrics, we would have larger adjustments for something more shots/passes based. 
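
    The arithmetic in steps 4 and 5 can be sketched as below.  The function name and signature are mine, and I am treating the 0.18 figure as xG lost per full 90 spent leading.

```python
def game_state_adjustment(led_90s, expected_led_90s, games, penalty_per_90=0.18):
    """Return (damage so far, damage expected going forward, per-game adjustment).

    All values are in xG; the adjustment is what would be added back per game.
    """
    damage_so_far = led_90s * penalty_per_90
    damage_expected = expected_led_90s * penalty_per_90
    per_game = (damage_so_far - damage_expected) / games
    return damage_so_far, damage_expected, per_game


so_far, expected, adjustment = game_state_adjustment(7, 3, 15)
print(round(so_far, 2), round(expected, 2), round(adjustment, 3))
```

    Without rounding the intermediate per-game figures the adjustment comes out at about 0.048xG per game, slightly above the 0.045 quoted in step 5, which subtracts the rounded 0.035 from the rounded 0.08.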

    A bit of a slapdash article again but I am doing a lot of football betting as well so it’s difficult to dedicate a lot of time each week.  More to come in the future…

     Thanks for reading, please subscribe using the box in the top left on the home page and get in touch if you want to talk about anything. 

  • Welcome to my blog, today I’m briefly covering 3 different topics of interest on fixture adjustments, Premier League predictions and score effects.

    How many iterations does a fixture adjustment need?

    When attempting to rate a team accurately to create forecasts we are faced with a circular problem.  Our output will hopefully be accurate team ratings, but to do so we need an input of accurate team ratings to be able to adjust each team’s performance relative to their schedule!

    I made an algorithm that derives team ratings from the betting odds at kickoff.  (With thanks to Joseph Buchdahl @12Xpert and his very useful data downloadable from football-data.co.uk). To start with there is no fixture adjustment and the team rating is just the team’s average positive or negative supremacy (in expected goal difference per game) over their opponents. 

    I use these ratings to make a schedule adjustment for the first iteration.  This outputs a new set of ratings which I can then put back in at the start again and so on.  Figure 1 shows team ratings for the PL (in expected goal difference per game) this season using this method:

    Figure 1

    Consider Sunderland as an example here – many will know already that they have faced a very skewed schedule of lower ranked teams so far.  You can see their rating drop sharply between no adjustment and 1 iteration (figure 1).  However, after 1 iteration, every other team that has played Sunderland has been given a schedule that includes Sunderland as a team rated as -0.48 (possibly not a level Sunderland are at).  One more iteration sorts out this problem, and each following iteration shows minimal change.

    Figure 2

    Figure 2 compares the accuracy of these ratings against the current odds for each team in Gameweek 9 (pricing error per game).  The second group of numbers is the variability of each team’s derived ability game to game (lower the better).

    What we can conclude from these figures is 2 iterations is a good number.  Neither metric from figure 2 appreciably improves after 2 iterations and so extra iterations are not worth the time or effort.
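
    To make the loop concrete, here is a minimal sketch of an iterated fixture adjustment.  It is not my actual algorithm: it assumes each supremacy has already had home advantage stripped out, and the three-team round robin uses hypothetical supremacies consistent with ratings of A = +1, B = 0, C = -1.

```python
from collections import defaultdict


def iterate_ratings(matches, iterations):
    """matches: (team, opponent, supremacy) rows, one row per team per game.

    supremacy is the team's expected goal difference over that opponent.
    """
    games = defaultdict(list)
    for team, opponent, supremacy in matches:
        games[team].append((opponent, supremacy))

    # Iteration 0: no fixture adjustment, just the average supremacy.
    ratings = {t: sum(s for _, s in rows) / len(rows) for t, rows in games.items()}

    for _ in range(iterations):
        previous = ratings
        # Credit each team with its supremacy plus the opponent's previous rating.
        ratings = {
            t: sum(s + previous[opp] for opp, s in rows) / len(rows)
            for t, rows in games.items()
        }
    return ratings


# Hypothetical toy schedule, for illustration only.
matches = [
    ("A", "B", 1.0), ("B", "A", -1.0),
    ("A", "C", 2.0), ("C", "A", -2.0),
    ("B", "C", 1.0), ("C", "B", -1.0),
]
for k in range(4):
    print(k, iterate_ratings(matches, k))
```

    In this toy case the unadjusted rating for A overshoots at 1.5, the first iteration overcorrects to 0.75 and later iterations close in on 1.0, which is the same overshoot-then-settle pattern described for Sunderland above.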

    Premier league predictions after 8 weeks

    Figure 3

    I hope to return to these at the end of the season to see which performed best.

    I’d rate the market derived algorithm to be favourite – betting odds at kick off are really accurate.  My own ratings have outliers for each of the 3 promoted teams which could be because of the small sample of games so far (for other teams I include some data from last season).  Opta is 4th favourite because I think they use a unique method with a team Elo rating system (the spread from best to worst might not be enough and teams like Tottenham are too low).    

    Game state effects with xG are not especially large

    I often see people discredit trailing teams who create chances as ‘score effects’.  At times it seems it’s impossible for trailing teams to be given any credit regardless of what they create.

    With thanks to understat.com here are some numbers I have calculated before:

    Figure 4

    Raw numbers for the last 2 PL seasons show trailing teams perform worse than average by xG and goals.  The third column is important – this is the expected performance of trailing teams owing to the fact that trailing teams will generally be weaker teams (the reason they are trailing to begin with).   This figure is subtracted off the raw numbers to give the final 2 columns.

    We can see trailing teams do score more and create more xG but it’s to the tune of around a fifth of a goal or xG per 90.  If a team puts up lots of shots and 1.5xG while trailing I see no reason to dismiss the majority of it as ‘score effects’. 

    If you enjoyed this piece, please consider subscribing (top left on the main page) or leaving a comment!  Get in touch at x.com/samh112358 about anything if you wish, thanks for reading!

  • Last week I went through the last 10 years of home advantage in the top 5 European leagues and showed a surprising decline in home advantage last season.

    I’ve done more analysis using a more detailed calculation for home advantage (using shots instead of just goals) and found that last year just looks like an anomalous year:

    My model implied home advantage is using shots and passes to create something more like xG than just goals.  The model implied home advantage is not even quite within the 99% confidence interval of home advantage using just goals.  I’m now back to believing home advantage is steady and last year was a bit of a freak.

    Premier League Fixture Adjustment

    I see quite a lot of bad information around on schedule strength so I thought I would post my own.  I converted it all into total goals (or xG, same thing) for easier reading:

    The first column is what to add to each team’s goal (or xG) difference to level out the early schedule bias.  The 2nd column is the goal difference you would expect for a completely average team that had played the same fixtures as each team. 

  • Home field advantage (HFA) is a well-known concept present in most sports, but it contains certain aspects of mystery.  While understanding the overall magnitude is trivial with some research, understanding the magnitude of all the contributing factors is not.

    The way in which HFA changes over time may give us some clues.

    The following is a plot of the average HFA in every football game across the big 5 leagues (England, Spain, Italy, France & Germany) each season since 2014/15.

    Thanks to soccerstats.com for data since 2020 and fbref.com for 2014-2020

    Each error bar represents 99% of the possible variance of each season’s HFA.  This means there is a 0.5% chance of the ‘real’ value being below the bottom mark of the error bar and a 0.5% chance of being above. 

    The following details the steps I used to calculate the size of the error bars:

    1. Consider the standard deviation, s, of home goals and away goals. The Poisson distribution is typically used to model football games and assumes the variance of goals scored is equal to the mean (so s is the square root of the mean).  I looked at some data and found the s for home goals is around 85% of the mean of home goals and for away goals s is around 95% of the mean of away goals.
    2. For HFA we want to do home goals – away goals. The standard deviation of a new sample created from the difference of 2 independent samples is the root of the sum of each standard deviation squared.
    3. This gives us the standard deviation for HFA for 1 game.  For the standard deviation of HFA for n games we divide the standard deviation for 1 game by the square root of n.
    4. The normal distribution tells us a data point has around a 1% chance of being more than 2.58 standard deviations away from the mean.  The size of the upper and lower tail is then 2.58 * our result from 3).
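
    The four steps above can be sketched in a few lines of Python.  This is only an illustration: the function name, the goal averages (1.55 home, 1.20 away) and the game count (1,826) are assumptions of mine and not figures from the plot.  Step 2 also assumes home and away goals are independent.

```python
import math

def hfa_error_bar(mean_home, mean_away, n_games,
                  sd_ratio_home=0.85, sd_ratio_away=0.95, z=2.58):
    """Half-width of the 99% error bar for a season's average HFA."""
    # Step 1: per-game standard deviations, taken as a fraction of the mean
    sd_home = sd_ratio_home * mean_home
    sd_away = sd_ratio_away * mean_away
    # Step 2: standard deviation of (home goals - away goals) for one game,
    # treating the two as independent
    sd_one_game = math.sqrt(sd_home ** 2 + sd_away ** 2)
    # Step 3: standard deviation of the average over n games
    sd_season = sd_one_game / math.sqrt(n_games)
    # Step 4: 2.58 standard deviations each side covers about 99%
    return z * sd_season

# Hypothetical season: illustrative goal averages and game count
half_width = hfa_error_bar(mean_home=1.55, mean_away=1.20, n_games=1826)
print(f"HFA error bar: +/- {half_width:.3f} goals")
```

    Because of the square root in step 3, four times as many games are needed to halve the size of the error bar.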

    What does this plot tell us?

    From 2014/15 to 2018/19 home advantage is buzzing along around the 0.34-0.40 mark.    This means if 2 equal teams played each other, the team that had home advantage would win by an average of around 0.34-0.40 goals (the home team would win about 45% of the time with the away team winning only around 30%).  Any small variation in these years is explained by natural randomness – each data point would be inside the error bars of the other years.  

    Halfway through the 2019/20 season, leagues across Europe and the whole world were suspended indefinitely for covid-19.  The second half of the 2019/20 and most games during the 2020/21 season were played in front of empty stadiums.  VAR was also introduced at the start of the 2019/20 season. 

    The effect of one or more of these factors is clear in the 2020/21 season because HFA was cut by about 50%.  Compare the HFA in 2020/21 to the HFAs between 2014 and 2019 – the drop is clearly statistically significant – in fact the difference has a p-value of around 0.000003 (i.e. a very low chance of the drop being explained by randomness).
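
    For anyone who wants to run this kind of comparison themselves, here is a minimal sketch of a two-sided z-test on the difference between two HFA figures.  It is not necessarily the exact calculation behind the p-value quoted above; the function name and the example inputs are hypothetical, and it assumes both figures are roughly normally distributed and independent.

```python
import math

def hfa_drop_p_value(hfa_a, se_a, hfa_b, se_b):
    """Two-sided p-value for the difference between two HFA estimates."""
    # Standard error of the difference of two independent estimates
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = abs(hfa_a - hfa_b) / se_diff
    # Two-sided tail probability of a standard normal beyond |z|
    return math.erfc(z / math.sqrt(2))

# Illustrative inputs only: a pooled pre-covid HFA vs a single empty-stadium season
p = hfa_drop_p_value(hfa_a=0.37, se_a=0.018, hfa_b=0.18, se_b=0.041)
print(f"p-value: {p:.6f}")
```

    The standard errors here are the error bar half-widths divided by 2.58.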

    When fans returned in 2021/22 so did more HFA and in 2022/23 we saw an HFA figure that reached pre-covid levels.

    This shows some evidence that VAR was not a factor in reducing HFA (the theory being that VAR leaves less opportunity for the referee to be biased towards the home team).  I say ‘some’ because it’s possible the HFA in 2022/23 experienced some positive variance and VAR does have a small part to play in reducing HFA.

    The HFA drop when stadiums were empty implies home fans could account for as much as 50% of HFA.

    2023/24 had an HFA of 0.3 – a little down on the consistent pre-covid numbers but 2024/25 saw a large plummet back down to the same figures as we saw with empty stadiums!

    This is very surprising to me: how do we explain it?  Total goals per game is not down, so the drop cannot be explained by HFA simply staying the same fraction of a smaller number of goals.

    2025/26 is up again so far but the size of the error bar should make it clear that it’s too early to draw any conclusions from that.

    I plan on adding more leagues to the plot and looking at some more in-depth HFA metrics such as total shots.  This will help give further insight and significantly reduce the size of the error bars.

    For now, I do suggest analysts or bettors reconsider their figures for HFA if it’s been a year since they did so. The drop in 2024/25 is as statistically significant as the covid-induced HFA drop. Are we entering a new era with a fundamentally different home field advantage? Quite possibly!

    Subscribe (top left on the home page) if you would like to be notified when I post more on this topic

    Thanks for reading

  • In this article I am going to share my projections for the Premier League season and compare them to the projections of other popular sites.  We can check back at the end of the season to compare how each projection performed!

    Figure 1

    • Opta’s projections are here: https://theanalyst.com/competition/premier-league/table
    • Spreadex’s can be found on their website, spreadex.com
    • Elevenify can be found on X at https://x.com/elevenify
    • For the comparisons on the right, green = I’m projecting a higher points total, red = I’m projecting a lower one.
    • My ratings are a complex team performance rating based on shots/goals/passes/take-ons for the 4 games we’ve had so far and regressed to a prior.   The prior is stronger for teams like Aston Villa (they have a similar line up to last season, so their poor performances have been quite heavily regressed) than promoted teams (a lot of new players and/or the unknown of playing in a new league).
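
    To illustrate what ‘regressed to a prior’ means, here is a minimal sketch of one common way of doing it: a weighted average where the prior counts as a certain number of games of evidence.  This is not my actual rating formula; the function name, the weights and the ratings are all made up for the example.

```python
def regress_to_prior(observed_rating, n_games, prior_rating, prior_weight_games):
    """Weighted average of observed performance and a prior rating.

    prior_weight_games is a hypothetical 'strength' setting: how many games
    of evidence the prior is treated as being worth.
    """
    total_weight = n_games + prior_weight_games
    return (n_games * observed_rating + prior_weight_games * prior_rating) / total_weight

# Illustrative numbers only (ratings in goals per game vs league average)
settled_team = regress_to_prior(observed_rating=-0.8, n_games=4,
                                prior_rating=0.2, prior_weight_games=20)
promoted_team = regress_to_prior(observed_rating=-0.8, n_games=4,
                                 prior_rating=-0.5, prior_weight_games=6)
print(f"settled squad: {settled_team:+.2f}, promoted team: {promoted_team:+.2f}")
```

    With a strong prior, four poor games barely move the rating of a settled squad, while the same four games count for much more when the prior is weak.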

    Notable teams

    Fulham and Leeds stand out as my 2 favoured teams.  I rate Fulham as a really solid team (just slightly above league average) and did through all last season as well.  I’m not sure why I have this slightly controversial view, maybe individually their players don’t rate highly or raw xG numbers over-rate them…

    I rate Leeds right around league average after 4 games (the Arsenal performance was disappointing but their other 3 games have been extremely solid) and I can’t justify a strong negative prior based on their performances last season.  You can be sure I will be investing heavily in Leeds and Fulham in the long-term betting markets.

    Opta and Elevenify are not optimistic about Manchester United’s hopes while Spreadex and I think they can post a respectable total.  I rated them a shade above league average last year which I think will be their ‘floor’ this year for a few reasons (new signings, no Europe etc.).

    On the other side of things, I am very low on Burnley’s performances thus far.  I’m not sure what Opta see in them (they have been rated well by Opta since before the start of the season).  I think there are probably worse investments in the world than the 40%+ ROI available on Burnley getting relegated (very risky though!).  I am also backing them to finish rock bottom at an implied chance of almost 33%.

    I have been low on Chelsea since last season while I’m less confident in being relatively down on Manchester City.  This means I have the league as more of a 2-horse race than others.

    Please subscribe (top left) if you want to be notified about future posts!  I’m planning an analytical piece about some revised game state adjustments and much more.

  • The season may be young but in betting, and life in general, we don’t always have the luxury of waiting for big sample sizes before making decisions.  Let’s look at 2 teams that have captured my attention so far.

    Tottenham Hotspur

    Of all the established teams in the Premier League this season, I think accurately rating Tottenham using team performance metrics is the most difficult.  This is because they spent a large part of last season a) with different players and b) focusing on the Europa League.  This is a case where quantitative player ratings would be somewhat required – an area I have never focused on (and which seems to be the domain of the top-level syndicates).

    Today I want to note just how bad their performance was against Bournemouth. 

    The following graph plots the worst performance of the season against finishing position for the last 3 years (60 team seasons) in the Premier League.

    Figure 1

    The worst performances of each team so far in 2025/26 are shown stuck to the x-axis.  I had to remove some labels as they were on top of each other – it doesn’t matter because looking at a team’s worst performance is not generally the most useful metric.  It does become important when we look at a performance as bad as Spurs’ performance against Bournemouth, just to the right of -2 adjusted xG diff per game (adjusted for fixture, game states, red cards etc.).

    The relationship between ‘worst performance’ and finishing position means Tottenham’s -1.85 adj xG performance against Bournemouth is equivalent to the worst performance for the whole season you’d expect from a team that finishes 12th or 13th.

    “It’s just an outlier”, yes, but I am comparing the performance to data points that are also outliers.  For example, if Tottenham finish 8th this season, they already have a performance that you wouldn’t expect from an 8th-placed team all season.

    They have played 3 games so it could be argued that I don’t need to look at 1 game when I could look at 3.  This is reasonable but I thought it interesting to emphasise just how poor they were against Bournemouth.  After 2 games I was entertaining the idea that they were top 4 contenders and outside shots of the title but I’m quite confident in now saying they are outside shots for top 4 at best.

    Wrexham

    My rating for Wrexham so far is very poor – I have them rated rock bottom of the Championship rankings after 4 games.  Their shot profile is quite skewed – they have poor shot numbers for/against but a very high shot quality.  My ratings are not positive about their chances because after 4 games, big chances (BCs) only receive a fairly low weight in my ratings.  I have calculated the standard deviation of BC difference for one game to be around 2, which is almost 60% of the total BCs in a game.  Shots on the other hand have a standard deviation of around 6 which is around 25-30% of the total number of shots in a game.  Building on this information, I created the following graph:   

    Figure 2

    To create the BC and shot difference lines, I take Wrexham’s shot difference/BC difference and turn it into an xG difference (average shot = 0.11 xG, average big chance = 0.4 xG).  I calculate the 4 game standard deviations by dividing the standard deviation of 1 game by root(n) where n = 4.  The y-axis is the ‘probability density function’ (the total area under each line is 1).
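
    The calculation behind the two lines can be sketched as follows.  The xG conversions (0.11 and 0.4) and the one-game standard deviations (6 shots, 2 BCs) are the figures quoted above; the per-game differences of -6 shots and -0.5 BCs are placeholders and not Wrexham’s real numbers, and I am assuming each line is a normal curve.

```python
import math

XG_PER_SHOT = 0.11        # average shot, from the text
XG_PER_BIG_CHANCE = 0.4   # average big chance, from the text

def xg_diff_distribution(avg_diff_per_game, sd_one_game, xg_per_event, n_games=4):
    """Mean and standard deviation of the per-game xG difference after n games."""
    mean_xg = avg_diff_per_game * xg_per_event
    sd_xg = sd_one_game * xg_per_event / math.sqrt(n_games)
    return mean_xg, sd_xg

def normal_pdf(x, mean, sd):
    """Height of the curve at x (the y-axis of the graph)."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def prob_above_zero(mean, sd):
    """Area under the curve to the right of xG diff = 0."""
    return 0.5 * math.erfc(-mean / (sd * math.sqrt(2)))

# Placeholder differences per game, NOT Wrexham's actual figures
shot_mean, shot_sd = xg_diff_distribution(-6.0, 6.0, XG_PER_SHOT)
bc_mean, bc_sd = xg_diff_distribution(-0.5, 2.0, XG_PER_BIG_CHANCE)

print(f"shots: P(xG diff > 0) = {prob_above_zero(shot_mean, shot_sd):.3f}")
print(f"BCs:   P(xG diff > 0) = {prob_above_zero(bc_mean, bc_sd):.3f}")
```

    The BC curve ends up wider than the shot curve even after converting to xG, because a one-game standard deviation of 2 BCs is worth 0.8 xG while 6 shots is worth only 0.66 xG.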

    This graph shows that if we are looking simply at total shot ratio it is hard for Wrexham to be a league average team (see how little area is underneath the orange line for xG diff > 0) but if we use just big chance difference there is still a reasonable chance their performance levels are anywhere from top of the league to bottom (-1 xG per game or +1 xG per game is a big difference).

    The grey line is my full team rating metric (the sharper peak is because of a lower standard deviation) which you can see is shifted somewhat towards the big chance difference but not a lot.  This still says it’s hard for Wrexham to be a league average team based on their performances so far.

    High value xG shots make a huge difference to xG but they are too random when we are trying to extract maximum predictive information from a small sample.  I don’t have a fully quantitative way of defining this yet, but I hope to create one!