Football Prediction Model: How It Works Explained

You’ve probably seen the football forecasts and wondered how the football prediction model works. Let’s take a look at how raw match data results in the percentage predictions you see here.

Starting with the Basics: Team Strength Ratings

So, how do you predict football matches? You can’t just look at the last few games and call it a day. A team that battered someone 4-0 might have gotten a bit lucky, while a team that lost 1-0 might have played brilliantly but couldn’t find the back of the net. That’s where the GxG (Goals x Expected Goals) and GAxGA (Goals Against x Expected Goals Against) metrics come in. Think of them as a team’s true Strength Rating, blending what actually happened (the goals) with the underlying performance (Expected Goals). Expected Goals is not the only input, either: further advanced metrics feed into the ratings as well.
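To make the blending idea concrete, here is a minimal sketch that mixes actual goals with Expected Goals using a single weight. The function name `blended_rating` and the 0.5 weight are assumptions for illustration only; the real GxG calculation uses more inputs and is not spelled out in this article.

```python
def blended_rating(goals: float, xg: float, xg_weight: float = 0.5) -> float:
    """Blend actual goals with Expected Goals (xG) into one number.

    xg_weight is a hypothetical parameter: 0 means goals only, 1 means xG only.
    """
    if not 0.0 <= xg_weight <= 1.0:
        raise ValueError("xg_weight must be between 0 and 1")
    return (1.0 - xg_weight) * goals + xg_weight * xg


# A 4-0 win built on only 1.2 xG looks less dominant once blended.
lucky_win = blended_rating(goals=4, xg=1.2)
# A team that scored 0 from 2.4 xG still gets credit for the chances it created.
unlucky_loss = blended_rating(goals=0, xg=2.4)
print(lucky_win, unlucky_loss)
```

The point of the blend is that neither number is trusted on its own: goals are what counted on the day, while xG says how repeatable the performance was.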

A single game’s score doesn’t tell the whole story. Winning 2-0 at Anfield means a lot more than beating a struggling team at home by the same score. This is where things get interesting.

The Global League Rankings

Here’s what I haven’t mentioned before: the metrics are not calculated in isolation. Think of European football as one massive, interconnected network. Through competitions like the Champions League, Europa League, and Conference League, we can figure out the relative strength of different leagues. This matters because beating the 10th-placed team in the Premier League is a far bigger achievement than beating the 10th-placed team in the Austrian Bundesliga. As an Austrian myself, I must say this with regret, but unfortunately, it’s true.

The model assigns each league a “strength factor” and then uses it to adjust every single match result. Another helping hand in this process is the data from clubelo.com, which publishes average Elo ratings for almost every major European league. Those averages also feed into the strength factors. Elo ratings are not particularly sophisticated in themselves, but clubelo.com is a great website with data going back almost 100 years and daily updates, so its ratings reliably reflect the differences between leagues.
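One simple way to turn league-average Elo ratings into strength factors is to scale every league against a reference league. The sketch below does exactly that; the ratio formula, the function name `league_strength_factors`, and the Elo numbers are all assumptions for illustration, not the model’s real formula or real clubelo.com figures.

```python
def league_strength_factors(avg_elo: dict[str, float], reference: str) -> dict[str, float]:
    """Scale each league's average Elo against a reference league (factor 1.0)."""
    base = avg_elo[reference]
    return {league: elo / base for league, elo in avg_elo.items()}


# Placeholder averages for illustration only, not real clubelo.com figures.
avg_elo = {"League A": 1800.0, "League B": 1500.0}
factors = league_strength_factors(avg_elo, reference="League A")

# Scale a raw performance (e.g. 2.0 goals scored) by the league's factor,
# so the same scoreline counts for less in the weaker league.
adjusted = 2.0 * factors["League B"]
print(factors, round(adjusted, 2))
```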

Moving Averages

To track a team’s form, we use an exponential moving average (EMA) instead of a simple average. This approach gives more weight to recent games but still considers the team’s full history. For example, Manchester City experienced a tough stretch in the 2024/25 season, which lowered their ratings; however, their strong past performances kept them in the Top 10.
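Here is a minimal sketch of an exponential moving average over per-game performance scores. The smoothing weight `alpha = 0.2` and the sample numbers are assumptions chosen for illustration, not the model’s actual settings.

```python
def ema(values: list[float], alpha: float = 0.2) -> float:
    """Exponential moving average of values, ordered oldest first.

    alpha is the weight given to each new game (an assumed value here).
    """
    if not values:
        raise ValueError("need at least one value")
    avg = values[0]
    for value in values[1:]:
        # The new game pulls the average toward itself by a fraction alpha.
        avg = alpha * value + (1 - alpha) * avg
    return avg


# A strong history followed by a poor recent run.
history = [2.5, 2.5, 2.5, 2.5, 0.5, 0.5]
print(round(ema(history), 2), round(sum(history) / len(history), 2))
```

The two bad games drag the EMA below the simple average because they are the most recent, yet the earlier strong results still hold the rating well above 0.5.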

These calculations are also adjusted based on the strength of each opponent. It’s similar to tennis rankings, where beating a top player means more than beating a lower-ranked one. For football, every match’s home advantage, league strength, importance, fatigue, and many other factors are considered.

Ghost Games and Match Importance

Here’s something quirky: sometimes “ghost games” are included in a team’s record. When a team gets promoted or hasn’t played for a while, the model needs a baseline to work from; otherwise, swings after games will be too big. These ghost games serve as a starting point, typically set at a level slightly below the league average to reflect the uncertainty of a new arrival.
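The damping effect of ghost games can be sketched in a few lines. The number of ghost games (`n_ghost = 5`) and the 10% discount below the league average are assumptions for illustration; the article only says the baseline sits slightly below the league average.

```python
def seed_with_ghost_games(real_results: list[float], league_average: float,
                          n_ghost: int = 5, discount: float = 0.9) -> list[float]:
    """Prepend n_ghost synthetic results set slightly below the league average."""
    baseline = league_average * discount
    return [baseline] * n_ghost + list(real_results)


def average(values: list[float]) -> float:
    return sum(values) / len(values)


# A newly promoted team with a single, unusually good first game.
promoted_team = [4.0]
without_ghosts = average(promoted_team)
with_ghosts = average(seed_with_ghost_games(promoted_team, league_average=1.5))
print(without_ghosts, round(with_ghosts, 2))
```

Without the ghost games, one big win would define the whole rating. With them, the same result only nudges the rating upward from a cautious starting point.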

Something special also happens when a title race is over. Remember those dead rubber games at the end of the season? They aren’t ignored, but their impact on Strength Ratings is reduced. A 4-0 loss when a team is already on the beach doesn’t reveal as much as a 1-0 loss in a title decider.
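One way to reduce the impact of a dead rubber is to scale the size of the rating update by an importance value between 0 and 1. This is a sketch under that assumption; the function `update_rating`, the step size `alpha`, and the importance values are hypothetical.

```python
def update_rating(rating: float, performance: float,
                  importance: float = 1.0, alpha: float = 0.2) -> float:
    """Move the rating toward the match performance, scaled by importance (0 to 1)."""
    if not 0.0 <= importance <= 1.0:
        raise ValueError("importance must be between 0 and 1")
    step = alpha * importance
    return rating + step * (performance - rating)


rating = 2.0
# Same poor performance, but the title decider moves the rating far more.
after_decider = update_rating(rating, performance=0.0, importance=1.0)
after_dead_rubber = update_rating(rating, performance=0.0, importance=0.3)
print(round(after_decider, 2), round(after_dead_rubber, 2))
```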

Simulating Football Seasons

Once all these detailed strength ratings are calculated, the rest of the season is simulated: fifteen thousand times, in fact.

In a Monte Carlo simulation, every remaining fixture is played out using each team’s unique attacking and defensive ratings to generate a realistic scoreline. This is done for the entire remaining schedule to create a single final table, and then repeated 14,999 more times.

Why so many simulations? Because football is wonderfully unpredictable. In one simulation, Arsenal might win all their remaining games. In another, they might drop a lot of points to weaker sides. By running thousands of simulations, it becomes possible to confidently say things like, “Arsenal has a 73% chance of finishing in the top 4.”
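The whole loop fits in a short sketch. It assumes goals follow a Poisson distribution whose mean is the scoring team’s attack rating times the opponent’s defence rating (lower defence is better), ignores home advantage, and breaks ties on points at random. The teams, ratings, and function names are made up for illustration and are not the model’s actual method.

```python
import math
import random
from collections import defaultdict


def poisson(rng: random.Random, lam: float) -> int:
    """Draw from a Poisson distribution (Knuth's method; fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_season(teams, fixtures, points, n_sims=15000, top_n=1, seed=0):
    """Return each team's share of simulations that end in the top_n places."""
    rng = random.Random(seed)
    hits = defaultdict(int)
    for _ in range(n_sims):
        table = dict(points)  # start every run from the current standings
        for home, away in fixtures:
            home_goals = poisson(rng, teams[home]["attack"] * teams[away]["defence"])
            away_goals = poisson(rng, teams[away]["attack"] * teams[home]["defence"])
            if home_goals > away_goals:
                table[home] += 3
            elif away_goals > home_goals:
                table[away] += 3
            else:
                table[home] += 1
                table[away] += 1
        # Simplification: teams level on points are ordered randomly.
        ranked = sorted(table, key=lambda t: (table[t], rng.random()), reverse=True)
        for team in ranked[:top_n]:
            hits[team] += 1
    return {team: hits[team] / n_sims for team in points}


# Hypothetical three-team league with every home/away pairing still to play.
teams = {
    "A": {"attack": 2.0, "defence": 0.8},
    "B": {"attack": 1.4, "defence": 1.0},
    "C": {"attack": 0.8, "defence": 1.4},
}
fixtures = [(h, a) for h in teams for a in teams if h != a]
points = {"A": 0, "B": 0, "C": 0}
probs = simulate_season(teams, fixtures, points)
print(probs)
```

Each value in `probs` is simply the fraction of simulated seasons in which that team finished first, which is exactly how a statement like “73% chance of finishing in the top 4” is produced (with `top_n=4` on a full league).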

The Final Polish

Now, a magician never reveals all their tricks, but special adjustments are also baked in that really set this model apart. There are different tiebreaking rules for various leagues (head-to-head vs. goal difference), special handling for split-season formats such as the Austrian Bundesliga or Scottish Premiership, and proprietary adjustments to how international games are weighted.
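League-specific tiebreakers can be handled by switching the sort key per league. The sketch below is a simplified illustration: the `rank` function is hypothetical, the head-to-head points are assumed to be precomputed for the teams that are level, and real competition rules have more steps than this.

```python
def rank(table: dict[str, tuple[int, int]], h2h_points: dict[str, int], rule: str) -> list[str]:
    """Order teams by points, then by the league's tiebreaker.

    table maps team -> (points, goal_difference).
    h2h_points maps team -> points won against the teams it is level with.
    """
    if rule == "goal_difference":
        key = lambda team: (table[team][0], table[team][1])
    elif rule == "head_to_head":
        key = lambda team: (table[team][0], h2h_points.get(team, 0), table[team][1])
    else:
        raise ValueError(f"unknown tiebreak rule: {rule}")
    return sorted(table, key=key, reverse=True)


# Two hypothetical teams level on 60 points.
table = {"X": (60, 20), "Y": (60, 25)}
h2h_points = {"X": 4, "Y": 1}

print(rank(table, h2h_points, "goal_difference"))  # Y has the better goal difference
print(rank(table, h2h_points, "head_to_head"))     # X won the head-to-head meetings
```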

The beauty of this approach lies in its combination of sophistication and intuitiveness. When the model indicates that Liverpool has a 65% chance of winning, it considers their recent form, their opponent’s struggles, the historical home advantage, the relative strength of the league, and various other data points.

But here’s the kicker: no prediction is ever 100% certain, and no model is perfect. That’s why the focus is on probabilities, not certainties. Football would be boring otherwise. The aim is simply to be right more often than wrong and to quantify uncertainty honestly.