How Machine Learning and GPT 4.0 Chat Predicted the Euro 2024 Champion and Beat the Bookmaker

Two months have passed since the end of the European Football Championship, which means it’s time to take stock.

A brief preamble.
Before the European Championship started, I wanted to test how accurate machine learning can be in forecasting something as unpredictable as football, and at the same time find out whether it is possible to beat the bookmaker with a couple of lines of code.

A dataset of 3,000 relevant matches was analyzed; nothing else (ratings, current form, etc.) was taken into account.

Conditions with the “bookmaker”: 51 matches were played in the tournament, and a notional $100 bet was placed on each match according to the model's forecast. On top of that, two $100 bets were placed on the champion: one before the tournament and one after the group stage. Both times Spain was picked as the winner.

The virtual bankroll was $5,300; at the end we'll see how it changed.
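
The starting bankroll follows directly from the staking rules above. A minimal sketch in R; the variable names are illustrative and do not come from the original code:

```r
# Staking plan as described in the text; variable names are illustrative.
n_matches  <- 51   # matches played at Euro 2024
n_champion <- 2    # outright bets on the champion
stake      <- 100  # flat stake per bet, in dollars

bankroll <- (n_matches + n_champion) * stake
cat("Starting bankroll: $", bankroll, "\n", sep = "")
```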

So, with the help of the GPT 4.0 chat, I tried several versions of a basic random forest algorithm in Python and R. The results were better in R, so I used it as the basis. The forecasting code is described in previous articles; here I only summarize the results.

# Results ("+" = correct, "-" = wrong) and odds for the 36 group-stage matches
results <- c("+", "-", "+", "+", "+", "+", "-", "+", "-", "+", "+", "+", "-", "+", "-", "-", "+", "+", "-", "-", "-", "-", "+", "-", "-", "-", "+", "+", "-", "-", "-", "-", "-", "-", "-", "-")
odds <- c(1.30, 4.05, 1.93, 1.38, 1.52, 3.60, 4.40, 4.80, 1.50, 1.55, 1.62, 1.50, 1.46, 1.27, 4.55, 5.62, 3.48, 2.20, 3.35, 4.15, 2.50, 2.00, 1.58, 3.25, 1.17, 3.10, 1.40, 3.10, 1.67, 1.30, 1.20, 1.80, 2.50, 3.80, 1.30, 2.20)

# Split the matches into the three group-stage rounds
first_tour <- 1:12
second_tour <- 13:24
third_tour <- 25:36

# Compute accuracy and profit for a set of matches (flat $100 stake per match)
calculate_tour_stats <- function(matches, results, odds) {
  # Accuracy: share of correct predictions
  correct_preds <- sum(results[matches] == "+")
  total_preds <- length(matches)
  accuracy <- correct_preds / total_preds

  # Profit: payouts of the winning bets minus the total amount staked
  correct_odds <- odds[matches][results[matches] == "+"]
  total_profit <- sum(correct_odds * 100) - total_preds * 100

  return(list(accuracy = accuracy, total_profit = total_profit))
}

# Evaluate each round
first_tour_eval <- calculate_tour_stats(first_tour, results, odds)
second_tour_eval <- calculate_tour_stats(second_tour, results, odds)
third_tour_eval <- calculate_tour_stats(third_tour, results, odds)

# Print the results
cat("First Tour - Accuracy:", first_tour_eval$accuracy * 100, "%\n")
cat("First Tour - Profit:", first_tour_eval$total_profit, "units\n\n")

cat("Second Tour - Accuracy:", second_tour_eval$accuracy * 100, "%\n")
cat("Second Tour - Profit:", second_tour_eval$total_profit, "units\n\n")

cat("Third Tour - Accuracy:", third_tour_eval$accuracy * 100, "%\n")
cat("Third Tour - Profit:", third_tour_eval$total_profit, "units\n\n")

The first round went well. The profit came from two winning bets at long odds (Slovenia vs Denmark -> Prediction: Draw 3.60 and Romania vs Ukraine -> Prediction: Home Win 4.80) and a high hit rate overall.

First Tour – Accuracy: 75%

First Tour – Profit: 720 units

The second round was worse. But if you calculate the forecast accuracy over the first and second rounds together, you get 54.17% (13 of 24), while the model had estimated its own accuracy at 57.65%. In other words, the model's assessment of its forecast was quite realistic.
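
The combined figure can be checked in a few lines. This is a small sketch, assuming 9 and 4 correct predictions out of 12 in each round (which is what the 75% and 33.33% accuracies imply); the variable names are my own:

```r
# Correct predictions and matches played per round (implied by the reported accuracies).
correct <- c(round1 = 9, round2 = 4)
played  <- c(round1 = 12, round2 = 12)

combined_accuracy <- sum(correct) / sum(played)
model_estimate    <- 0.5765  # the model's own pre-match accuracy estimate, from the text

cat(sprintf("Observed: %.2f%%, model estimate: %.2f%%, gap: %.2f pp\n",
            100 * combined_accuracy, 100 * model_estimate,
            100 * (model_estimate - combined_accuracy)))
```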

Second Tour – Accuracy: 33.33333%

Second Tour – Profit: -347 units

The third round was a complete failure, and this, in my opinion, is the ideal moment to discuss how to evaluate machine learning results in real-world work.

Third Tour – Accuracy: 16.66667%

Third Tour – Profit: -750 units

When doing analytics, I always try to break the results down into smaller pieces. For example, a customer asks for analytics by month, and we get an average result of 1.02. But looking at the weekly statistics, we see a recurring pattern: 1.05 in the first week, 1.07 in the second, 1.01 in the third and 0.9 in the last. In this example, you can draw the customer's attention to the uneven distribution of the advertising budget, and so on. It is just as important to look at a model's results up close: it is quite possible that at the moment it is learning the wrong thing, or that external factors are interfering with a good forecast.

So in the 3rd round we see Georgia beating Portugal 2-0 and many draws for the favorites. It is hard to imagine the results being the same if these matches had decided who advances to the next stage.
For the future, we can conclude that it is better to skip the 3rd round, or to back the underdogs in matches against teams that have already qualified from the group.
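
The same point can be made with this tournament's own numbers: the group-stage aggregate hides how different the three rounds were. A sketch using the per-round figures reported above; the data frame layout is my own, and the counts of correct predictions are those implied by the reported accuracies:

```r
# Per-round figures as reported above; the data frame layout is illustrative.
rounds <- data.frame(
  round   = c("Round 1", "Round 2", "Round 3"),
  correct = c(9, 4, 2),
  played  = c(12, 12, 12),
  profit  = c(720, -347, -750)
)
rounds$accuracy <- round(100 * rounds$correct / rounds$played, 2)
print(rounds)

# The aggregate alone would hide the round-to-round swing.
group_accuracy <- 100 * sum(rounds$correct) / sum(rounds$played)
group_profit   <- sum(rounds$profit)
cat("Group stage accuracy:", round(group_accuracy, 2), "%\n")
cat("Group stage profit:", group_profit, "units\n")
```

Looking only at the last two lines of output, one would not guess that the first round was clearly profitable.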

Let's move on.

# Results and odds for the playoff matches (37-51)
results <- c("-", "-", "+", "+", "+", "+", "+", "+", "+", "+", "-", "+", "+", "+", "+")
odds <- c(1.73, 3.75, 1.17, 1.07, 1.40, 1.16, 1.18, 2.80, 1.72, 1.60, 3.50, 1.25, 1.85, 2.00, 1.90)

# Accuracy and profit, guarding against an empty set of matches
calculate_tour_stats <- function(matches, results, odds) {
    # No matches: nothing to evaluate
    if (length(matches) == 0) {
        return(list(accuracy = NA, total_profit = NA))
    }

    # Accuracy: share of correct predictions
    correct_preds <- sum(results[matches] == "+")
    total_preds <- length(matches)
    accuracy <- correct_preds / total_preds

    # Odds of the winning bets. If no bet won, sum() over the empty vector
    # is 0, so the profit is simply minus the total amount staked.
    correct_odds <- odds[matches][results[matches] == "+"]
    total_profit <- sum(correct_odds * 100) - total_preds * 100

    return(list(accuracy = accuracy, total_profit = total_profit))
}

# Evaluate all 15 playoff matches (37-51)
matches <- 1:15
playoff_eval <- calculate_tour_stats(matches, results, odds)

# Print the results
if (!is.na(playoff_eval$accuracy)) {
    cat("Playoff - Accuracy:", playoff_eval$accuracy * 100, "%\n")
} else {
    cat("Playoff - Accuracy: NA\n")
}

if (!is.na(playoff_eval$total_profit)) {
    cat("Playoff - Profit:", playoff_eval$total_profit, "units\n\n")
} else {
    cat("Playoff - Profit: NA units\n\n")
}

The round of 16 went quite well: 75% of the predictions were correct. But since we are now betting on which team advances, the odds have dropped sharply: it is effectively a two-outcome market with clear favorites.

The quarter-finals also came in at 75%.

Then came the semi-finals and the final, where all the predictions came true.
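
The stage-by-stage figures can be reproduced from the playoff results listed in this article. A short sketch; the `stage` labels and variable names are my own additions:

```r
# Playoff outcomes in match order (37-51), as listed in this article.
playoff_results <- c("-", "-", "+", "+", "+", "+", "+", "+",  # round of 16
                     "+", "+", "-", "+",                       # quarter-finals
                     "+", "+", "+")                            # semi-finals and final
stage <- rep(c("Round of 16", "Quarter-finals", "Semi-finals and final"),
             times = c(8, 4, 3))

# Share of correct predictions per stage, in percent.
stage_accuracy <- tapply(playoff_results == "+", stage, mean) * 100
print(stage_accuracy)
```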

Playoff – Accuracy: 80%

Playoff – Profit: 410 units

So the prediction accuracy in the playoffs was 80%, while before the matches the model had estimated its accuracy at 72.43%. Again we see the model giving a realistic assessment of its own capabilities.

Based on a modest sample of 39 matches (I am not counting the 3rd round), we can tentatively conclude that the model estimates the probability of a successful prediction correctly. Naturally, to fully confirm this hypothesis we would need a sample of at least 500 matches.
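
To see why 39 matches is a modest sample, one can look at the confidence interval around the observed hit rate. The sketch below uses base R's `binom.test` (an exact binomial interval). It is my own illustration and not part of the original analysis, and the 500-match case is a hypothetical sample with roughly the same hit rate:

```r
# 25 correct predictions out of 39 (rounds 1-2: 13 of 24, playoffs: 12 of 15).
ci_39 <- binom.test(x = 25, n = 39)$conf.int

# Hypothetical larger sample with roughly the same hit rate (an assumption).
ci_500 <- binom.test(x = 320, n = 500)$conf.int

cat(sprintf("n = 39:  95%% CI %.1f%% to %.1f%%\n", 100 * ci_39[1], 100 * ci_39[2]))
cat(sprintf("n = 500: 95%% CI %.1f%% to %.1f%%\n", 100 * ci_500[1], 100 * ci_500[2]))
```

The interval for 39 matches is noticeably wider than the one for 500, which is why a close match between estimated and observed accuracy on this sample is encouraging but not conclusive.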

At the end of the tournament, our profit on the match bets was a modest $33, while a one-month bank deposit of the same amount would have brought in about $70 with much less risk.

But thanks to the correct prediction of the champion at odds of 9.00 and 5.50, our virtual bankroll grew to $6,583.
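
The final figure can be traced back to the numbers reported above. A minimal sketch of the arithmetic, with illustrative variable names:

```r
# Figures from the article; variable names are illustrative.
stake          <- 100
start_bankroll <- 5300
match_profit   <- 720 - 347 - 750 + 410   # three group rounds + playoffs
champion_odds  <- c(9.00, 5.50)           # Spain: before the tournament, after the groups

# Net profit of a winning bet = stake * (odds - 1).
champion_profit <- sum(stake * (champion_odds - 1))
final_bankroll  <- start_bankroll + match_profit + champion_profit

cat("Match bets:", match_profit, "| Champion bets:", champion_profit,
    "| Final bankroll:", final_bankroll, "\n")
```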


In 2026, we will test the accuracy of our forecasts at the World Cup.

All matches

+ 1. Germany vs Scotland -> Prediction: Home Win 1.30
- 2. Hungary vs Switzerland -> Prediction: Home Win 4.05
+ 3. Spain vs Croatia -> Prediction: Home Win 1.93
+ 4. Italy vs Albania -> Prediction: Home Win 1.38
+ 5. Poland vs Netherlands -> Prediction: Away Win 1.52
+ 6. Slovenia vs Denmark -> Prediction: Draw 3.60
- 7. Serbia vs England -> Prediction: Draw 4.40
+ 8. Romania vs Ukraine -> Prediction: Home Win 4.80
- 9. Belgium vs Slovakia -> Prediction: Home Win 1.50
+ 10. Austria vs France -> Prediction: Away Win 1.55
+ 11. Turkey vs Georgia -> Prediction: Home Win 1.62
+ 12. Portugal vs Czech Republic -> Prediction: Home Win 1.50
- 13. Croatia vs Albania -> Prediction: Home Win 1.46
+ 14. Germany vs Hungary -> Prediction: Home Win 1.27
- 15. Scotland vs Switzerland -> Prediction: Home Win 4.55
- 16. Slovenia vs Serbia -> Prediction: Home Win 5.62
+ 17. Denmark vs England -> Prediction: Draw 3.48
+ 18. Spain vs Italy -> Prediction: Home Win 2.20
- 19. Slovakia vs Ukraine -> Prediction: Home Win 3.35
- 20. Poland vs Austria -> Prediction: Home Win 4.15
- 21. Netherlands vs France -> Prediction: Away Win 2.50
- 22. Georgia vs Czech Republic -> Prediction: Away Win 2.00
+ 23. Turkey vs Portugal -> Prediction: Away Win 1.58
- 24. Belgium vs Romania -> Prediction: Draw 3.25
- 25. Switzerland vs Germany -> Prediction: Away Win 1.17
- 26. Scotland vs Hungary -> Prediction: Home Win 3.10
+ 27. Albania vs Spain -> Prediction: Away Win 1.40
+ 28. Croatia vs Italy -> Prediction: Draw 3.10
- 29. Netherlands vs Austria -> Prediction: Home Win 1.67
- 30. France vs Poland -> Prediction: Home Win 1.30
- 31. England vs Slovenia -> Prediction: Home Win 1.20
- 32. Denmark vs Serbia -> Prediction: Home Win 1.80
- 33. Slovakia vs Romania -> Prediction: Away Win 2.50
- 34. Ukraine vs Belgium -> Prediction: Home Win 3.80
- 35. Georgia vs Portugal -> Prediction: Away Win 1.30
- 36. Czech Republic vs Turkey -> Prediction: Home Win 2.20
- 37. Switzerland vs Italy -> Prediction: Away Win 1.73
- 38. Germany vs Denmark -> Prediction: Away Win 3.75
+ 39. England vs Slovakia -> Prediction: Home Win 1.17
+ 40. Spain vs Georgia -> Prediction: Home Win 1.07
+ 41. France vs Belgium -> Prediction: Home Win 1.40
+ 42. Portugal vs Slovenia -> Prediction: Home Win 1.16
+ 43. Romania vs Netherlands -> Prediction: Away Win 1.18
+ 44. Austria vs Turkey -> Prediction: Away Win 2.80
+ 45. Spain vs Germany -> Prediction: Home Win 1.72
+ 46. Portugal vs France -> Prediction: Away Win 1.60
- 47. Netherlands vs Turkey -> Prediction: Away Win 3.50
+ 48. England vs Switzerland -> Prediction: Home Win 1.25
+ 49. Spain vs France -> Prediction: Home Win 1.85
+ 50. Netherlands vs England -> Prediction: Away Win 2.00
+ 51. Spain vs England -> Prediction: Home Win 1.90
