Evangelos Matselis 3508326 Sports Analytics
PREDICTION MODEL FOR GERMAN BUNDESLIGA
6/25/2015 Building an ELO model suitable for gaming
This work focuses on the ELO rating system, its background and utility, and the improvements
made to it for predicting match outcomes in the German Soccer League.
PREDICTION MODEL FOR GERMAN BUNDESLIGA
Sports Analytics
Page 1
PREDICTION MODEL FOR GERMAN BUNDESLIGA
BUILDING AN ELO MODEL SUITABLE FOR GAMING
Introduction
The ELO rating system was developed by the Hungarian-American Arpad Elo and was originally used to
rate chess players. In recent years, however, the model has been modified and applied to team sports
as well, such as soccer, basketball, baseball and hockey.
Dyte and Clarke (2001) were among the first to use ratings to predict match outcomes, but it was not until
2009 that ELO proved to be the best of many competing models for predicting match outcomes.
In this work, we build an ELO model for the German Soccer League to predict the outcome of Bundesliga
matches in four consecutive seasons.
The basic model
As mentioned above, ELO was originally a rating system for chess players. The original model rests on two
main assumptions. First, Elo assumed that the performance of each player is a normally distributed variable.
Second, since performance cannot be measured directly, a player is assumed to have performed at a higher
level if he wins and at a lower level if he loses.
The ELO model for World Cup soccer introduced by Bedford and da Costa (2004), on which this study is
based, uses the logistic distribution instead. The formulas are as follows:
\[ \mathrm{Rating}_A(t) = \mathrm{Rating}_A(t-1) + W \cdot (\mathrm{Observed} - \mathrm{Expected}) \]
where
$\mathrm{Rating}_A(t)$ is the new rating of team A
$\mathrm{Rating}_A(t-1)$ is the previous rating of team A
\[ W = \begin{cases} 40, & \text{if goal difference} = 0, 1 \\ 60, & \text{if goal difference} = 2 \\ 40 + 40\left(\dfrac{3 + gd}{8}\right), & \text{if goal difference } gd \geq 3 \end{cases} \]
\[ \mathrm{Observed} = \begin{cases} 1, & \text{if team A wins} \\ 0.5, & \text{if draw} \\ 0, & \text{if team A loses} \end{cases} \]
\[ \mathrm{Expected} = \frac{1}{10^{\left(-\mathrm{diff}/400\right)} + 1} \]
where diff is calculated by subtracting team B's rating before the game from team A's rating before
the game and adding a home advantage, which is positive if team A plays at home and negative if team A
plays away. The home advantage was set to 100 for World Cup games, and this value was adopted by many
other ELO models developed later.
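The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and diff is taken as team A's rating minus team B's rating plus a signed home advantage, the convention under which a higher-rated team A gets an expected score above 0.5.

```python
def expected_score(rating_a, rating_b, home_advantage=0.0):
    """Logistic expected score of team A.

    home_advantage is signed from team A's point of view:
    positive if A plays at home, negative if A plays away (assumed convention).
    """
    diff = rating_a - rating_b + home_advantage
    return 1.0 / (10.0 ** (-diff / 400.0) + 1.0)


def weight(goal_difference):
    """Weight W as a function of the absolute goal difference."""
    gd = abs(goal_difference)
    if gd <= 1:
        return 40.0
    if gd == 2:
        return 60.0
    return 40.0 + 40.0 * (3 + gd) / 8.0


def observed_score(goals_a, goals_b):
    """1 for a win of team A, 0.5 for a draw, 0 for a loss."""
    if goals_a > goals_b:
        return 1.0
    if goals_a == goals_b:
        return 0.5
    return 0.0


def update_rating(rating_a, rating_b, goals_a, goals_b, home_advantage=0.0):
    """New rating of team A after one game."""
    expected = expected_score(rating_a, rating_b, home_advantage)
    observed = observed_score(goals_a, goals_b)
    return rating_a + weight(goals_a - goals_b) * (observed - expected)


# Two equally rated teams on neutral ground, A wins 1-0:
print(update_rating(1500.0, 1500.0, 1, 0))  # 1520.0
```

With equal ratings and no home advantage the expected score is 0.5, so a one-goal win moves team A by W × 0.5 = 20 points.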
Building the model for predicting Bundesliga game outcomes
Bedford and da Costa's model was the starting point for the improved model for predicting game outcomes
in the Bundesliga. The formulas are largely the same. The new update formula is the following:
\[ \mathrm{Rating}_A(t) = \mathrm{Rating}_A(t-1) + W \cdot F \cdot (\mathrm{Observed} - \mathrm{Expected}) \]
The new factor is $F = \sqrt{|\text{Goal Difference}|}$, which rewards the winning team with extra rating
points if it wins by more than one goal. In the new model, W is constant and set to 30, due to the high
level of difficulty in the German League. Finally, as the study starts in season 2010/2011, the starting
rating of every team is set to 1500.
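A minimal sketch of this modified update, under stated assumptions: the function names are hypothetical, both teams are updated symmetrically (the text only gives the formula for team A), and the home advantage is passed in as a parameter. Taken literally, F = 0 when a game ends in a draw, so the formula leaves both ratings unchanged in that case; the text does not say whether draws were handled differently.

```python
import math

W = 30.0  # constant weight stated in the text


def expected_score(rating_a, rating_b, home_advantage=0.0):
    """Logistic expected score of team A (diff = A - B + home advantage)."""
    diff = rating_a - rating_b + home_advantage
    return 1.0 / (10.0 ** (-diff / 400.0) + 1.0)


def update_pair(home, away, home_goals, away_goals, home_advantage):
    """Return the new (home, away) ratings after one game."""
    goal_diff = home_goals - away_goals
    f = math.sqrt(abs(goal_diff))  # F = sqrt(|goal difference|)
    if goal_diff > 0:
        observed_home = 1.0
    elif goal_diff == 0:
        observed_home = 0.5
    else:
        observed_home = 0.0
    expected_home = expected_score(home, away, home_advantage)
    delta = W * f * (observed_home - expected_home)
    # The away side's (observed - expected) is the negative of the home side's,
    # so the update is zero-sum.
    return home + delta, away - delta


# Equal ratings, no home advantage, home side wins 2-0:
new_home, new_away = update_pair(1500.0, 1500.0, 2, 0, home_advantage=0.0)
print(round(new_home, 2), round(new_away, 2))
```

Because the away side's change mirrors the home side's, the sum of the two ratings is preserved by every update.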
Home Advantage
Calculating the home advantage for Bundesliga teams proved to be challenging. The value of 100 used in
previous studies seemed too large: the German League is highly competitive, and although home teams
do have an advantage in getting the 3 points, it is not that high. A way therefore had to be found to
calculate a home advantage specific to each competition.
In 1995, the Bundesliga changed its point system and began to award 3 points to the winner instead of 2,
while both teams continued to share 2 points when a game ended in a draw and the loser was not awarded
any points. Since 1995, 5814 games have been played: 2709 (46.6%) ended in a home win, 1485 (25.5%)
ended in a draw and 1620 (27.9%) ended in an away win.
Using the unrounded proportions, the points awarded per game are 2 × 0.2554 + 3 × (0.4659 + 0.2786) ≈ 2.74.
The points awarded per game to home teams are 3 × 0.4659 + 0.2554 ≈ 1.65, while the points awarded per
game to away teams are 3 × 0.2786 + 0.2554 ≈ 1.09.
Dividing by the total, the expected share of points is 1.65 / 2.74 ≈ 0.602 for home teams and 0.398 for
away teams. In other words, home teams are expected to win about 60% of the points awarded on average.
Setting Expected = 0.602 and solving the Expected formula for the rating difference
$\mathrm{Rating}_A(t-1) - \mathrm{Rating}_B(t-1)$ that gives the same expected score, we find
$400 \log_{10}(0.602 / 0.398) \approx 72$, which was the value used for the home advantage (the rounded
share of 0.60 would give about 70).
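The calculation can be reproduced from the raw result counts, which avoids the rounding of the percentages. The sketch below inverts the Expected formula, diff = 400 · log10(E / (1 − E)); the function name is hypothetical and the code is only an illustration of the arithmetic in this section.

```python
import math


def home_advantage_from_results(home_wins, draws, away_wins):
    """Return (home share of points, equivalent rating difference)."""
    home_points = 3 * home_wins + draws   # 3 per win, 1 per draw
    away_points = 3 * away_wins + draws
    share = home_points / (home_points + away_points)
    # Invert Expected = 1 / (10 ** (-diff / 400) + 1) for diff.
    advantage = 400.0 * math.log10(share / (1.0 - share))
    return share, advantage


# Bundesliga results since 1995, as given in the text.
share, advantage = home_advantage_from_results(2709, 1485, 1620)
print(f"home share of points: {share:.3f}")
print(f"home advantage in rating points: {advantage:.0f}")
```

Working from the counts gives a home share of roughly 0.602 and a rating difference that rounds to 72, matching the value used in the model.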
Season-to-season carryover
The study started by predicting match outcomes for season 2010/2011, with the rating of every team set
to 1500 at the start of the season. However, teams change every year: some become stronger, some become
weaker, transfers take place and form is lost during the season break. Carrying over the previous season's
final ratings unchanged would therefore probably lead to false conclusions and misinterpretations.
The solution is based on the MARS rating system for the AFL. In MARS, each team's rating is pulled back
towards the starting rating (1500 in our case) by half of the difference between its rating at the end
of the season and 1500.
New Season Rating = 1500 + (Previous Season Final Rating - 1500)/2
Every new season, some teams are promoted from the 2. Bundesliga and some are relegated from the
Bundesliga. The starting rating for each promoted team is calculated as follows:
Starting Rating = 1500 + (Average Final Rating of relegated teams - 1500)/2
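The two carryover rules can be sketched as follows. The function names, the dictionary layout and the team labels are hypothetical and only serve the illustration; the arithmetic follows the two formulas above.

```python
BASE_RATING = 1500.0


def regress_to_base(final_rating, base=BASE_RATING):
    """Pull a rating halfway back towards the base rating."""
    return base + (final_rating - base) / 2.0


def new_season_ratings(final_ratings, relegated, promoted, base=BASE_RATING):
    """Build the starting ratings for the next season.

    final_ratings: dict mapping team -> final rating of the previous season
    relegated:     teams leaving the league (must be keys of final_ratings)
    promoted:      teams entering the league
    """
    average_relegated = sum(final_ratings[t] for t in relegated) / len(relegated)
    promoted_start = regress_to_base(average_relegated, base)
    ratings = {
        team: regress_to_base(rating, base)
        for team, rating in final_ratings.items()
        if team not in relegated
    }
    for team in promoted:
        ratings[team] = promoted_start
    return ratings


# Made-up ratings for four placeholder teams:
previous = {"A": 1700.0, "B": 1500.0, "C": 1400.0, "D": 1300.0}
print(new_season_ratings(previous, relegated=["C", "D"], promoted=["E", "F"]))
# {'A': 1600.0, 'B': 1500.0, 'E': 1425.0, 'F': 1425.0}
```

In this example the relegated teams average 1350, so each promoted team starts at 1500 + (1350 - 1500)/2 = 1425.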
The "Draw" Problem
Although ELO is one of the best predictors of match results, it has great difficulty predicting a draw. For
the model to predict a draw, the expected probability for both teams must be equal. In terms of our model,
that means the away team's rating must be exactly 72 points higher than the home team's. Such a
coincidence is almost impossible. However, recent statistics show that the draw is a common result in
most competitions, as more than 15% of games end without a winner.
The first approach to this problem was to define a range in which the model would predict a draw. For
example, if the expected winning probability of each team was between 0.45 and 0.55, the model would
predict a draw. This method did not seem mathematically sound: a team with a 55% chance of winning has
a clear advantage over the opposition, since the remaining 45% is split between the draw and the
opponent's win.
Since the model was being built for gaming purposes, it was decided to take advantage of "Draw No Bet".
Draw No Bet (or Asian Handicap 0) is a special bet offered by the majority of bookmakers worldwide. The
odds on a Draw No Bet game are lower than the regular odds for home and away wins, but if the game
ends in a draw, the bet is void and the bettor gets the stake back. Thus, whenever a game ended in a
draw, the result did not count towards the correct prediction percentage, as shown in Figures 1, 2 and 3.
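A minimal sketch of how a correct prediction percentage could be computed under the Draw No Bet rule. The function names and the input layout are hypothetical, and the tie-break for an expected score of exactly 0.5 (predicting the away team) is an assumption, since the text does not specify it.

```python
def predict_winner(expected_home):
    """Pick the side with the higher expected score (ties go to 'away' by assumption)."""
    return "home" if expected_home > 0.5 else "away"


def draw_no_bet_accuracy(matches):
    """Share of correct predictions, ignoring drawn games.

    matches: iterable of (expected_home, home_goals, away_goals)
    Returns None if every game was a draw.
    """
    correct = 0
    counted = 0
    for expected_home, home_goals, away_goals in matches:
        if home_goals == away_goals:
            continue  # bet is void, the game does not count
        actual = "home" if home_goals > away_goals else "away"
        counted += 1
        if predict_winner(expected_home) == actual:
            correct += 1
    return correct / counted if counted else None


# Made-up round of four games: two correct picks, one draw, one miss.
sample_round = [(0.62, 2, 0), (0.55, 1, 1), (0.41, 0, 1), (0.70, 0, 2)]
print(draw_no_bet_accuracy(sample_round))
```

Here the drawn game is skipped, so the percentage is based on three games, of which two are predicted correctly.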
Figure 1: Bundesliga Round 1 Season 2011/2012
Figure 2: Bundesliga Round 1 Season 2012/2013
Figure 3: Bundesliga Round 1 Season 2013/2014
Results and Conclusion
The improved ELO model was successfully applied to the German Bundesliga for the seasons 2010/2011 to
2013/2014. The overall share of correct predictions was over 60% in the first season (Figure 4), when all
teams started at a rating of 1500, and exceeded 70% in 2011/2012 and 2013/2014 (Figures 5 and 7).
Figure 4: Bundesliga Round 34 Season 2010/2011
Figure 5: Bundesliga Round 34 Season 2011/2012
Figure 6: Bundesliga Round 34 Season 2012/2013
Figure 7: Bundesliga Round 34 Season 2013/2014
Furthermore, the model correctly predicted more than 80% of the match outcomes in 42 rounds across the
four seasons (always excluding games that ended in a draw), and in 12 of those rounds the prediction
percentage was 100%.
Finally, although the model might be expected to mispredict the majority of game outcomes at the
beginning of each season, due to the teams' lack of form and other factors, it managed to achieve more
than 60% correct predictions in three of the four examined seasons. The worst results usually came in the
middle of each season, when the second half of the season starts. A possible explanation is the long
winter break in the Bundesliga, due to heavy snowfall and cold weather.
In later studies, the W value could be modified when a very highly rated team plays a low rated team.
This would give extra rating points to lower rated teams when they win against very strong opposition,
and fewer points to the strongest teams when they win against a weaker team.
References
1. Bedford, A. and Da Costa, C., 2004, 'A ratings based analysis of Oceania's road to the world cup', in
Proceedings of the Seventh Australasian Conference on Mathematics and Computers in Sport, R. H.
Morton and S. Ganesalingam (eds.), Massey University, Palmerston North, NZ.
2. Dyte, D. and Clarke, S.R., 2001, 'A ratings based poisson model for World Cup Soccer simulation',
JORS, 5, 993-998.
3. Wikipedia, (2015). Elo rating system. [online] Available at:
https://en.wikipedia.org/wiki/Elo_rating_system [Accessed 24 Jun. 2015].
4. Schiefler, L. (2015). Football Club Elo Ratings. [online] Clubelo.com. Available at: http://clubelo.com/
[Accessed 24 Jun. 2015].
