Predicting Football Match Results with Data
Mining Techniques
O. I. Aladesote, O. Agbelusi & M. Ganiyu
Abstract- Data mining techniques are effective and useful for forecasting in many domains. In this
research, Spanish La Liga football match outcomes are predicted using five data mining techniques
(Multilayer Perceptron, Decision Tables, Random Forest, Reptree and Meta Bagging) to determine the most accurate
among them. The experiments, carried out with Weka 3.9, show that all the techniques performed well in
terms of accuracy, but Multilayer Perceptron was the most successful, with an average accuracy of 100%.
I. INTRODUCTION
Football is a fast-growing sport that is becoming one of the most viewed and richest sports. The drive to be
more than just a spectator has led to this research into predicting the final outcome of a match, which would
also make sports betting easier. One of the reasons football is the most popular sport on the
planet is its unpredictability.
Every day, fans around the world argue over which team is going to win the next game or the next competition.
Many of these fans also put their money where their mouths are, by betting large sums on their predictions. Due to
the large number of factors that can affect the result of a football match, it is incredibly difficult to predict
the outcome correctly. With the growing amount of money invested in sports betting markets, it is
important to verify how far data mining techniques can bring value to this area [9].
To solve this problem we propose building data-driven solutions designed through a data mining process. Data
mining is an aspect of computing that is used to extract hidden information and to automate the detection
of relevant patterns in a database. The data mining process allows us to build models that give predictions
according to the data fed into the system. The study is aimed at using data mining techniques for the
prediction of football match results. Every sport has particular rules, a number of players and different styles, that is, a set
of different features. For a beginner, building a predictive model from scratch with a considerable dataset could
be somewhat challenging. Finally, at the end of this research, individuals, especially football fans, should be able to
predict match results based on the identified factors.
We summarize the contributions of this paper as follows:
• Forecasting of La Liga football match outcomes using data from five previous seasons
• Comparative analysis to determine the most accurate technique.
The remainder of the paper is organized as follows: Section 2 presents the literature review. Section 3 presents the
method used to generate the results. The experimental results for each data mining technique are presented and
discussed in Section 4. A comparative analysis is given in Section 5 and, finally, the conclusion and future work are
presented in Section 6.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 18, No. 6, June 2020
46 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
II. LITERATURE REVIEW
Data mining is an important tool in event prediction. The literature selected and discussed in this section is that
most closely related and relevant to the prediction of football match results.
A match result prediction system was proposed using four data mining techniques. The author used basketball results
of four seasons (from 2005/2006 to 2009/2010) as training data and, in order to assess the models, the
results of the 2010/2011 season as test data. The models performed with comparable
classification accuracy, with 67.8% as the highest [4].
The authors proposed the use of ANN and logistic regression techniques to forecast 2014-2015
English Premier League results, in order to address the complexity and inaccurate predictions of statistical
approaches. Nine significant features were randomly selected from the records. The experimental results
show that logistic regression performs better than ANN and that both techniques show high prediction
accuracy [17].
[13] carried out a preliminary investigation into forecasting the results of the National Football League (NFL) using an
artificial neural network (ANN). Five variables randomly extracted from the first eight rounds of the competition were
used for the prediction. Teams were classified as either strong or weak using cluster-related methods.
The paper proposes data mining techniques to address the limitations of the numeric prediction approach.
Eight years of data were used. To evaluate the performance of these techniques, both classification and regression
models were used. The experimental results show that the accuracy of the classification models exceeds that of the
regression models [15].
The researchers carried out a performance evaluation of three classification models (naive Bayes, artificial neural
networks (ANNs) and decision trees) [16]. The models were built using different variables of NBA matches. The
experimental results show that the accuracy of the proposed models is very reliable and that defensive rebounds is the
most significant variable. Three other variables were also identified as significant.
The researchers adopted three data mining approaches to build models of game outcomes from historical data. The
aim was to counter the idea of ranking the winning team based on experience. At the end of the
modeling process, all three models were capable of forecasting the winner of the game, and the decision tree
produced the highest accuracy [5].
A reliable tennis match outcome prediction model was proposed in which numerous factors are systematically
prioritized to determine the match outcome. The results show that the proposed model, which combines data and
judgement, predicts the outcome of a match with 85.1% accuracy [7].
A machine learning method was adopted to forecast the results of future soccer matches based on a dataset of past
matches. In this research, two important ideas emerged from challenges encountered during
the modeling of 2017 soccer match results. These two ideas led to new feature engineering
methods (recency and rating extraction) for match result forecasting. The authors concluded that good forecasting
should be based on knowledge of machine learning [3].
The authors developed predictive models to forecast the outcome of football matches for the 2008/2009 and 2015/2016
seasons. Techniques such as artificial neural network (ANN), Random Forest (RF) and Support Vector Machine
(SVM) were used to develop the models. A comparative analysis shows that the models are capable of
predicting correctly when compared with results based on the experience of football match analysts [8].
Another study proposes machine learning methods to determine the result of NBA matches. The forecasting
process was based on historical data, a performance evaluation was carried out among the models developed, and the
results show that the defensive rebounds feature was identified as important by all the
methods for optimal prediction of the game result. Further research will be carried out using models such as
function-based techniques and deep learning [16].
III. METHODOLOGY
This section describes the dataset, classification techniques and performance analysis. The experiment is done using
Weka 3.9.2 on five algorithms: Multilayer Perceptron, Decision Tables, Random Forest, Reptree and Meta Bagging.
In Weka, 10-fold cross-validation is adopted as the classifier evaluation option.
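To illustrate the 10-fold cross-validation option, the following Python sketch splits the 380 matches of one season into ten folds, each used once as test data while the other nine serve as training data. It is a plain, unstratified illustration using only the standard library and is not Weka's own procedure; the function name k_fold_indices and the fixed seed are assumptions made for this example.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    # Slicing with a stride of k gives folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# One season has 380 matches, so each of the 10 folds holds 38 test matches.
splits = list(k_fold_indices(380, k=10))
fold_sizes = [len(test) for _, test in splits]
print(fold_sizes)
```

Every match appears in exactly one test fold, so the reported accuracy is based on predictions for all 380 matches of a season.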
A. Dataset
The dataset used for the implementation was the Spanish La Liga data for the 2014/2015 to 2018/2019 seasons [18].
The league consists of twenty teams that play each other both home and away, which amounts to 380 matches per season
and 1900 matches for the five seasons. The data consists of 61 features, of which 22 are match statistics
such as full-time and half-time result, home and away team shots, etc., while the remaining 39 are football betting
details. Out of the 22 statistical features, 10 were randomly selected as predictors, with the full-time result
(Home Win, Away Win and Lose) as the target.
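The match counts quoted above follow from the home-and-away format: each ordered pair of distinct teams is one fixture. The short Python check below reproduces the arithmetic; the team names are placeholders, not the actual La Liga clubs.

```python
from itertools import permutations

teams = [f"Team{i:02d}" for i in range(1, 21)]  # 20 placeholder team names

# Each ordered (home, away) pair of distinct teams is one fixture.
fixtures = list(permutations(teams, 2))
matches_per_season = len(fixtures)        # 20 * 19
total_matches = matches_per_season * 5    # five seasons

print(matches_per_season, total_matches)  # 380 1900
```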
B. Performance Analysis
The performance of these classification algorithms was measured based on accuracy. Accuracy is the rate at
which the classifier assigns the correct target class, that is, the proportion of data instances correctly classified [2].
$$\text{Accuracy} = \frac{\text{Total number of correctly predicted match results}}{\text{Total number of match results}} \times 100\% \qquad (1)$$
The total number of correctly predicted match results is the sum of the correctly predicted Home Win, Away Win
and Lose match results.
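The accuracy measure described above can be sketched in a few lines of Python. The function name accuracy and the five-match toy example are assumptions made for illustration; the class labels follow the three target classes named in this paper.

```python
def accuracy(actual, predicted):
    """Percentage of match results that were predicted correctly."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)

# Hypothetical toy example: four of five results are predicted correctly.
actual = ["Home Win", "Away Win", "Home Win", "Lose", "Away Win"]
predicted = ["Home Win", "Away Win", "Lose", "Lose", "Away Win"]
print(accuracy(actual, predicted))  # 80.0
```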
IV. RESULT AND DISCUSSION
The results of the experiment carried out on the five classification techniques are presented and analysed
based on the percentage accuracy of each technique. The 10-fold cross-validation technique was adopted because of
the small size of the data.
A. Multilayer Perceptron
Multilayer Perceptron is a type of artificial neural network, which has proved to be a very
valuable alternative to traditional statistical techniques and does not make prior assumptions about the data distribution [6].
Multilayer Perceptron is applied to the La Liga datasets using Weka 3.9.2. The percentage accuracy for each of the seasons
is 100%, as depicted in Table 1 below, and the result of Multilayer Perceptron for the 2018/2019 season is shown in Figure 1.
TABLE I
PERCENTAGE ACCURACY OF THE MULTILAYER PERCEPTRON FOR FIVE SEASONS
Season Accuracy (%)
2018/2019 Season 100
2017/2018 Season 100
2016/2017 Season 100
2015/2016 Season 100
2014/2015 Season 100
FIGURE 1: DETAILED OUTPUT OF MULTILAYER PERCEPTRON OF 2018/2019 SEASON
B. Decision Tables
A decision table is a set of rules that indicates the actions to be taken when certain conditions are met [12]. The
datasets are imported into Weka 3.9.2 and run using the Decision Tables technique. The percentage accuracy for the
2018/2019 season is 97.38%, 91.58% for the 2017/2018 season, 94.74% for the 2016/2017 season, 98.95% for the 2015/2016
season and 96.84% for the 2014/2015 season. The percentage accuracy for the seasons using Decision Tables is presented in
Table 2.
TABLE II
PERCENTAGE ACCURACY OF THE DECISION TABLES FOR FIVE SEASONS
Season Accuracy (%)
2018/2019 Season 97.38
2017/2018 Season 91.58
2016/2017 Season 94.74
2015/2016 Season 98.95
2014/2015 Season 96.84
FIGURE 2: DETAILED OUTPUT OF DECISION TABLE OF 2017/2018 SEASON
C. Random Forest
Random Forest is a statistical learning model, a tree-based ensemble in which each tree depends on a group of
random variables. It performs well with small or medium datasets and can perform better than more recent algorithms [1],
[11]. The datasets are imported into Weka 3.9.2 and run using the Random Forest technique. The percentage
accuracy for the 2018/2019 season is 98.42%, 98.95% for the 2017/2018 season, 98.16% for the 2016/2017 season, 97.63% for
the 2015/2016 season and 99.47% for the 2014/2015 season. The percentage accuracy for the seasons using Random Forest
is presented in Table 3 below.
TABLE III
PERCENTAGE ACCURACY OF THE RANDOM FOREST FOR FIVE SEASONS
Season Accuracy (%)
2018/2019 Season 98.42
2017/2018 Season 98.95
2016/2017 Season 98.16
2015/2016 Season 97.63
2014/2015 Season 99.47
FIGURE 3: DETAILED OUTPUT OF RANDOM FOREST OF 2016/2017 SEASON
D. Reptree
Reduced Error Pruning Tree (Reptree) is a fast decision tree learner, which uses regression tree logic to build
a decision tree using either information gain or variance reduction as the splitting criterion [10]. The La Liga
datasets for the 2014/2015 to 2018/2019 seasons are loaded into Weka 3.9.2 for the prediction. The
percentage accuracy for the 2018/2019 season is 98.68%, 98.16% for the 2017/2018 season, 98.42% for the 2016/2017 season,
97.89% for the 2015/2016 season and 98.95% for the 2014/2015 season. The percentage accuracy for the seasons is
presented in Table 4.
TABLE IV
PERCENTAGE ACCURACY OF THE REPTREE FOR FIVE SEASONS
Season Accuracy (%)
2018/2019 Season 98.68
2017/2018 Season 98.16
2016/2017 Season 98.42
2015/2016 Season 97.89
2014/2015 Season 98.95
FIGURE 4: DETAILED OUTPUT OF REPTREE OF 2015/2016 SEASON
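The information gain criterion mentioned in the description of Reptree can be illustrated with a small Python sketch. It shows only how a candidate split is scored, not tree construction or reduced error pruning, and it is not Weka's code; the function names and the four-match toy data are assumptions made for this example.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Entropy reduction obtained by splitting labels on a nominal feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: four matches and two candidate features.
results = ["Home Win", "Home Win", "Away Win", "Away Win"]
more_home_shots = ["yes", "yes", "no", "no"]   # separates the classes perfectly
played_on_sunday = ["yes", "no", "yes", "no"]  # carries no information

print(information_gain(results, more_home_shots))   # 1.0
print(information_gain(results, played_on_sunday))  # 0.0
```

A tree learner that uses this criterion would split on the first feature, since it removes all uncertainty about the result in this toy example.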
E. Meta Bagging
Meta Bagging is a machine learning ensemble algorithm developed to enhance the accuracy of statistical
classification and regression of machine learning algorithms [14]. The La Liga datasets
for the 2014/2015 to 2018/2019 seasons are loaded into Weka 3.9.2 for the prediction. The percentage
accuracy for the 2018/2019 season is 99.74%, 98.42% for the 2017/2018 season, 98.16% for the 2016/2017 season, 98.42% for
the 2015/2016 season and 99.47% for the 2014/2015 season. The percentage accuracy for the seasons is presented in
Table 5.
TABLE V
PERCENTAGE ACCURACY OF THE META BAGGING FOR FIVE SEASONS
Season Accuracy (%)
2018/2019 Season 99.74
2017/2018 Season 98.42
2016/2017 Season 98.16
2015/2016 Season 98.42
2014/2015 Season 99.47
FIGURE 5: DETAILED OUTPUT OF META BAGGING OF 2014/2015 SEASON
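The general idea behind bagging, training several models on bootstrap samples of the data and combining them by majority vote, can be sketched as follows. This is a minimal illustration and not Weka's Bagging implementation; the nearest-neighbour base learner, the single numeric feature and the six toy matches are all assumptions chosen to keep the example short.

```python
import random
from collections import Counter

def bootstrap_sample(data, rng):
    """Draw len(data) examples from data with replacement."""
    return [rng.choice(data) for _ in data]

def nearest_neighbour_learner(sample):
    """Toy base learner: predict the label of the closest training value."""
    def predict(x):
        _, label = min(sample, key=lambda pair: abs(pair[0] - x))
        return label
    return predict

def bagging_fit(data, base_learner, n_models=11, seed=0):
    """Train one base model per bootstrap sample."""
    rng = random.Random(seed)
    return [base_learner(bootstrap_sample(data, rng)) for _ in range(n_models)]

def bagging_predict(models, x):
    """Combine the base models by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: (home shots minus away shots, full-time result).
matches = [(-3, "Away Win"), (-2, "Away Win"), (-1, "Away Win"),
           (1, "Home Win"), (2, "Home Win"), (3, "Home Win")]
models = bagging_fit(matches, nearest_neighbour_learner)
print(bagging_predict(models, 3), "|", bagging_predict(models, -3))
```

An odd number of models is used so that a two-class vote cannot end in a tie.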
V. COMPARATIVE ANALYSIS
The comparative analysis of the results shows that Multilayer Perceptron has the best overall average percentage
accuracy at 100%, followed by Meta Bagging with an average accuracy of 98.84%, Random Forest with
98.53% and Reptree with 98.42%, while Decision Tables has the
lowest average accuracy at 95.90%, as presented in Table 6 and Figure 6 below.
TABLE VI
COMPARISON OF AVERAGE PERCENTAGE ACCURACY
| Season | Multilayer Perceptron | Decision Tables | Random Forest | Reptree | Meta Bagging |
|---|---|---|---|---|---|
| 2018/2019 Season | 100% | 97.38% | 98.42% | 98.68% | 99.74% |
| 2017/2018 Season | 100% | 91.58% | 98.95% | 98.16% | 98.42% |
| 2016/2017 Season | 100% | 94.74% | 98.16% | 98.42% | 98.16% |
| 2015/2016 Season | 100% | 98.95% | 97.63% | 97.89% | 98.42% |
| 2014/2015 Season | 100% | 96.84% | 99.47% | 98.95% | 99.47% |
| Average Accuracy | 100% | 95.90% | 98.53% | 98.42% | 98.84% |
FIGURE 6: GRAPHICAL REPRESENTATION OF AVERAGE ACCURACY
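The average accuracies in Table 6 can be recomputed from the per-season values with a few lines of Python. The numbers below are copied from the table; the variable names are chosen for this example only.

```python
# Per-season accuracy (%) from Table 6, ordered 2018/2019 down to 2014/2015.
accuracy_by_technique = {
    "Multilayer Perceptron": [100, 100, 100, 100, 100],
    "Decision Tables": [97.38, 91.58, 94.74, 98.95, 96.84],
    "Random Forest": [98.42, 98.95, 98.16, 97.63, 99.47],
    "Reptree": [98.68, 98.16, 98.42, 97.89, 98.95],
    "Meta Bagging": [99.74, 98.42, 98.16, 98.42, 99.47],
}

# Mean over the five seasons, rounded to two decimals as in the table.
averages = {name: round(sum(values) / len(values), 2)
            for name, values in accuracy_by_technique.items()}

# Techniques ordered from highest to lowest average accuracy.
ranking = sorted(averages, key=averages.get, reverse=True)

for name in ranking:
    print(f"{name}: {averages[name]:.2f}%")
```

The resulting order (Multilayer Perceptron, Meta Bagging, Random Forest, Reptree, Decision Tables) matches the ranking stated in the comparative analysis.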
VI. CONCLUSION AND FUTURE WORK
This work compared five data mining algorithms on Spanish La Liga football match outcomes. The experimental
results revealed that Multilayer Perceptron is the most successful, which makes it the best data mining technique
for predicting La Liga football match outcomes, with 100% accuracy as against Decision Tables with 95.90% accuracy,
Random Forest with 98.53%, Reptree with 98.42% and Meta Bagging with 98.84% accuracy. In future work, the data
mining techniques can also be applied with the rating of each team considered as one of the variables.
References
[1] A. Cutler, D. R. Cutler, and J. R. Stevens, “Ensemble Machine Learning,” Ensemble Mach. Learn., no. January, 2012, doi:
10.1007/978-1-4419-9326-7.
[2] C. M. F. Che Mohd Rosli, M. Z. Saringat, N. Razali, and A. Mustapha, “A Comparative Study of Data Mining Techniques on Football
Match Prediction,” J. Phys. Conf. Ser., vol. 1020, no. 1, 2018, doi: 10.1088/1742-6596/1020/1/012003.
[3] D. Berrar, P. Lopes, and W. Dubitzky, “Incorporating domain knowledge in machine learning for soccer outcome prediction,” Mach.
Learn., vol. 108, no. 1, pp. 97–126, 2019, doi: 10.1007/s10994-018-5747-8.
[4] C. Cao, “Sports data mining technology used in basketball outcome prediction,” Dublin Inst. Technol., pp. 1–86, 2012.
[5] D. Delen, D. Cogdell, and N. Kasap, “A comparative analysis of data mining methods in predicting NCAA bowl outcomes,” Int. J.
Forecast., vol. 28, no. 2, pp. 543–552, 2012, doi: 10.1016/j.ijforecast.2011.05.002.
[6] M. W. Gardner and S. R. Dorling, “Artificial neural networks (the multilayer perceptron) - a review of applications in the atmospheric
sciences,” Atmos. Environ., vol. 32, no. 14–15, pp. 2627–2636, 1998, doi: 10.1016/S1352-2310(97)00447-0.
[7] W. Gu and T. L. Saaty, “Predicting the Outcome of a Tennis Tournament: Based on Both Data and Judgments,” J. Syst. Sci. Syst. Eng.,
vol. 28, no. 3, pp. 317–343, 2019, doi: 10.1007/s11518-018-5395-3.
[8] H. Chen, “Neural Network Algorithm in Predicting Football Match Outcome Based on Player Ability Index,” Adv. Phys. Educ., vol. 09,
no. 04, pp. 215–222, 2019, doi: 10.4236/ape.2019.94015.
[9] J. J. Zhang, E. Kim, B. Marstromartino, T. Y. Qian, and J. Nauright, “The sport industry in growing economies: critical issues and
challenges,” Int. J. Sport. Mark. Spons., vol. 19, no. 2, pp. 110–126, 2018, doi: 10.1108/IJSMS-03-2018-0023.
[10] S. Kalmegh, “Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News,”
Int. J. Innov. Sci. Eng. Technol., vol. 2, no. 2, pp. 438–446, 2015.
[11] R. Kohavi, “The power of decision tables,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes
Bioinformatics), vol. 912, pp. 174–189, 1995, doi: 10.1007/3-540-59286-5_57.
[12] D. Tables, F. Definition, and S. P. Decision, “(cf. (cf.,” pp. 68–80, 1991.
[13] A. Reso- and K. Self-, “Different Training Methods Perform in Calling the Games,” pp. 9–15, 1996.
[14] P. Shrivastava and M. Shukla, “Uses the Bagging Algorithm of Classification Method Learning and Forest Fire Data,” Int. J. Adv.
Comput. Eng. Netw., vol. 01, no. 12, pp. 91–95, 2014.
[15] S. J. Lee and K. Siau, “A review of data mining techniques,” Ind. Manag. Data Syst., vol. 101, no. 1, pp. 41–46, 2001, doi:
10.1108/02635570110365989.
[16] F. Thabtah, L. Zhang, and N. Abdelhamid, “NBA Game Result Prediction Using Feature Analysis and Machine Learning,” Ann. Data
Sci., vol. 6, no. 1, pp. 103–116, 2019, doi: 10.1007/s40745-018-00189-x.
[17] C. P. Igiri and E. O. Nwachukwu, “An Improved Prediction System for Football Match Result,” IOSR Journal of Engineering, vol. 04,
no. 12, pp. 12–20, 2014, doi: 10.9790/3021-04124012020.
[18] Spanish La Liga (football) dataset [Online]. Available:
https://datahub.io/sports-data/spanish-la-liga#data. [Accessed on 17 December, 2019].
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
Payaamvohra1
ย 
Observational Learning
Observational Learning Observational Learning
Observational Learning
sanamushtaq922
ย 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
nitinpv4ai
ย 

Recently uploaded (20)

Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) CurriculumPhilippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
Philippine Edukasyong Pantahanan at Pangkabuhayan (EPP) Curriculum
ย 
Simple-Present-Tense xxxxxxxxxxxxxxxxxxx
Simple-Present-Tense xxxxxxxxxxxxxxxxxxxSimple-Present-Tense xxxxxxxxxxxxxxxxxxx
Simple-Present-Tense xxxxxxxxxxxxxxxxxxx
ย 
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptxBIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
BIOLOGY NATIONAL EXAMINATION COUNCIL (NECO) 2024 PRACTICAL MANUAL.pptx
ย 
A Free 200-Page eBook ~ Brain and Mind Exercise.pptx
A Free 200-Page eBook ~ Brain and Mind Exercise.pptxA Free 200-Page eBook ~ Brain and Mind Exercise.pptx
A Free 200-Page eBook ~ Brain and Mind Exercise.pptx
ย 
Andreas Schleicher presents PISA 2022 Volume III - Creative Thinking - 18 Jun...
Andreas Schleicher presents PISA 2022 Volume III - Creative Thinking - 18 Jun...Andreas Schleicher presents PISA 2022 Volume III - Creative Thinking - 18 Jun...
Andreas Schleicher presents PISA 2022 Volume III - Creative Thinking - 18 Jun...
ย 
Contiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptxContiguity Of Various Message Forms - Rupam Chandra.pptx
Contiguity Of Various Message Forms - Rupam Chandra.pptx
ย 
How to Fix [Errno 98] address already in use
How to Fix [Errno 98] address already in useHow to Fix [Errno 98] address already in use
How to Fix [Errno 98] address already in use
ย 
Standardized tool for Intelligence test.
Standardized tool for Intelligence test.Standardized tool for Intelligence test.
Standardized tool for Intelligence test.
ย 
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdfREASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
REASIGNACION 2024 UGEL CHUPACA 2024 UGEL CHUPACA.pdf
ย 
Electric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger HuntElectric Fetus - Record Store Scavenger Hunt
Electric Fetus - Record Store Scavenger Hunt
ย 
skeleton System.pdf (skeleton system wow)
skeleton System.pdf (skeleton system wow)skeleton System.pdf (skeleton system wow)
skeleton System.pdf (skeleton system wow)
ย 
Accounting for Restricted Grants When and How To Record Properly
Accounting for Restricted Grants  When and How To Record ProperlyAccounting for Restricted Grants  When and How To Record Properly
Accounting for Restricted Grants When and How To Record Properly
ย 
220711130088 Sumi Basak Virtual University EPC 3.pptx
220711130088 Sumi Basak Virtual University EPC 3.pptx220711130088 Sumi Basak Virtual University EPC 3.pptx
220711130088 Sumi Basak Virtual University EPC 3.pptx
ย 
MDP on air pollution of class 8 year 2024-2025
MDP on air pollution of class 8 year 2024-2025MDP on air pollution of class 8 year 2024-2025
MDP on air pollution of class 8 year 2024-2025
ย 
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
Geography as a Discipline Chapter 1 __ Class 11 Geography NCERT _ Class Notes...
ย 
BPSC-105 important questions for june term end exam
BPSC-105 important questions for june term end examBPSC-105 important questions for june term end exam
BPSC-105 important questions for june term end exam
ย 
Bร€I TแบฌP Bแป” TRแปข TIแบพNG ANH LแปšP 8 - Cแบข Nฤ‚M - FRIENDS PLUS - Nฤ‚M HแปŒC 2023-2024 (B...
Bร€I TแบฌP Bแป” TRแปข TIแบพNG ANH LแปšP 8 - Cแบข Nฤ‚M - FRIENDS PLUS - Nฤ‚M HแปŒC 2023-2024 (B...Bร€I TแบฌP Bแป” TRแปข TIแบพNG ANH LแปšP 8 - Cแบข Nฤ‚M - FRIENDS PLUS - Nฤ‚M HแปŒC 2023-2024 (B...
Bร€I TแบฌP Bแป” TRแปข TIแบพNG ANH LแปšP 8 - Cแบข Nฤ‚M - FRIENDS PLUS - Nฤ‚M HแปŒC 2023-2024 (B...
ย 
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
NIPER 2024 MEMORY BASED QUESTIONS.ANSWERS TO NIPER 2024 QUESTIONS.NIPER JEE 2...
ย 
Observational Learning
Observational Learning Observational Learning
Observational Learning
ย 
Bonku-Babus-Friend by Sathyajith Ray (9)
Bonku-Babus-Friend by Sathyajith Ray  (9)Bonku-Babus-Friend by Sathyajith Ray  (9)
Bonku-Babus-Friend by Sathyajith Ray (9)
ย 

Predicting Football Match Results with Data Mining Techniques

  • 1. Predicting Football Match Results with Data Mining Techniques
O. I. Aladesote, O. Agbelusi & M. Ganiyu

Abstract- Data mining techniques are effective and useful for forecasting in many domains. In this research, Spanish La Liga football match outcomes are predicted using several data mining techniques (Multilayer Perceptron, Decision Tables, Random Forest, REPTree and Meta Bagging) to determine which is the most accurate. The experiments, carried out with Weka 3.9, show that all the techniques performed well in terms of accuracy, but Multilayer Perceptron was the most successful, with an average accuracy of 100%.

I. INTRODUCTION
Football is a fast-growing sport that is becoming one of the most viewed and richest sports. The drive to be more than just a spectator has led to this research into predicting the final outcome of any match, which would also make sports betting easier. One of the reasons football is the most popular sport on the planet is its unpredictability.

Every day, fans around the world argue over which team is going to win the next game or the next competition. Many of these fans also put their money where their mouths are by betting large sums on their predictions. Because of the large number of factors that can affect the result of a football match, it is very difficult to predict its probabilities correctly. With the growing amount of money invested in sports betting markets, it is important to verify how far data mining techniques can bring value to this area [9].

To solve this problem we propose building data-driven solutions designed through a data mining process. Data mining is an aspect of computing that is used to extract hidden information and to automate the detection of relevant patterns in a database. The data mining process allows us to build models that give predictions according to the data fed into the system. The study is aimed at using data mining techniques to predict football match results. Every sport has its own rules, number of players and styles, that is, its own set of features. For a beginner, building a predictive model from scratch on a sizeable dataset can be challenging. By the end of this research, anyone, especially football fans, should be able to predict match results from the identified factors.

We summarize the contributions of this paper as follows:
• Forecasting of La Liga football match outcomes using data from the five previous seasons
• Comparative analysis to determine the most accurate technique

The remainder of the paper is organized as follows: Section 2 presents the literature review. Section 3 presents the method used to generate the results. The experimental results for each data mining technique are presented and discussed in Section 4. The comparative analysis is done in Section 5 and, finally, the conclusion and future work are presented in Section 6.

International Journal of Computer Science and Information Security (IJCSIS), Vol. 18, No. 6, June 2020 46 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 2. II. LITERATURE REVIEW
Data mining is an important tool in event prediction. The literature discussed in this section is that most closely related to the prediction of football match results.

A match result prediction system was proposed using four data mining techniques. The author used basketball results of four seasons (from 2005/2006 to 2009/2010) as training data and, in order to assess the models, the results of the 2010/2011 season as test data. The models performed with comparable classification accuracy, with 67.8% as the highest [4].

The authors of [17] proposed the use of ANN and logistic regression techniques to forecast the 2014-2015 English Premier League results, to address the complexity and inaccurate predictions of statistical approaches. Nine significant features were randomly selected from the records. The experimental results show that logistic regression performs better than ANN and that both techniques give higher prediction accuracy [17].

A preliminary investigation in [13] forecast the results of the National Football League (NFL) using an artificial neural network (ANN). Five variables randomly extracted from the first eight rounds of the competition were used for the prediction. Teams were classified as either strong or weak using cluster-related methods.

Another paper proposes data mining techniques to overcome the limitations of numeric prediction approaches, using eight years of data. To evaluate the performance of these techniques, both classification and regression models were used. The experimental results clearly show that the classification model is more accurate than the regression model [15].

The researchers in [16] carried out a performance evaluation using three classification models (naive Bayes, artificial neural networks (ANNs) and decision trees). The models were built using different variables of NBA matches. The experimental results show that the accuracy of the proposed model is very reliable and that defensive rebounds is the most significant variable. Three other variables were also found to be significant.

The researchers in [5] adopted three data mining approaches to build models of game outcomes from historical data. The purpose was to challenge the idea of ranking winning games on the basis of experience. At the end of the modeling process, all three models were capable of forecasting the winner of the game, and the decision tree produced the highest accuracy [5].

A reliable tennis match outcome prediction model was proposed in which numerous factors are systematically prioritized to determine the match outcome. The results show that the proposed model, which combines data and judgement, predicts the outcome of a match with 85.1% accuracy [7].

A machine learning method was adopted to forecast the results of future soccer matches from a dataset of past matches. In this research, two important ideas emerged from challenges encountered while modeling 2017 soccer match results. These two ideas led to new feature engineering methods (recency and rating extraction) for match result forecasting. The authors concluded that good forecasting should be based on knowledge of machine learning [3].
  • 3. The authors developed predictive models to forecast the outcomes of football matches for the 2008/2009 and 2015/2016 seasons. Techniques such as artificial neural network (ANN), Random Forest (RF) and Support Vector Machine (SVM) were used to develop the models. The comparative analysis shows that the models predict correctly when compared with the results obtained from the experience of football match analysts [8].

Another paper proposes machine learning methods to determine the results of NBA matches. The forecasting process was based on historical data, and a performance evaluation was done among the models developed. The results show that the defensive rebounds feature was important to all the methods for optimal prediction of the game result. Further research will be carried out using models such as function-based techniques and deep learning [16].

III. METHODOLOGY
This section describes the dataset, classification techniques and performance analysis. The experiment is done using Weka 3.9.2 on five algorithms: Multilayer Perceptron, Decision Tables, Random Forest, REPTree and Meta Bagging. In Weka, 10-fold cross-validation is adopted as the classifier evaluation option.

A. Dataset
The dataset used for the implementation was the Spanish La Liga data for the 2014/2015 to 2018/2019 seasons [18]. The league consists of twenty teams, each playing every other team home and away, which gives 380 matches per season and 1,900 matches over the five seasons. The data consists of 61 features, of which 22 are match statistics such as full-time and half-time results, home and away team shots, etc., while the remaining 39 are football betting details. Out of the 22 statistical features, 10 were randomly selected as predictors, with the full-time result (Home Win, Away Win and Lose) as the target.

B. Performance Analysis
The performance of these classification algorithms was measured based on accuracy. Accuracy is the rate at which the classifier assigns the correct target class, that is, the proportion of data instances correctly classified [2]:

$$\text{Accuracy} = \frac{\text{number of correctly predicted match results}}{\text{total number of match results}} \times 100\% \tag{1}$$

where the number of correctly predicted match results is the total number of correctly predicted Home Win, Away Win and Lose results.

IV. RESULT AND DISCUSSION
The results of the experiment carried out on the five classification techniques are presented and analysed based on the percentage accuracy of each technique. 10-fold cross-validation was adopted because of the small size of the data.

A. Multilayer Perceptron
The Multilayer Perceptron is a type of artificial neural network, which has proved to be a very valuable alternative to older statistical techniques and makes no prior assumptions about the data distribution [6]. The Multilayer Perceptron is applied to the La Liga datasets using Weka 3.9.2. The percentage accuracy for every season is 100%, as shown in Table I, and the result of the Multilayer Perceptron for the 2018/2019 season is shown in Figure 1.
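The accuracy measure of Section III.B and the idea of 10-fold cross-validation can be illustrated with a short sketch. This is not the authors' Weka workflow: the label codes (`H`, `A`, `L`), the example predictions and the helper names `accuracy` and `k_fold_indices` are hypothetical, and Weka's own cross-validation also shuffles and stratifies the data, which this simplified split does not.

```python
from typing import List, Sequence, Tuple


def accuracy(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Percentage of match results that were predicted correctly."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * correct / len(actual)


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs of near-equal size.

    Simplified, unshuffled and unstratified split for illustration only.
    """
    folds = [list(range(i, n, k)) for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training set is every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical labels: H = Home Win, A = Away Win, L = the paper's third class.
actual = ["H", "A", "L", "H", "H"]
predicted = ["H", "A", "H", "H", "L"]
print(accuracy(actual, predicted))  # 3 of the 5 results match

# One season of 380 matches split into 10 folds.
splits = k_fold_indices(380, k=10)
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

With 380 matches per season, each fold holds out 38 matches for testing and trains on the remaining 342, and the reported accuracy is computed over the held-out predictions.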
  • 4. TABLE I
PERCENTAGE ACCURACY OF THE MULTILAYER PERCEPTRON FOR FIVE SEASONS
Season              Accuracy (%)
2018/2019 Season    100
2017/2018 Season    100
2016/2017 Season    100
2015/2016 Season    100
2014/2015 Season    100

FIGURE 1: DETAILED OUTPUT OF MULTILAYER PERCEPTRON OF 2018/2019 SEASON

B. Decision Tables
A decision table is a set of rules that indicates the actions to be taken when certain conditions are met [12]. The datasets are imported into Weka 3.9.2 and run using the Decision Tables technique. The percentage accuracy is 97.38% for the 2018/2019 season, 91.58% for 2017/2018, 94.74% for 2016/2017, 98.95% for 2015/2016 and 96.84% for 2014/2015. The percentage accuracy for the seasons using Decision Tables is presented in Table II.
  • 5. TABLE II
PERCENTAGE ACCURACY OF THE DECISION TABLES FOR FIVE SEASONS
Season              Accuracy (%)
2018/2019 Season    97.38
2017/2018 Season    91.58
2016/2017 Season    94.74
2015/2016 Season    98.95
2014/2015 Season    96.84

FIGURE 2: DETAILED OUTPUT OF DECISION TABLE OF 2017/2018 SEASON

C. Random Forest
Random Forest is a statistical learning model: a tree-based ensemble with each tree relying on a group of random variables. It performs well with small or medium datasets and can perform better than more recent algorithms [1], [11]. The datasets are imported into Weka 3.9.2 and run using the Random Forest technique. The percentage accuracy is 98.42% for the 2018/2019 season, 98.95% for 2017/2018, 98.16% for 2016/2017, 97.63% for 2015/2016 and 99.47% for 2014/2015. The percentage accuracy for the seasons using Random Forest is presented in Table III.
  • 6. TABLE III
PERCENTAGE ACCURACY OF THE RANDOM FOREST FOR FIVE SEASONS
Season              Accuracy (%)
2018/2019 Season    98.42
2017/2018 Season    98.95
2016/2017 Season    98.16
2015/2016 Season    97.63
2014/2015 Season    99.47

FIGURE 3: DETAILED OUTPUT OF RANDOM FOREST OF 2016/2017 SEASON

D. REPTree
Reduced Error Pruning Tree (REPTree) is a fast decision tree learner that builds a decision or regression tree using information gain or variance reduction as the splitting criterion [10]. The La Liga datasets for the 2014/2015 to 2018/2019 seasons are loaded into Weka 3.9.2 for the prediction. The percentage accuracy is 98.68% for the 2018/2019 season, 98.16% for 2017/2018, 98.42% for 2016/2017, 97.89% for 2015/2016 and 98.95% for 2014/2015. The percentage accuracy for the seasons is presented in Table IV.
  • 7. TABLE IV
PERCENTAGE ACCURACY OF THE REPTREE FOR FIVE SEASONS
Season              Accuracy (%)
2018/2019 Season    98.68
2017/2018 Season    98.16
2016/2017 Season    98.42
2015/2016 Season    97.89
2014/2015 Season    98.95

FIGURE 4: DETAILED OUTPUT OF REPTREE OF 2015/2016 SEASON

E. Meta Bagging
Meta Bagging is a machine learning ensemble algorithm designed to improve the accuracy of the classification and regression algorithms it is applied to [14]. The La Liga datasets for the 2014/2015 to 2018/2019 seasons are loaded into Weka 3.9.2 for the prediction. The percentage accuracy is 99.74% for the 2018/2019 season, 98.42% for 2017/2018, 98.16% for 2016/2017, 98.42% for 2015/2016 and 99.47% for 2014/2015. The percentage accuracy for the seasons is presented in Table V.
  • 8. TABLE V
PERCENTAGE ACCURACY OF THE META BAGGING FOR FIVE SEASONS
Season              Accuracy (%)
2018/2019 Season    99.74
2017/2018 Season    98.42
2016/2017 Season    98.16
2015/2016 Season    98.42
2014/2015 Season    99.47

FIGURE 5: DETAILED OUTPUT OF META BAGGING OF 2014/2015 SEASON

V. COMPARATIVE ANALYSIS
The comparative analysis of the results shows that Multilayer Perceptron has the best overall average accuracy at 100%, followed by Meta Bagging with an average accuracy of 98.84%, Random Forest with 98.53% and REPTree with 98.42%, while Decision Tables has the lowest average accuracy at 95.90%, as presented in Table VI and Figure 6.
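The averages quoted in the comparative analysis can be re-derived from the per-season accuracies reported for each technique. The sketch below is only an arithmetic check on the published figures, not part of the authors' Weka experiment; the dictionary layout and variable names are illustrative, and the values are listed from the 2018/2019 season back to 2014/2015.

```python
from statistics import mean

# Per-season accuracy (%) for each technique, 2018/2019 down to 2014/2015,
# copied from the per-technique tables in Section IV.
accuracy_by_season = {
    "Multilayer Perceptron": [100.0, 100.0, 100.0, 100.0, 100.0],
    "Decision Tables": [97.38, 91.58, 94.74, 98.95, 96.84],
    "Random Forest": [98.42, 98.95, 98.16, 97.63, 99.47],
    "REPTree": [98.68, 98.16, 98.42, 97.89, 98.95],
    "Meta Bagging": [99.74, 98.42, 98.16, 98.42, 99.47],
}

# Average over the five seasons, rounded to two decimals as in the paper.
averages = {
    name: round(mean(values), 2)
    for name, values in accuracy_by_season.items()
}

# Techniques ordered from highest to lowest average accuracy.
ranking = sorted(averages, key=averages.get, reverse=True)

for name in ranking:
    print(f"{name:<22} {averages[name]:.2f}%")
```

The REPTree average of 98.42% is reproduced only with the tabulated 2017/2018 value of 98.16%, which is why that figure is used here.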
  • 9. TABLE VI
COMPARISON OF AVERAGE PERCENTAGE ACCURACY
Season              Multilayer Perceptron   Decision Tables   Random Forest   REPTree   Meta Bagging
2018/2019 Season    100%                    97.38%            98.42%          98.68%    99.74%
2017/2018 Season    100%                    91.58%            98.95%          98.16%    98.42%
2016/2017 Season    100%                    94.74%            98.16%          98.42%    98.16%
2015/2016 Season    100%                    98.95%            97.63%          97.89%    98.42%
2014/2015 Season    100%                    96.84%            99.47%          98.95%    99.47%
Average Accuracy    100%                    95.90%            98.53%          98.42%    98.84%

FIGURE 6: GRAPHICAL REPRESENTATION OF AVERAGE ACCURACY

VI. CONCLUSION AND FUTURE WORK
This work compared five data mining algorithms on Spanish La Liga football match outcomes. The experimental results show that Multilayer Perceptron was the most successful, making it the best of these data mining techniques for predicting La Liga match outcomes, with 100% accuracy as against 95.90% for Decision Tables, 98.53% for Random Forest, 98.42% for REPTree and 98.84% for Meta Bagging. In future work, these data mining techniques can be applied again with the rating of each team considered as one of the variables.

References
[1] A. Cutler, D. R. Cutler, and J. R. Stevens, “Ensemble Machine Learning,” Ensemble Mach. Learn., no. January, 2012, doi: 10.1007/978-1-4419-9326-7.
[2] C. M. F. Che Mohd Rosli, M. Z. Saringat, N. Razali, and A. Mustapha, “A Comparative Study of Data Mining Techniques on Football Match Prediction,” J. Phys. Conf. Ser., vol. 1020, no. 1, 2018, doi: 10.1088/1742-6596/1020/1/012003.
  • 10. [3] D. Berrar, P. Lopes, and W. Dubitzky, “Incorporating domain knowledge in machine learning for soccer outcome prediction,” Mach. Learn., vol. 108, no. 1, pp. 97–126, 2019, doi: 10.1007/s10994-018-5747-8.
[4] C. Cao, “Sports data mining technology used in basketball outcome prediction,” Dublin Inst. Technol., pp. 1–86, 2012.
[5] D. Delen, D. Cogdell, and N. Kasap, “A comparative analysis of data mining methods in predicting NCAA bowl outcomes,” Int. J. Forecast., vol. 28, no. 2, pp. 543–552, 2012, doi: 10.1016/j.ijforecast.2011.05.002.
[6] M. W. Gardner and S. R. Dorling, “Artificial neural networks (the multilayer perceptron) - a review of applications in the atmospheric sciences,” Atmos. Environ., vol. 32, no. 14–15, pp. 2627–2636, 1998, doi: 10.1016/S1352-2310(97)00447-0.
[7] W. Gu and T. L. Saaty, “Predicting the Outcome of a Tennis Tournament: Based on Both Data and Judgments,” J. Syst. Sci. Syst. Eng., vol. 28, no. 3, pp. 317–343, 2019, doi: 10.1007/s11518-018-5395-3.
[8] H. Chen, “Neural Network Algorithm in Predicting Football Match Outcome Based on Player Ability Index,” Adv. Phys. Educ., vol. 09, no. 04, pp. 215–222, 2019, doi: 10.4236/ape.2019.94015.
[9] J. J. Zhang, E. Kim, B. Mastromartino, T. Y. Qian, and J. Nauright, “The sport industry in growing economies: critical issues and challenges,” Int. J. Sport. Mark. Spons., vol. 19, no. 2, pp. 110–126, 2018, doi: 10.1108/IJSMS-03-2018-0023.
[10] S. Kalmegh, “Analysis of WEKA Data Mining Algorithm REPTree, Simple Cart and RandomTree for Classification of Indian News,” Int. J. Innov. Sci. Eng. Technol., vol. 2, no. 2, pp. 438–446, 2015.
[11] R. Kohavi, “The power of decision tables,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 912, pp. 174–189, 1995, doi: 10.1007/3-540-59286-5_57.
[12] D. Tables, F. Definition, and S. P. Decision, “(cf. (cf.,” pp. 68–80, 1991.
[13] A. Reso- and K. Self-, “Different Training Methods Perform in Calling the Games,” pp. 9–15, 1996.
[14] P. Shrivastava and M. Shukla, “Uses the Bagging Algorithm of Classification Method Learning and Forest Fire Data,” Int. J. Adv. Comput. Eng. Netw., vol. 01, no. 12, pp. 91–95, 2014.
[15] S. J. Lee and K. Siau, “A review of data mining techniques,” Ind. Manag. Data Syst., vol. 101, no. 1, pp. 41–46, 2001, doi: 10.1108/02635570110365989.
[16] F. Thabtah, L. Zhang, and N. Abdelhamid, “NBA Game Result Prediction Using Feature Analysis and Machine Learning,” Ann. Data Sci., vol. 6, no. 1, pp. 103–116, 2019, doi: 10.1007/s40745-018-00189-x.
[17] C. P. Igiri and E. O. Nwachukwu, “An Improved Prediction System for Football Match Result,” IOSR Journal of Engineering, vol. 04, no. 12, pp. 12–20, 2014, doi: 10.9790/3021-04124012020.
[18] Spanish La Liga (football) dataset [Online]. Available: https://datahub.io/sports-data/spanish-la-liga#data. [Accessed on 17 December, 2019].