MidTerm Project
1. Data preparation and feature engineering
I did data preparation in Excel. First, I replaced all blanks with zeros.
Then I created these new variables in the dataset:
1) “AvgRatingPlayer 1:14”: one variable per player for all 14 players, each the mean of that player’s five rating values.
2) “AvgCRating”, “AvgGSRating”, “AvgEFRating”, “AvgFFRating” and “AvgPRating”: each is the sum of the 14
players’ corresponding rating values divided by “NumPlayers”.
3) “AvgTotal”: the sum of “AvgCRating”, “AvgGSRating”, “AvgEFRating”, “AvgFFRating” and
“AvgPRating”, divided by 5.
4) “AvgDiffTotalPlayer 1:14”: AvgTotal minus AvgRatingPlayer 1:14. If a player’s
AvgRatingPlayer is 0, the corresponding AvgDiffTotalPlayer is set to 0.
5) “FirstDiffTotal”: AvgTotal minus the AvgRatingPlayer 1:14 value of the First player.
In the formula, A2 is “First”, W2 is “AvgTotal”, and I2:V2 are the AvgRatingPlayer 1:14 values.
=IF(A2="P1",W2-I2,IF(A2="P2",W2-J2,IF(A2="P3",W2-K2,IF(A2="P4",W2-L2,
IF(A2="P5",W2-M2,IF(A2="P6",W2-N2,IF(A2="P7",W2-O2,IF(A2="P8",W2-P2,
IF(A2="P9",W2-Q2,IF(A2="P10",W2-R2,IF(A2="P11",W2-S2,IF(A2="P12",W2-T2,
IF(A2="P13",W2-U2,IF(A2="P14",W2-V2))))))))))))))
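The five derived variables can also be sketched in Python. This is a minimal sketch, not the Excel workbook used in this project. The per-player column names (`CRatingPlayer1` through `PRatingPlayer14`), the one-dictionary-per-row layout and the function name `engineer_features` are assumptions made for illustration. It looks up the First player directly instead of using a nested IF.

```python
CATEGORIES = ["C", "GS", "EF", "FF", "P"]   # the five rating types
PLAYERS = range(1, 15)                      # players 1..14


def engineer_features(row):
    """Compute the derived variables for one row (blanks already set to 0).

    Assumed layout: keys such as 'CRatingPlayer1', plus 'NumPlayers'
    and 'First' (a label such as 'P3').
    """
    out = {}
    # 1) mean of each player's five ratings
    for i in PLAYERS:
        ratings = [row[f"{c}RatingPlayer{i}"] for c in CATEGORIES]
        out[f"AvgRatingPlayer{i}"] = sum(ratings) / len(CATEGORIES)
    # 2) per-category total over the 14 players, divided by NumPlayers
    for c in CATEGORIES:
        total = sum(row[f"{c}RatingPlayer{i}"] for i in PLAYERS)
        out[f"Avg{c}Rating"] = total / row["NumPlayers"]
    # 3) mean of the five category averages
    out["AvgTotal"] = sum(out[f"Avg{c}Rating"] for c in CATEGORIES) / len(CATEGORIES)
    # 4) difference from AvgTotal, forced to 0 for empty player slots
    for i in PLAYERS:
        avg = out[f"AvgRatingPlayer{i}"]
        out[f"AvgDiffTotalPlayer{i}"] = 0 if avg == 0 else out["AvgTotal"] - avg
    # 5) difference for the player named in First
    first = int(row["First"][1:])
    out["FirstDiffTotal"] = out["AvgTotal"] - out[f"AvgRatingPlayer{first}"]
    return out


# Made-up example row: two players, rated 4 and 2 in every category.
row = {f"{c}RatingPlayer{i}": 0 for c in CATEGORIES for i in PLAYERS}
for c in CATEGORIES:
    row[f"{c}RatingPlayer1"] = 4
    row[f"{c}RatingPlayer2"] = 2
row["NumPlayers"] = 2
row["First"] = "P2"

features = engineer_features(row)
print(features["AvgTotal"], features["FirstDiffTotal"])
```

The example row is invented for the sketch and is not taken from the project dataset.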
2. Upload data to BigML
After data preparation, I uploaded the dataset to BigML and set First, Second and Third as
categorical variables. Then I split the dataset into an 80% training dataset and a 20% test
dataset.
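BigML performs the 80/20 split itself, so no code was needed in this project. For readers who want to reproduce the idea offline, the following is a minimal sketch of a seeded random split. The function name `split_rows` and the seed value are hypothetical, and BigML's own sampling may behave differently.

```python
import random


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and cut it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = split_rows(range(100))
print(len(train), len(test))  # 80 20
```

Fixing the seed matters when several models are compared, because every model should be evaluated on the same test rows.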
3. Build Decision Tree Models, evaluate and choose the best model
In all model-building runs, I excluded the variables Second, Third, CapFirstScore,
CapSecondScore and CapThirdScore.
I built 7 Decision Tree models, using different combinations of settings and features as I
evaluated the models and tuned their performance. I used the 20% test dataset to evaluate each model,
downloaded the confusion matrix and recorded Avg F, Avg Precision and Avg Accuracy for each
model.
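To make the three recorded measures concrete, here is a minimal sketch of how they can be derived from a confusion matrix. It assumes that "Avg" means an unweighted average over classes for Precision and F, and it computes accuracy as the share of correct predictions. BigML's exact definitions, Avg Accuracy in particular, may differ, and the row and column orientation of the downloaded file should be checked before reusing this.

```python
def macro_metrics(matrix):
    """matrix[i][j] = number of rows whose actual class is i and predicted class is j."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    precisions, f_scores = [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[i][k] for i in range(n))   # column sum
        actual_k = sum(matrix[k])                           # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        precisions.append(precision)
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
    return {
        "avg_f": sum(f_scores) / n,
        "avg_precision": sum(precisions) / n,
        "accuracy": sum(matrix[k][k] for k in range(n)) / total,
    }


# Made-up two-class example, not project data.
result = macro_metrics([[2, 0], [1, 1]])
print(result)
```

In the project itself the target has up to 14 classes (P1 to P14), so the matrix would be 14 by 14 rather than 2 by 2.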
The best Decision Tree model, the one with the highest Avg F score, is the 523 model. Its Avg F score is
0.268, Avg Precision is 0.242 and Avg Accuracy is 0.285. In this model, I kept all default settings
except Threshold=523, Sample rate=81% and Ordering=linear.
4. Build Logistic Regression Models, evaluate and choose the best model
I built 5 Logistic Regression models. I used the 20% test dataset to evaluate each model, downloaded the
confusion matrix and recorded Avg F, Avg Precision and Avg Accuracy for each model.
The best Logistic Regression model, the one with the highest Avg F score, is the 48 model. Its Avg F score is
0.2754, Avg Precision is 0.2557 and Avg Accuracy is 0.2845. In this model, I kept all default settings
except Sampling rate=48% and excluding the Bias Term.
5. Build Ensemble Models, evaluate and choose the best model
I built 9 Ensemble models, including 2 models with Weight: CapFirstScore. I used the 20% test dataset
to evaluate each model, downloaded the confusion matrix and recorded Avg F, Avg Precision and Avg
Accuracy for each model.
The best Ensemble model, the one with the highest Avg F score, is the 333 model. Its Avg F score is 0.2992,
Avg Precision is 0.2724 and Avg Accuracy is 0.3255. In this model, I kept all default settings.
6. Predict the 20% Test dataset and calculate AvgPoints
After model evaluation, I ran predictions on the 20% test dataset with the three best models and the two
Ensemble models that use a weight, and recorded AvgPoints for each prediction.
For the 523 model, the best Decision Tree model, AvgPoints is 6.7533.
For the 333 model, the best Ensemble model, AvgPoints is 7.3021.
For the 48 model, the best Logistic Regression model, AvgPoints is 5.4875.
For the 777 model, the Ensemble model with Weight=CapFirstScore, Sampling rate=49% and
Threshold=293, AvgPoints is 6.6289.
For the 888 model, the Ensemble model with Weight=CapFirstScore and Sampling rate=52%,
AvgPoints is 6.6021.
At this point, the best model is the 333 model.
7. More actions after step 6
I ran predictions on the 20% test dataset with some other models and recorded AvgPoints as follows.
1) Predicted with three more Ensemble models that have high performance values.
For the 1618 model, which has the second highest Avg F among the Ensemble models, AvgPoints is 7.5081.
For the 555 model, which has the third highest Avg F among the Ensemble models, AvgPoints is 7.3389.
For the 913 model, which has the fourth highest Avg F among the Ensemble models, AvgPoints is 7.3895.
Model Name    Avg F     Avg Precision    Avg Accuracy    AvgPoints
333 model     0.2992    0.2724           0.3255          7.3021
1618 model    0.2964    0.303            0.3328          7.5081
555 model     0.2956    0.279            0.3235          7.3389
913 model     0.2894    0.2889           0.3313          7.3895
The 1618 model has the highest AvgPoints, followed by the 913 model and the 555 model. This table shows
that AvgPoints is not determined by the F value alone; it is also related to Precision and Accuracy. The 1618 model,
which has the highest Precision and Accuracy, is better than the 333 model although the 333 model has the
highest F value. Similarly, the 555 model has a higher Precision value than the 333 model, and the 913 model has
the second highest Accuracy.
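The comparison can be checked by ranking the four Ensemble models on each measure. The sketch below uses only the values from the Ensemble table; the dictionary layout and the helper name `ranking` are illustrative choices.

```python
# Values copied from the Ensemble table in this section.
ENSEMBLES = {
    "333":  {"Avg F": 0.2992, "Avg Precision": 0.2724, "Avg Accuracy": 0.3255, "AvgPoints": 7.3021},
    "1618": {"Avg F": 0.2964, "Avg Precision": 0.303,  "Avg Accuracy": 0.3328, "AvgPoints": 7.5081},
    "555":  {"Avg F": 0.2956, "Avg Precision": 0.279,  "Avg Accuracy": 0.3235, "AvgPoints": 7.3389},
    "913":  {"Avg F": 0.2894, "Avg Precision": 0.2889, "Avg Accuracy": 0.3313, "AvgPoints": 7.3895},
}


def ranking(metric):
    """Model names ordered from best to worst on the given measure."""
    return sorted(ENSEMBLES, key=lambda name: ENSEMBLES[name][metric], reverse=True)


for metric in ("Avg F", "Avg Precision", "Avg Accuracy", "AvgPoints"):
    print(f"{metric:14}", ranking(metric))
```

For these four models the Avg Precision order is the same as the AvgPoints order, while the Avg F order is not. With only four models this is an observation rather than evidence of a general rule.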
2) Predicted with two more Decision Tree models that have high performance values.
Model Name    Avg F     Avg Precision    Avg Accuracy    AvgPoints
523 model     0.268     0.242            0.285           6.7533
512 model     0.2451    0.2496           0.2775          6.5981
1813 model    0.249     0.2549           0.2857          6.6802
Among these three models, the 523 model has the highest AvgPoints. Although the 1813 model has the highest
Precision and Accuracy of the three, its AvgPoints is lower than that of the 523 model.
3) Predicted with two more Logistic Regression models that have high performance values.
Model Name    Avg F     Avg Precision    Avg Accuracy    AvgPoints
48 model      0.2754    0.2557           0.2845          5.4875
111 model     0.2706    0.2531           0.2938          5.4898
222 model     0.2641    0.2741           0.2823          5.5663
The 222 model has the highest AvgPoints among these three models.
In sum, the Ensemble models perform better overall than the Decision Tree models and the Logistic
Regression models.
The top three models in this project:
The best model is the 1618 model, which has the highest AvgPoints, 7.5081. This is an Ensemble model
using all variables and all default settings except Threshold=1618 and Ordering=linear.
The second model is the 913 model, with AvgPoints 7.3895. This is an Ensemble model using all
variables and all default settings except Threshold=913 and Ordering=Random shuffling.
The third model is the 555 model, with AvgPoints 7.3389. This is an Ensemble model using all
default settings but excluding the variables from CRatingPlayer1:14 to PRatingPlayer1:14. This model
used only the new variables I created earlier.
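As a cross-check of the final ranking, the sketch below sorts every model whose AvgPoints is reported in this write-up and averages AvgPoints per model family. The values are copied from the tables and from step 6. The dictionary layout and the function names `top_models` and `family_means` are illustrative assumptions.

```python
from statistics import mean

# model name -> (family, AvgPoints), as reported in steps 6 and 7
AVG_POINTS = {
    "523":  ("Decision Tree", 6.7533),
    "512":  ("Decision Tree", 6.5981),
    "1813": ("Decision Tree", 6.6802),
    "48":   ("Logistic Regression", 5.4875),
    "111":  ("Logistic Regression", 5.4898),
    "222":  ("Logistic Regression", 5.5663),
    "333":  ("Ensemble", 7.3021),
    "1618": ("Ensemble", 7.5081),
    "555":  ("Ensemble", 7.3389),
    "913":  ("Ensemble", 7.3895),
    "777":  ("Ensemble", 6.6289),
    "888":  ("Ensemble", 6.6021),
}


def top_models(n=3):
    """Names of the n models with the highest AvgPoints."""
    return sorted(AVG_POINTS, key=lambda m: AVG_POINTS[m][1], reverse=True)[:n]


def family_means():
    """Mean AvgPoints per model family."""
    by_family = {}
    for family, points in AVG_POINTS.values():
        by_family.setdefault(family, []).append(points)
    return {family: mean(values) for family, values in by_family.items()}


print(top_models())  # ['1618', '913', '555']
for family, value in family_means().items():
    print(f"{family}: {value:.4f}")
```

The family averages cover only the models that were actually used for prediction, not all 21 models that were built, so they support the overall conclusion without proving it for every model.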
