Improving the Model’s Predictive Power with Ensemble Approaches
KDD Cup 2010: Overview
• The Challenge
  – How generally or narrowly do students learn? How quickly or slowly? Will the rate of improvement vary between students? What does it mean for one problem to be similar to another?
  – Is it possible to infer the knowledge requirements of problems directly from student performance data, without human analysis of the tasks?
  – This year's challenge asks you to predict student performance on mathematical problems from logs of student interaction with Intelligent Tutoring Systems.
KDD Cup 2010: Results
• Winners of KDD Cup 2010 (All Teams category)
  – First Place: National Taiwan University, "Feature Engineering and Classifier Ensembling for KDD Cup 2010"
  – First Runner Up: Zhang and Su, "Gradient Boosting Machines with Singular Value Decomposition"
  – Second Runner Up: BigChaos @ KDD, "Collaborative Filtering Applied to Educational Data Mining"
Outline
• What is Ensemble Learning?
• Why Ensemble?
• How good is Ensemble?
• What next?
Predictive Modeling
• Widely used in many applications:
  – Business: churn modeling, scoring
  – Science: chemometrics
  – Bio-Science: efficacy modeling, classification
  – Academics: admission selection, student performance
Predictive Modeling
[Workflow diagram: Training Set → Model Development → Predictive Rules; the rules are then applied to a New Data Set to produce a Prediction]
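For concreteness, here is a minimal sketch of that workflow in Python with scikit-learn; the synthetic data, the logistic-regression learner, and all variable names are illustrative assumptions, not part of the original slides.

```python
# Sketch of the slide's workflow: fit a model on the training set to obtain
# "predictive rules", then apply those rules to a new data set.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_new, y_train, y_new = train_test_split(X, y, test_size=0.3,
                                                  random_state=0)

model = LogisticRegression(max_iter=1000)  # model development
model.fit(X_train, y_train)                # learn the predictive rules
predictions = model.predict(X_new)         # prediction on the new data set
```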
Classical Approach: Model Selection
• Which one is the best?
What is Ensemble?
• Single Expert vs. Team of Experts
What is Ensemble?
[Diagram: Data Set → Training Set #1, Training Set #2, …, Training Set #k → Learning → Model #1, Model #2, …, Model #k → Combiner → Ensemble Prediction]
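The diagram can be read directly as code. Below is a from-scratch sketch, assuming NumPy, scikit-learn decision trees, and non-negative integer class labels; the bootstrap resampling and the majority-vote combiner are illustrative choices, not the slide's prescription.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def ensemble_predict(X_train, y_train, X_new, k=10, seed=0):
    """Fit k models on k resampled training sets and majority-vote."""
    rng = np.random.default_rng(seed)
    n = len(X_train)
    votes = []
    for _ in range(k):
        idx = rng.integers(0, n, size=n)   # Training Set #i (bootstrap sample)
        model = DecisionTreeClassifier()   # Learning -> Model #i
        model.fit(X_train[idx], y_train[idx])
        votes.append(model.predict(X_new))
    votes = np.asarray(votes, dtype=int)   # shape (k, n_new)
    # Combiner: majority vote over the k models, one test point per column
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```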
Types of Ensemble
• Hybrid Ensemble
  – Combining several different learning algorithms into one prediction
  – e.g., combining the results of a regression, a tree, a neural net, and a support vector machine (a minimal sketch follows below)
• Non-Hybrid Ensemble
  – Combining several learning models from the same algorithm into one prediction
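A minimal hybrid-ensemble sketch, assuming scikit-learn: three heterogeneous learners combined by soft voting. The specific estimators and settings are illustrative assumptions.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier

hybrid = VotingClassifier(
    estimators=[
        ('logreg', LogisticRegression(max_iter=1000)),
        ('tree', DecisionTreeClassifier(max_depth=5)),
        ('svm', SVC(probability=True)),  # probabilities needed for soft voting
    ],
    voting='soft',  # average predicted class probabilities across learners
)
# hybrid.fit(X_train, y_train); hybrid.predict(X_new)
```

A non-hybrid ensemble would instead feed several models of the same algorithm (e.g., k decision trees) to the combiner.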
Well-Known Ensembles
• Bagging
  – Generate learning models on bootstrap samples of the data
  – Aggregate the predictions via averaging or majority vote
• Boosting (AdaBoost)
  – Generate learning models sequentially, giving higher weight to 'difficult' cases
  – Combine the predictions using the model weights
• Random Forest
  – Similar to bagging, except that a random subset of features is selected when growing each learning model (sketches of all three follow below)
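All three are available off the shelf in scikit-learn; the sketch below shows one plausible configuration of each, with hyperparameter values that are illustrative assumptions.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)

# Bagging: each tree is fit on a bootstrap sample; predictions are
# aggregated by majority vote (or averaging, for regression)
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=100)

# AdaBoost: models are fit sequentially, re-weighting 'difficult' cases
boosting = AdaBoostClassifier(n_estimators=100)

# Random Forest: bagging plus a random feature subset at every split
forest = RandomForestClassifier(n_estimators=100, max_features='sqrt')
```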
Bagus Sartono
• Educational Background
  – Bachelor of Science in Statistics, IPB (2000)
  – Master of Science in Statistics, IPB (2004)
  – PhD in Applied Economics, University of Antwerp (2012)
• Professional Experience
  – Lecturer, Department of Statistics, IPB
  – Experienced trainer in analytics (Bank Indonesia, Bank Mandiri, Ganesha Cipta Informatika, CIFOR, LIPI, LPEM-UI, etc.)