Predicting food demand in food courts by decision tree approaches

For Full Paper:

http://www.sciencedirect.com/science/article/pii/S1877050910005004

Published in: Technology, Education


  1. Ahmet Selman Bozkır, Ebru Akçapınar Sezer. Hacettepe University, Computer Engineering Dept.
  2. • The importance of the subject
     • Data mining and the methods involved in the study
     • Data properties
     • Predicting food demand by decision trees
     • Conclusion & future directions
  3. Economics is the social science that analyzes the production, distribution, and consumption of goods and services. (Wikipedia)
  4. Data mining is the process of extracting hidden and useful information from raw data using statistics, AI, and machine learning techniques.
  5. Predictive Methods
     • Classification (Decision Trees, Bayesian Classification, etc.)
     • Regression (CART, Kernel Ridge Regression, etc.)
     • Artificial Neural Networks
     • Kernel-Based Methods (SVM, RVM, etc.)
     Descriptive Methods
     • Clustering (K-Means, Hierarchical Clustering, EM, etc.)
     • Association Rules (Apriori, GRI, etc.)
     [Figure panels: Decision Trees, SVM, ANN]
  6. Chi-squared Automatic Interaction Detection (CHAID)
     • Invented by Kass et al. in 1980
     • Designed for both classification and regression
     • Works on both categorical and numerical attributes
     • Uses Karl Pearson's χ² test as the splitting criterion
     • Uses multi-way splits
     • Missing value handling is provided
     • Avoids tree pruning
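The χ² splitting criterion named on this slide can be sketched as follows. This is a minimal illustration for a categorical target, not the full CHAID procedure (which also merges categories and adjusts p-values); the function name and the example counts are hypothetical and do not come from the study.

```python
from typing import Sequence


def pearson_chi2(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-squared statistic for a contingency table.

    Rows are the branches a candidate split would create,
    columns are the classes of the target variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if branch and class were independent.
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


# Made-up counts: rows = (holiday, weekday), columns = (low sales, high sales).
informative_split = [[30, 10], [10, 30]]
uninformative_split = [[10, 10], [20, 20]]
print(pearson_chi2(informative_split))    # larger statistic, stronger association
print(pearson_chi2(uninformative_split))  # zero, branches have identical class mix
```

A larger statistic means the candidate attribute separates the classes more strongly, so it is preferred as the split.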
  7. Classification and Regression Trees (CART)
     • Invented by Breiman et al. in 1984
     • Uses the binary recursive partitioning method
     • Designed for both classification and regression
     • Works on both categorical and numerical attributes
     • Uses the Gini measure as the splitting criterion
     • Uses two-way splits
     • Missing value handling is provided
     • Tree pruning is also provided
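The Gini measure and two-way split from this slide can be illustrated with a short sketch. It covers only a single numeric attribute and a categorical target; the function names, the calorie values, and the low/high labels are assumptions made for the example, not data from the study.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def gini(labels: Sequence[Hashable]) -> float:
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(
    values: Sequence[float], labels: Sequence[Hashable]
) -> Tuple[Optional[float], float]:
    """Return (threshold, weighted Gini) of the best split `value <= threshold`.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent impurity) if no split improves on the parent node.
    """
    n = len(values)
    best_threshold, best_score = None, gini(labels)
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        threshold = (low + high) / 2
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical menu calories and a made-up low/high demand label.
calories = [600, 650, 700, 900, 950, 1000]
demand = ["low", "low", "low", "high", "high", "high"]
print(best_binary_split(calories, demand))
```

Applying the same search recursively to each side of the chosen threshold gives the binary recursive partitioning the slide refers to.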
  8. Microsoft Decision Trees (MSDT)
     • Introduced by Microsoft in 1999
     • Designed for both classification and regression
     • Works on both categorical and numerical attributes
     • Offers entropy, Bayesian K2, and Bayesian Dirichlet Equivalent with Uniform prior as choices of splitting criterion
     • Uses multi-way splits and also supports binary splitting
     • Missing value handling is provided
     • Avoids tree pruning
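Of the three splitting criteria listed on this slide, the entropy option is the simplest to show. The sketch below scores a multi-way split on a discrete attribute by information gain; it is a generic textbook formulation, not Microsoft's implementation, and the day names and labels are made up.

```python
import math
from collections import Counter, defaultdict
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(attribute: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Entropy reduction from a multi-way split: one branch per attribute value."""
    branches = defaultdict(list)
    for value, label in zip(attribute, labels):
        branches[value].append(label)
    n = len(labels)
    remainder = sum(len(b) / n * entropy(b) for b in branches.values())
    return entropy(labels) - remainder


# Hypothetical example: the day name fully determines the (made-up) demand label.
day_name = ["Mon", "Tue", "Mon", "Tue"]
demand = ["low", "high", "low", "high"]
print(information_gain(day_name, demand))
```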
  9. The final data set covers a two-year period and contains 627 cases.

     General data

     | Variable name           | Type       | Usage   | Description                                                      |
     |-------------------------|------------|---------|------------------------------------------------------------------|
     | Day                     | Continuous | Input   | Day number, ranging from 1 to 31                                 |
     | Month                   | Continuous | Input   | Month number, ranging from 1 to 12                               |
     | Day Name                | Discrete   | Input   | Name of the day, ranging from Monday to Sunday                   |
     | Is Holiday              | Boolean    | Input   | Flag (0/1) denoting whether the day is a weekend/holiday or a weekday |
     | Calorie                 | Continuous | Input   | Total calorie amount of the menu                                 |
     | Food1                   | Discrete   | Input   | First food name in the menu                                      |
     | Food2                   | Discrete   | Input   | Second food name in the menu                                     |
     | Food3                   | Discrete   | Input   | Third food name in the menu                                      |
     | Food4                   | Discrete   | Input   | Fourth food name in the menu                                     |
     | Sales-Student-Lunch     | Continuous | Predict | Number of sales for students in the lunch session                |
     | Sales-Students-Dinner   | Continuous | Predict | Number of sales for students in the dinner session               |
     | Sales-Academic-Lunch    | Continuous | Predict | Number of sales for academic staff in the lunch session          |
     | Sales-Officials-Lunch   | Continuous | Predict | Number of sales for officers in the lunch session                |
     | Sales-Contractual-Lunch | Continuous | Predict | Number of sales for contractual employees in the lunch session   |
  10. Target:
      • Building a predictive decision tree model to predict menu-based consumption amounts
      • Identifying the factors affecting consumption for each type of customer
      Applied Techniques: Microsoft Decision Trees, CART, CHAID
      Software Platform: SPSS Clementine 12 and Microsoft Analysis Services
  11. Target Variables: Student-Lunch, Students-Dinner, Academic-Lunch, Officials-Lunch, Contractual-Lunch
      Train/test data ratio: 85% / 15%
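To make the 85% / 15% ratio concrete for the 627 cases, here is a minimal sketch of a split over case indices. The slides do not say how the cases were assigned to each partition, so the random shuffle, the seed, and the function name are assumptions.

```python
import random
from typing import List, Tuple


def train_test_split_indices(
    n_cases: int, train_ratio: float = 0.85, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Shuffle case indices and cut them into train and test partitions."""
    indices = list(range(n_cases))
    # Assumption: a seeded random shuffle; a chronological cut would also be plausible.
    random.Random(seed).shuffle(indices)
    n_train = round(n_cases * train_ratio)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split_indices(627)
print(len(train_idx), len(test_idx))  # 533 training cases, 94 test cases
```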
  12. | Customer Type                       | MSDT  | CART  | CHAID |
      |-------------------------------------|-------|-------|-------|
      | Student-Lunch Session               | 0.800 | 0.729 | 0.800 |
      | Student-Dinner Session              | 0.585 | 0.549 | 0.525 |
      | Officers-Lunch Session              | 0.627 | 0.673 | 0.721 |
      | Academic Staff-Lunch Session        | 0.616 | 0.506 | 0.595 |
      | Contractual Employees-Lunch Session | 0.722 | 0.741 | 0.838 |
      | Average                             | 0.670 | 0.639 | 0.695 |
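The Average row can be reproduced from the five per-session scores. The sketch below works in integer thousandths to avoid floating-point noise. The reported averages match the column means truncated (not rounded) to three decimals; that is an observation about the numbers, not something the slides state.

```python
from typing import Dict, List

# Per-session scores from the table above, in thousandths (0.800 -> 800).
# Row order: Student-Lunch, Student-Dinner, Officers-Lunch,
# Academic Staff-Lunch, Contractual Employees-Lunch.
scores: Dict[str, List[int]] = {
    "MSDT": [800, 585, 627, 616, 722],
    "CART": [729, 549, 673, 506, 741],
    "CHAID": [800, 525, 721, 595, 838],
}


def truncated_mean_thousandths(values: List[int]) -> int:
    """Mean of the values, truncated to a whole number of thousandths."""
    return sum(values) // len(values)


for name, values in scores.items():
    mean = truncated_mean_thousandths(values)
    print(f"{name}: 0.{mean:03d}")
```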
  13. • Food consumption prediction is important for several reasons
      • Decision trees are a good candidate for modelling food consumption in food courts
      • CHAID achieves the best overall accuracy. But is it perfect?
      • Multi-way split decision trees (e.g., MSDT) also give better predictions in some situations
      • For evaluating and interpreting models, multi-way split decision tree algorithms are well suited to the problem
      • More data should improve the results
      • A decision-tree-powered, web-based decision support system is feasible and valuable
  14. Thanks for listening... Questions?
