Classification
Validation and testing
Association rule and evaluation
Week 12
Classification
- Classification methods assign a categorical outcome to one of two or more categories based on various data attributes
- Each record in a database has a categorical variable of interest (e.g., purchase or not purchase, high risk or no risk) and a number of additional predictor variables (age, income, gender, education)
- For a given set of predictor values, the goal is to assign the best value of the categorical variable
Application
Benefits to Supply Chain
Classification
▪ The model (classifier) is learned by finding patterns in the training set
▪ Performance on the training set does not necessarily indicate how well the model generalizes
▪ A validation set (a subset held out of the training set) is used to tune the classifier's hyperparameters and architecture and to estimate its error
▪ For the model to generalize, the validation set must be representative of the input instances
▪ Because the test set is never used during training, it provides an unbiased estimate of the generalization error
Using Training and Validation Data
- Most data-mining projects use large volumes of data
- Before building a model, partition the data into a training data set and a validation data set
- Training data sets have known outcomes and are used to “teach” a data-mining algorithm
- To get a more realistic estimate of how the model would perform with unseen data, set aside part of the original data as a validation data set and do not use it in the training process
- The validation data set is often used to fine-tune models
- Because the final model is chosen using the validation data set, its accuracy on that set is still an optimistic estimate of how it would perform with unseen data
- Data miners therefore often set aside another portion of the data, which is used neither in training nor in validation. This set is known as the test data set
- The accuracy of the model on the test data gives a realistic estimate of the model's performance on completely unseen data
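The three-way partition described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_dataset`, the 60/20/20 proportions, and the fixed random seed are assumptions made for the example, not values taken from the slides.

```python
import random


def split_dataset(records, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the records and cut them into training, validation, and test sets.

    The fractions are illustrative defaults (an assumption); whatever is left
    after the training and validation portions becomes the test set.
    """
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must leave room for a test set")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # never touched while fitting or tuning
    return train, validation, test


# Hypothetical data: 100 record ids standing in for real database rows.
train, validation, test = split_dataset(range(100))
```

The model is fitted on `train`, tuned and compared on `validation`, and scored once on `test` at the very end.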
Classification: Training-Validation split
Cross validation
Evaluation metrics
Overfitting
- Overfitting occurs when a model performs very well on the training data but does not generalize to the test data
- The model memorizes the training data instead of learning the underlying function
- It typically arises when the model has too much freedom (many parameters with wide ranges)
Classifier types
K-nearest neighbor
Decision tree
- Fundamentally, an if-then rule set for classifying objects
- Builds the model in the form of a tree structure
- To classify a test instance x, traverse the tree from root to leaf
- Take branches at internal nodes according to the results of their tests
- Predict the class label at the leaf node reached
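The root-to-leaf traversal can be sketched as follows. The tree here is built by hand purely for illustration: the `Node` and `Leaf` classes, the field names, and both thresholds are assumptions, and no tree-learning algorithm is shown.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    label: str  # class label predicted when this leaf is reached


@dataclass
class Node:
    test: Callable[[dict], bool]      # test applied at this internal node
    if_true: Union["Node", Leaf]      # branch taken when the test passes
    if_false: Union["Node", Leaf]     # branch taken when the test fails


def classify(tree, x):
    """Walk from the root to a leaf, following the outcome of each node's test."""
    node = tree
    while isinstance(node, Node):
        node = node.if_true if node.test(x) else node.if_false
    return node.label


# Hand-built toy tree with made-up thresholds (not learned from data).
toy_tree = Node(
    test=lambda x: x["credit_score"] > 640,
    if_true=Leaf("approve"),
    if_false=Node(
        test=lambda x: x["years_of_credit_history"] > 10,
        if_true=Leaf("approve"),
        if_false=Leaf("reject"),
    ),
)
```

Each path from the root to a leaf reads as one if-then rule, which is why a tree and a rule set are interchangeable views of the same classifier.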
Classification methods
- In this database, the categorical variable of interest is the decision to approve or reject a credit application
- The remaining variables are the predictor variables
- Code the Homeowner and Decision fields numerically: Homeowner “Y” as 1 and “N” as 0; similarly, Decision “Approve” as 1 and “Reject” as 0
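A minimal sketch of this numeric coding, assuming each record is a Python dictionary with `Homeowner` and `Decision` keys. The sample record and its other field names are hypothetical.

```python
# Coding scheme from the slide: Y/N -> 1/0 and Approve/Reject -> 1/0.
HOMEOWNER_CODES = {"Y": 1, "N": 0}
DECISION_CODES = {"Approve": 1, "Reject": 0}


def encode_record(record):
    """Return a copy of the record with the two text fields coded numerically."""
    encoded = dict(record)  # leave the caller's record untouched
    encoded["Homeowner"] = HOMEOWNER_CODES[record["Homeowner"]]
    encoded["Decision"] = DECISION_CODES[record["Decision"]]
    return encoded


# Hypothetical record; the values are made up for illustration.
sample = {"Homeowner": "Y", "Credit Score": 725, "Decision": "Approve"}
encoded_sample = encode_record(sample)
```

Looking the codes up in a dictionary means an unexpected value such as "Maybe" raises a `KeyError` instead of being silently coded.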
Example
- To develop an intuitive understanding of classification, consider only the credit score and years of credit history as predictor variables
- In the chart, the large bubbles represent applicants whose credit applications were rejected; the small bubbles represent those that were approved
- There appears to be a clear separation of the points
- Applications with a credit score greater than 640 were approved, but most applications with credit scores of 640 or less were rejected
- This suggests a simple classification rule: approve an application with a credit score greater than 640
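The single-variable rule is simple enough to write down directly. The function and constant names below are illustrative; only the 640 cutoff comes from the slide.

```python
CREDIT_SCORE_CUTOFF = 640  # threshold read off the chart in the slide


def classify_by_score(credit_score):
    """Approve only when the credit score is strictly greater than the cutoff."""
    return "Approve" if credit_score > CREDIT_SCORE_CUTOFF else "Reject"
```

A score of exactly 640 is rejected, matching the wording "greater than 640".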
Example
- Another way of classifying the groups is to use both the credit score and years of credit history, visually drawing a straight line that separates the groups
- This line passes through the points (763, 2) and (595, 18). With a little algebra, the equation of the line is
years = −0.095 × credit score + 74.66
- This gives a different classification rule: whenever
years + 0.095 × credit score ≤ 74.66,
the application is rejected; otherwise, it is approved.
y = mx + c
m = (y2 − y1)/(x2 − x1)
y − y1 = m(x − x1)
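The algebra behind the separating line can be checked with a short sketch. It takes the two points quoted in the slide, (763, 2) and (595, 18), with credit score on the x-axis and years of credit history on the y-axis. The function names are assumptions, and the unrounded slope and intercept differ slightly from the rounded values −0.095 and 74.66 used in the slide.

```python
def line_through(p1, p2):
    """Return slope m and intercept c of the line y = m*x + c through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)
    c = y1 - m * x1
    return m, c


# Points read from the slide: (credit score, years of credit history).
m, c = line_through((763, 2), (595, 18))


def classify_by_line(credit_score, years):
    """Reject applicants on or below the line, approve those above it.

    Equivalent to the slide's rule: reject when years + 0.095 * score <= 74.66,
    except that the unrounded slope and intercept are used here.
    """
    return "Reject" if years - m * credit_score <= c else "Approve"
```

Applicants with a lower credit score therefore need a longer credit history to land on the approve side of the line.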
Classifying New Data
- The purpose of developing a classification model is to be able to classify new data. After a classification scheme is chosen and the best model is developed from existing data, use the predictor variables of each new record as inputs to the model to predict the output
- How would the simple credit-score rule (a score of more than 640 is needed) classify the new applicants?
- How would the second rule, which uses both the credit score and years of credit history, classify them?
Discriminant analysis
- It is a technique for classifying a set of observations into predefined classes
- The purpose is to determine the class of an observation based on a set of predictor variables
- Based on the training data set, the technique constructs a set of linear functions of the predictors, known as discriminant functions, which have the form:
L = b1X1 + b2X2 + … + bnXn + c
- where the b’s are weights, or discriminant coefficients,
- the X’s are the input variables, or predictors, and c is a constant or the intercept
- Weights are determined by maximizing the between-group variance relative to the within-group variance
Classifying Credit Decisions Using Discriminant Analysis
- The discriminant analysis procedure incorporates prior assumptions about
how frequently the different classes occur. Three options:
- According to relative occurrences in training data: This option assumes
that the probability of encountering a particular category is the same as the
frequency with which it occurs in the training data
- Use equal prior probabilities: This option assumes that all categories occur
with equal probability
- User-specified prior probabilities: This option is available only if the output variable has two categories
Classifying Credit Decisions Using Discriminant Analysis
- The procedure produces a classification (discriminant) function for each of the two categories. For category 1 (approve the loan application), the discriminant function is
L(1) = −137.48 + 32.295 × homeowner + 0.286 × credit score + 0.833 × years of credit history + 0.00010274 × revolving balance + 128.248 × revolving utilization
- For category 0 (reject the loan application), the discriminant function is
L(0) = −157.2 + 30.747 × homeowner + 0.289 × credit score + 0.473 × years of credit history + 0.0004716 × revolving balance + 167.7 × revolving utilization
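To see how the two functions would be used, the sketch below evaluates both scores for one applicant and assigns the category with the larger score, which is the usual convention for classification functions. The coefficients are copied from the slide. The dictionary keys, the sample applicant, and the assumption that revolving utilization is entered as a fraction (0.2 rather than 20) are illustrative.

```python
# Coefficients copied from the slide; keys are illustrative field names.
COEFFICIENTS = {
    1: {"intercept": -137.48, "homeowner": 32.295, "credit_score": 0.286,
        "years": 0.833, "revolving_balance": 0.00010274,
        "revolving_utilization": 128.248},
    0: {"intercept": -157.2, "homeowner": 30.747, "credit_score": 0.289,
        "years": 0.473, "revolving_balance": 0.0004716,
        "revolving_utilization": 167.7},
}
PREDICTORS = ["homeowner", "credit_score", "years",
              "revolving_balance", "revolving_utilization"]


def discriminant_score(applicant, category):
    """Evaluate L(category) = intercept + sum of coefficient * predictor value."""
    coef = COEFFICIENTS[category]
    return coef["intercept"] + sum(coef[name] * applicant[name] for name in PREDICTORS)


def classify(applicant):
    """Assign the category (1 = approve, 0 = reject) with the higher score."""
    return max((0, 1), key=lambda category: discriminant_score(applicant, category))


# Hypothetical applicant; utilization is assumed to be a fraction, not a percent.
applicant = {"homeowner": 1, "credit_score": 700, "years": 10,
             "revolving_balance": 10000, "revolving_utilization": 0.2}
decision = classify(applicant)
```

Only the difference between the two scores matters for the decision, so predictors with nearly equal coefficients in L(1) and L(0), such as credit score here, have little influence on it.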
Discriminant analysis
- Like many statistical procedures, discriminant analysis requires certain
assumptions, such as normality of the independent variables
- The normality assumption is often violated in practice, but the method is
generally robust to violations of the assumptions
- Logistic regression, a technique we will study later, does not rely on these assumptions, which makes it the preferred method of many analytics practitioners
Association Rule
- Association rule mining, often called affinity analysis, seeks to uncover interesting associations and/or correlations among items in large data sets. Association rules identify attributes that frequently occur together in a given data set
- A typical and widely used example of association rule mining is market
basket analysis.
Market basket analysis
- For example, supermarkets routinely use bar-code scanners to record the items a customer buys in a single purchase transaction
- Such databases consist of a large number of transaction records
- Managers would be interested to know if certain groups of items are
consistently purchased together
- They could use these data for adjusting store layouts (placing items
optimally with respect to each other), for cross-selling, for promotions, for
catalog design, and to identify customer segments based on buying
patterns
- Association rule mining is how companies such as Netflix and Amazon.com make recommendations based on past movie rentals or item purchases
MARKET BASKET ANALYSIS
• INPUT: list of purchases by purchaser
  • do not have names
• identify purchase patterns
  • what items tend to be purchased together
    • obvious: steak-potatoes; beer-pretzels
  • what items are purchased sequentially
    • obvious: house-furniture; car-tires
  • what items tend to be purchased by season
Market Basket Analysis
• Categorize customer purchase behavior
• identify actionable information
• purchase profiles
• profitability of each purchase profile
• use for marketing
• layout or catalogs
• select products for promotion
• space allocation, product placement
Market Basket Analysis
• Market Basket Benefits
• selection of promotions, merchandising strategy
• sensitive to price: Italian entrees, pizza, pies, Oriental entrees, orange juice
• uncover consumer spending patterns
• correlations: orange juice & waffles
• joint promotional opportunities
Market Basket Analysis
• Retail outlets
• Telecommunications
• Banks
• Insurance
  • link analysis for fraud
• Medical
  • symptom analysis
Market Basket Analysis
• Chain Store Age Executive (1995)
1) Associate products by category
2) what % of each category was in each market basket
• Customers shop on personal needs, not on product groupings
Possible Market Baskets
Customer 1: beer, pretzels, potato chips, aspirin
Customer 2: diapers, baby lotion, grapefruit juice, baby food, milk
Customer 3: soda, potato chips, milk
Customer 4: soup, beer, milk, ice cream
Customer 5: soda, coffee, milk, bread
Customer 6: beer, potato chips
Co-occurrence Table
             Beer  Pot. Chips  Milk  Diapers  Soda
Beer            3           2     1        0     0
Pot. Chips      2           3     1        0     1
Milk            1           1     4        1     2
Diapers         0           0     1        1     0
Soda            0           1     2        0     2
beer & potato chips - makes sense
milk & soda - probably noise
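The co-occurrence counts can be rebuilt directly from the six baskets listed earlier. This is a minimal sketch: the function name and the choice to store counts in a `Counter` keyed by item pairs are assumptions. The diagonal entry for an item is the number of baskets that contain it.

```python
from collections import Counter
from itertools import combinations

# The six baskets from the "Possible Market Baskets" slide.
baskets = [
    ["beer", "pretzels", "potato chips", "aspirin"],
    ["diapers", "baby lotion", "grapefruit juice", "baby food", "milk"],
    ["soda", "potato chips", "milk"],
    ["soup", "beer", "milk", "ice cream"],
    ["soda", "coffee", "milk", "bread"],
    ["beer", "potato chips"],
]
ITEMS = ["beer", "potato chips", "milk", "diapers", "soda"]


def co_occurrence(baskets, items):
    """Count, for every pair of tracked items, the baskets containing both."""
    counts = Counter()
    for basket in baskets:
        present = sorted(set(basket) & set(items))
        for item in present:
            counts[(item, item)] += 1  # diagonal: baskets containing the item
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1        # keep the table symmetric
            counts[(b, a)] += 1
    return counts


counts = co_occurrence(baskets, ITEMS)

# Print the table row by row in the same item order as the slide.
for row in ITEMS:
    print(row.ljust(12), [counts[(row, col)] for col in ITEMS])
```

Because each pair is counted in both directions, the resulting table is symmetric by construction, which makes it a quick check on a hand-built table.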
Purchase Profiles
• Beauty conscious
  • cotton balls
  • hair dye
  • perfumes
  • nail polish
Purchase Profiles
• Each profile has an average profit per basket
• Kids’ fashion $15.24 (push these)
• Men’s fashion $13.41 (push these)
• ….
• Smoker $2.88 (don’t push these)
• Student/home office $2.55 (don’t push these)
Market Basket Analysis
• Affinity Positioning
  • coffee, coffee makers in close proximity
• Cross-Selling
  • cold medicines, orange juice
Market Basket Analysis
• LIMITATIONS
• takes over ~18 months to implement
• market basket analysis only identifies hypotheses, which need to be tested
• neural network, regression, decision tree analyses
• measurement of impact needed
• difficult to identify product groupings
• complexity grows exponentially
Market Basket Analysis
• BENEFITS:
• simple computations
• can be undirected (don’t have to have hypotheses before analysis)
• different data forms can be analyzed

Supply Chain Analytics, Supply Chain Management, Supply Chain Data Analytics
