Statistics:
Your Most Important Class
Aaron Sankey
About Me
• Started CEMBA in 2012, switched to part-time after year 1
• Graduated from Carlson in Spring of 2014
• VP, Team Manager at Bank of America within the IT department
• By graduation, presented three stats-based projects, each of which should
improve net income by $1M to $25M with an investment of < $500K
Statistics can be immediately monetized!
Sales and Marketing
The basic idea:
Customer purchasing should be predictable based on other customers’ past
purchasing
Possible independent variables for regression:
• Frequency of purchase (of any product, or of each product)
• Total purchases (normalized by corporate earnings or zip code
average income)
• Days since last purchase
• Preferred contact method
• Advertisement used
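As a minimal sketch of the regression idea above, the following fits ordinary least squares on two of the listed variables (frequency of purchase and days since last purchase). The data, the choice of dependent variable (next-quarter spend), and the helper names `solve` and `fit_ols` are all hypothetical; a real project would use a statistics package that also reports p-values.

```python
# Illustrative only: made-up customers, pure-Python least squares.

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(rows, y):
    """Return [intercept, beta_1, ...] from the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

# (purchases per year, days since last purchase) for six hypothetical customers
customers = [(12, 10), (2, 200), (6, 45), (1, 300), (9, 20), (4, 90)]
# Synthetic, noise-free spend so the fit should recover 50, 20 and -0.1
spend = [50 + 20 * f - 0.1 * d for f, d in customers]

coefs = fit_ols(customers, spend)
print([round(c, 3) for c in coefs])
```

Because the synthetic spend has no noise, the fitted betas match the numbers used to generate it; real customer data would not be this clean.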
Sales and Marketing
• Remember: Use samples and verify on the whole!
• Use “clustering”, if you can, to identify similar customers:
http://www.jmp.com/support/help/K-Means_Clustering.shtml
http://www.jmp.com/support/help/Hierarchical_Clustering.shtml#110036
• Correlation will identify customer targets with higher sales closure
rates and, conversely, targets that are not profitable
• Acceptable p-values and large betas on “cross products” of
independent variables (i.e. $Y = \beta x_i x_j$) could indicate product
synergies/interactions
New York Times, February 19, 2012 (About Target):
“Psst, You in Aisle 5”
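The cross-product bullet on this slide can be sketched as follows: build the product of two independent variables and regress on it. The purchase flags, the spend figures, and the helper `simple_ols` are hypothetical, and this sketch estimates only the beta; judging whether its p-value is acceptable would need a statistics package.

```python
# Illustrative only: a single cross-product term, fitted in closed form.

def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical flags: 1 if the customer bought product A / product B
bought_a = [0, 0, 1, 1, 0, 1, 1, 0]
bought_b = [0, 1, 0, 1, 1, 1, 0, 0]
# Synthetic spend: a baseline of 10, plus 30 only when both were bought
spend = [10 + 30 * a * b for a, b in zip(bought_a, bought_b)]

cross = [a * b for a, b in zip(bought_a, bought_b)]  # the x_i * x_j term
intercept, beta = simple_ols(cross, spend)
print(intercept, beta)
```

A large beta on `cross`, as here, is the pattern the slide describes as a possible product synergy.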
Project Management
The basic idea:
Actual Project Cost should be a function of, at least, Project Estimate
Possible independent variables for regression:
• Estimated project cost
• Percentage of work done by contractors and contractor hourly rate
(normalized by employee salary)
• How many silos/which silos are involved
• Expected duration (calendar time or hours of work) of the effort
• Implementing standard tools vs. customization
Project Management
Possible results:
• Little to no correlation between estimates and actuals
– Estimation process is a waste of money!
• Reasonable correlation
– Identify subsets where correlation is weaker than most and improve
estimation process
• High correlation
– Could provide possible areas for improvement (look for high betas)
– Could replace/augment portions of the estimation process (enter in all of the
independent variables and generate results)
– Could also mean “cooked” numbers
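A rough sketch of how the three outcomes above might be told apart: compute the Pearson correlation between estimates and actuals and bucket it. The project figures and the 0.3 / 0.8 cut-offs are assumptions chosen for illustration, not thresholds from the presentation.

```python
# Illustrative only: made-up project costs (in $K) and arbitrary cut-offs.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def describe(r, weak=0.3, strong=0.8):
    """Map a correlation to one of the slide's three outcomes."""
    if abs(r) < weak:
        return "little to no correlation"
    if abs(r) < strong:
        return "reasonable correlation"
    return "high correlation"

estimates = [100, 250, 400, 150, 800, 300]
actuals = [130, 240, 520, 210, 900, 310]

r = pearson(estimates, actuals)
print(round(r, 3), describe(r))
```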
Project Management
Given reasonable or better correlation, expected return on the project,
and identified confidence intervals
• Avoid projects that would otherwise be taken on without statistical analysis
– If the return for the project is too small to justify the undertaking given a
broad confidence interval, do not do the project
• Take on projects that normally would be skipped
– If the confidence intervals are very narrow, the estimate should be
considered “a lock” and the ROI requirements can be less stringent
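One way to sketch the decision rule above: regress actual cost on estimated cost, build an approximate 95% prediction interval for a new project, and compare the expected return with the high end of that interval. The data, the function names, and the use of a normal quantile in place of the t distribution are all simplifying assumptions.

```python
# Illustrative only: made-up history, normal approximation to the interval.
from statistics import NormalDist

def prediction_interval(est, act, new_est, level=0.95):
    """Return (low, predicted, high) actual cost for a new estimate."""
    n = len(est)
    mx, my = sum(est) / n, sum(act) / n
    sxx = sum((x - mx) ** 2 for x in est)
    slope = sum((x - mx) * (y - my) for x, y in zip(est, act)) / sxx
    intercept = my - slope * mx
    resid = [y - (intercept + slope * x) for x, y in zip(est, act)]
    s = (sum(e * e for e in resid) / (n - 2)) ** 0.5  # residual std. error
    z = NormalDist().inv_cdf(0.5 + level / 2)         # about 1.96 for 95%
    pred = intercept + slope * new_est
    half = z * s * (1 + 1 / n + (new_est - mx) ** 2 / sxx) ** 0.5
    return pred - half, pred, pred + half

def worth_taking(expected_return, cost_high):
    """Hypothetical rule: proceed only if return beats the worst-case cost."""
    return expected_return > cost_high

estimates = [100, 200, 300, 400, 500, 600]  # $K, past projects
actuals = [120, 210, 340, 410, 560, 630]    # $K, what they really cost

low, pred, high = prediction_interval(estimates, actuals, 350)
print(round(low), round(pred), round(high), worth_taking(450, high))
```

A broad interval pushes `high` up and rejects marginal projects; a narrow one lets a project through with a smaller margin, which is the trade-off the slide describes.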
Project Management: Case Study
Implemented at a Fortune 100 Firm
• Large areas of low correlation
• The pool of independent variables was limited by data availability
and politics
• Instead of a statistician, an expensive, automated software package
was used
– No second-order variables and no cross products (software limitation)
– No discretion in p-value measurement ( 0.051 gets rejected just as firmly as a far larger value )
– High investment leads to sunk-cost fallacy, so statistical solutions are not
being investigated and root cause of low correlation isn’t getting identified
Develop a New Offering
MIT Sloan Management Review, Winter 2004:
“The Seller’s Hidden Advantage”
Toyota:
Benchmarked all of its suppliers and made them all more efficient, which
made the suppliers more competitive, which resulted in better prices for
Toyota
Orica:
Developed a 20-variable model from customer use of their explosives that
made each subsequent customer more accurate in their purchase and use
of Orica explosives
Develop a New Offering
IT Consulting Firm:
Benchmark your clients’ IT services
• Examine common services provided by each client – this is very
different and more difficult than manufacturing!
• Build a model based on available factors:
– Number of employees, locations, costs, level of service, etc.
• Results are a great starting point, but aren’t the holy grail
– Statistically suggesting costs are above benchmark prediction could be
indicative of a level of service not provided at other clients – but it could
also mean that there is inefficiency afoot
Tips and Tricks
1. Find out where the business unit or company makes or spends a
great deal of money
2. Find out what data can be had
3. Build a model on a sample if data is hard to get or is large
4. Ask for funding and justify with new, interesting results
5. Use the project in this class
6. Avoid using statistics terms (95% confident, Regression, etc.)
7. Expect surprising ignorance
The End
And they all lived happily, ever after…
Appendix: Clustering
Clustering data is using an algorithm to break a large data set into
smaller data sets:
This data set splits well into two clusters – it isn’t likely that real-life
data sets will be this contrived
Appendix: Clustering
Regression will not paint a good picture of the data as a whole:
Splitting the data into the appropriate clusters can lead to more
accurate modeling
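The point of these two appendix slides can be sketched with a contrived data set: one regression over everything fits poorly, while a regression per cluster fits well. The points, the deterministic two-centroid start, and the helper names are assumptions for illustration; JMP's own clustering routines are not reproduced here.

```python
# Illustrative only: contrived points, a bare-bones 2-means, R^2 per group.

def r_squared(points):
    """R^2 of a simple linear regression of y on x (equals Pearson r^2)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy * sxy / (sxx * syy)

def two_means(points, iterations=20):
    """Split points into two groups with Lloyd's algorithm.

    Starts from the smallest-x and largest-x points (an arbitrary choice)
    and assumes neither group ever becomes empty.
    """
    centers = [min(points), max(points)]
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            groups[d.index(min(d))].append(p)
        new_centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
            for g in groups
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return groups

rising = [(x, 2 * x + 1) for x in range(1, 6)]     # y climbs with x
falling = [(x, 30 - 2 * x) for x in range(11, 16)]  # y drops with x
data = rising + falling

groups = two_means(data)
print(round(r_squared(data), 3), [round(r_squared(g), 3) for g in groups])
```

As the slide warns, real data will rarely separate this neatly, so the per-cluster fits should be checked against a holdout sample.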
Appendix: Why Brains Beat Tools
The process to implement a data mining/business intelligence tool:
1. Collect and organize data – usually in a repeatable, programmatic
(automatic) fashion
2. Purchase licenses and install and configure tool set – usually start
with a sample of the data from step 1
3. Examine results and tune tools
4. Act on results
Before any of this happens, a statistician should look to see if there are
actionable data relationships – steps 1 and 2 are very expensive!!
Appendix: Why Brains Beat Tools
Possible reasons for detecting a weak relationship:
1. Software does not perform clustering
2. Software does not examine 2nd order or cross product factors
3. Software incorrectly acts on multicollinearity
4. Tool set is improperly tuned/configured
5. Data aggregation mechanism is not functioning properly
6. The data is too random
Without a statistician, all six of these reasons look the same!
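Reason 2 in the list above can be illustrated with a small sketch: a relationship that looks nonexistent to a linear-only tool becomes obvious once a second-order term is tried. The data is synthetic and the helper `pearson` is a hypothetical stand-in for whatever the tool computes.

```python
# Illustrative only: y depends on x squared, so the linear correlation is zero.

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [-3, -2, -1, 0, 1, 2, 3]
y = [v * v for v in x]             # purely second-order relationship

r_linear = pearson(x, y)           # what a first-order-only tool sees
x_squared = [v * v for v in x]     # the second-order variable
r_second_order = pearson(x_squared, y)

print(r_linear, r_second_order)
```

To a tool that only reports the first number, this data is indistinguishable from noise, which is the slide's point about the six reasons looking the same.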
