Statistics:
Your Most Important Class
Aaron Sankey
About Me
• Started CEMBA in 2012, switched to part-time after year 1
• Graduated from Carlson in Spring of 2014
• VP, Team Manager at Bank of America within the IT department
• By graduation, presented three stats-based projects, each expected to
improve net income by $1M to $25M on an investment of < $500K
Statistics can be immediately monetized!
Sales and Marketing
The basic idea:
Customer purchasing should be predictable based on other customers’ past
purchasing
Possible independent variables for regression:
• Frequency of purchase (of any product, or of each product)
• Total purchases (normalized by corporate earnings or zip code
average income)
• Days since last purchase
• Preferred contact method
• Advertisement used
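A minimal sketch of how a few of these variables might be derived from raw purchase records before any regression is run. The `Purchase` record, the `customer_features` helper, and all of the data are hypothetical, and average income is keyed by customer here only to keep the example short.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Purchase:
    customer_id: str
    day: date
    amount: float


def customer_features(purchases, avg_income, as_of):
    """Build one row of candidate independent variables per customer."""
    by_customer = {}
    for p in purchases:
        by_customer.setdefault(p.customer_id, []).append(p)

    features = {}
    for cid, rows in by_customer.items():
        total = sum(r.amount for r in rows)
        last_day = max(r.day for r in rows)
        features[cid] = {
            "frequency": len(rows),                      # number of purchases
            "total_norm": total / avg_income[cid],       # total, normalized by income
            "days_since_last": (as_of - last_day).days,  # recency
        }
    return features


# Hypothetical data, for illustration only.
purchases = [
    Purchase("A", date(2014, 1, 10), 200.0),
    Purchase("A", date(2014, 3, 1), 300.0),
    Purchase("B", date(2014, 2, 15), 50.0),
]
avg_income = {"A": 50_000.0, "B": 25_000.0}  # assumed zip code average income
features = customer_features(purchases, avg_income, as_of=date(2014, 3, 31))
print(features)
```

Categorical variables such as preferred contact method or advertisement used would need to be encoded as indicator columns before they can enter a regression.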
Sales and Marketing
• Remember: Use samples and verify on the whole!
• Use “clustering”, if you can, to identify similar customers:
http://www.jmp.com/support/help/K-Means_Clustering.shtml
http://www.jmp.com/support/help/Hierarchical_Clustering.shtml#110036
• Correlation will identify customer targets with higher sales closure
rates and, conversely, targets that are not profitable
• Acceptable p-values and large betas on “cross products” of
independent variables (i.e., $Y = \beta x_i x_j$) could indicate product
synergies/interactions
New York Times, February 19, 2012 (About Target):
“Psst, You in Aisle 5”
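The cross-product idea on this slide can be sketched as an ordinary least-squares fit with an added x1*x2 column. This is an illustration under stated assumptions, not the author's method: the data are made up with a known interaction so the fit can be checked, the helper names `solve` and `ols` are hypothetical, and the sketch estimates betas only. The p-values would come from a statistics package such as JMP.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares betas from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical purchase counts of two products per customer.
x1 = [0, 1, 2, 3, 0, 1, 2, 3]
x2 = [0, 0, 1, 1, 2, 2, 3, 3]
# Sales constructed with a known interaction: y = 5 + 1*x1 + 2*x2 + 3*x1*x2
y = [5 + a + 2 * b + 3 * a * b for a, b in zip(x1, x2)]

# Design matrix columns: intercept, x1, x2, and the cross product x1*x2.
design = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
betas = ols(design, y)
print([round(b, 6) for b in betas])
```

A large beta on the last column, together with an acceptable p-value, is the signal the slide describes as a possible product synergy.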
Project Management
The basic idea:
Actual Project Cost should be a function of, at least, Project Estimate
Possible independent variables for regression:
• Estimated project cost
• Percentage of work done by contractors and contractor hourly rate
(normalized by employee salary)
• How many silos/which silos are involved
• Expected duration (calendar time or hours of work) of the effort
• Implementing standard tools vs. customization
Project Management
Possible results:
• Little to no correlation between estimates and actuals
– Estimation process is a waste of money!
• Reasonable correlation
– Identify subsets where correlation is weaker than most and improve
estimation process
• High correlation
– Could provide possible areas for improvement (look for high betas)
– Could replace/augment portions of the estimation process (enter in all of the
independent variables and generate results)
– Could also mean “cooked” numbers
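As a rough illustration of measuring the relationship between estimates and actuals, the sketch below fits a simple regression of actual cost on estimated cost and reports the correlation. The project figures and the `fit_line` helper are hypothetical. On Python 3.10+ the standard library's `statistics.linear_regression` and `statistics.correlation` give the same slope, intercept, and r.

```python
from math import sqrt
from statistics import fmean


def fit_line(x, y):
    """Return (slope, intercept, r) for a least-squares regression of y on x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx, sxy / sqrt(sxx * syy)


# Hypothetical completed projects, in $K.
estimate = [100, 250, 400, 650, 900]
actual = [130, 290, 520, 700, 1150]

slope, intercept, r = fit_line(estimate, actual)
print(f"actual ~ {intercept:.1f} + {slope:.2f} * estimate, r = {r:.3f}")
```

Where to draw the line between "little", "reasonable", and "high" correlation is a judgment call for the analyst; the sketch deliberately does not hard-code thresholds.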
Project Management
Given reasonable or better correlation, expected return on the project,
and identified confidence intervals
• Avoid projects that would be taken without statistical analysis
– If the return for the project is too small to justify the undertaking given a
broad confidence interval, do not do the project
• Take on projects that normally would be skipped
– If the confidence intervals are very narrow, the estimate should be
considered “a lock” and the ROI requirements can be less stringent
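One possible way to turn this into a decision rule is sketched below, under several assumptions: a simple regression of actual cost on estimate, a normal approximation to the interval (a t quantile would be more appropriate for so few projects), and a deliberately conservative rule that a project is taken only when its expected benefit exceeds the upper cost bound. The data and the helper names `cost_interval` and `worth_doing` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def cost_interval(x, y, x_new, confidence=0.95):
    """Approximate prediction interval (low, point, high) for actual cost."""
    n = len(x)
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s = sqrt(sum(e * e for e in resid) / (n - 2))   # residual standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # normal approximation
    half = z * s * sqrt(1 + 1 / n + (x_new - mx) ** 2 / sxx)
    point = intercept + slope * x_new
    return point - half, point, point + half


def worth_doing(expected_benefit, interval):
    """Assumed rule: benefit must beat even the upper bound on cost."""
    _, _, high = interval
    return expected_benefit > high


# Hypothetical history of estimates and actuals, in $K.
estimate = [100, 250, 400, 650, 900]
actual = [130, 290, 520, 700, 1150]

interval = cost_interval(estimate, actual, x_new=500)
print(interval)
print(worth_doing(900, interval), worth_doing(700, interval))
```

A broad interval pushes the upper bound up and rules out marginal projects; a narrow one lets projects with thinner returns through, which matches the two bullets above.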
Project Management: Case Study
Implemented at a Fortune 100 Firm
• Large areas of low correlation
• The pool of independent variables was limited by data availability
and politics
• Instead of a statistician, an expensive, automated software package
was used
– No second-order variables and no cross products (software limitation)
– No discretion in applying the p-value cutoff (0.051 gets rejected just as firmly as a far larger p-value)
– High investment leads to sunk-cost fallacy, so statistical solutions are not
being investigated and root cause of low correlation isn’t getting identified
Develop a New Offering
MIT Sloan Management Review, Winter 2004:
“The Seller’s Hidden Advantage”
Toyota:
Benchmarked all of its suppliers and made them all more efficient, which
made the suppliers more competitive, which resulted in better prices for
Toyota
Orica:
Developed a 20 variable model from customer use of their explosives that
made each subsequent customer more accurate in their purchase and use
of Orica explosives
Develop a New Offering
IT Consulting Firm:
Benchmark your clients’ IT services
• Examine common services provided by each client – this is very
different and more difficult than manufacturing!
• Build a model based on available factors:
– Number of employees, locations, costs, level of service, etc.
• Results are a great starting point, but aren’t the holy grail
– Costs that are statistically above the benchmark prediction could
indicate a level of service not provided at other clients, but could
also mean that there is inefficiency afoot
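A minimal sketch of the benchmarking idea, assuming a single factor (number of employees) and made-up client data: fit the benchmark line across clients, then flag any client whose cost sits well above its prediction. The one-standard-error cutoff is an arbitrary assumption chosen for illustration.

```python
from math import sqrt
from statistics import fmean

# Hypothetical clients: employees in hundreds, IT cost in $M per year.
clients = ["A", "B", "C", "D", "E"]
employees = [1, 2, 3, 4, 5]
it_cost = [2, 4, 9, 8, 10]

# Benchmark model: least-squares line of cost on employees.
mx, my = fmean(employees), fmean(it_cost)
sxx = sum((x - mx) ** 2 for x in employees)
slope = sum((x - mx) * (y - my) for x, y in zip(employees, it_cost)) / sxx
intercept = my - slope * mx

# Residual = actual cost minus benchmark prediction.
residuals = [y - (intercept + slope * x) for x, y in zip(employees, it_cost)]
s = sqrt(sum(e * e for e in residuals) / (len(clients) - 2))

# Assumed cutoff: more than one residual standard error above prediction.
flagged = [c for c, e in zip(clients, residuals) if e > 1.0 * s]
print(flagged)
```

A flagged client is a prompt for a conversation, not a verdict: the model cannot tell a higher level of service from inefficiency.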
Tips and Tricks
1. Find out where the business unit or company makes or spends a
great deal of money
2. Find out what data can be had
3. Build a model on a sample if data is hard to get or is large
4. Ask for funding and justify with new, interesting results
5. Use the project in this class
6. Avoid using statistics terms (95% confident, Regression, etc.)
7. Expect surprising ignorance
The End
And they all lived happily, ever after…
Appendix: Clustering
Clustering uses an algorithm to break a large data set into
smaller data sets:
This data set splits well into two clusters – it isn’t likely that real-life
data sets will be this contrived
Appendix: Clustering
Regression will not paint a good picture of the data as a whole:
Splitting the data into the appropriate clusters can lead to more
accurate modeling
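The point of these two slides can be sketched in a few lines: a bare-bones two-cluster k-means followed by a separate regression slope per cluster. The data are contrived, the helper names `slope` and `two_means` are hypothetical, and a real analysis would use a tool such as the JMP clustering platforms and guard against empty clusters.

```python
from statistics import fmean


def slope(points):
    """Least-squares slope of y on x for a list of (x, y) points."""
    mx = fmean([x for x, _ in points])
    my = fmean([y for _, y in points])
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sum((x - mx) * (y - my) for x, y in points) / sxx


def two_means(points, rounds=10):
    """Tiny k-means with k=2, seeded with the first and last points."""
    centroids = [points[0], points[-1]]
    groups = [[], []]
    for _ in range(rounds):
        groups = [[], []]
        for p in points:
            dist = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            groups[dist.index(min(dist))].append(p)
        # Assumes neither group is empty, which holds for this data.
        centroids = [
            (fmean([x for x, _ in g]), fmean([y for _, y in g])) for g in groups
        ]
    return groups


# Contrived data: one group rising (y = 2x), one group falling (y = 41 - x).
points = [(x, 2 * x) for x in range(1, 6)] + [(x, 41 - x) for x in range(11, 16)]

pooled = slope(points)
cluster_slopes = [slope(g) for g in two_means(points)]
print(f"pooled slope = {pooled:.2f}, per-cluster slopes = {cluster_slopes}")
```

The pooled slope is positive even though one of the two groups trends downward, which is the kind of distortion the slide warns about.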
Appendix: Why Brains Beat Tools
The process to implement a data mining/business intelligence tool:
1. Collect and organize data – usually in a repeatable, programmatic
(automatic) fashion
2. Purchase licenses and install and configure tool set – usually start
with a sample of the data from step 1
3. Examine results and tune tools
4. Act on results
Before any of this happens, a statistician should look to see if there are
actionable data relationships – steps 1 and 2 are very expensive!!
Appendix: Why Brains Beat Tools
Possible reasons for detecting a weak relationship:
1. Software does not perform clustering
2. Software does not examine 2nd order or cross product factors
3. Software incorrectly acts on multicollinearity
4. Tool set is improperly tuned/configured
5. Data aggregation mechanism is not functioning properly
6. The data is too random
Without a statistician, all six of these reasons look the same!
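As one example of a check a statistician might run for reason 3, the sketch below computes a variance inflation factor (VIF) for a pair of predictors. With only two predictors, VIF reduces to 1 / (1 - r^2), where r is their correlation. The data and the `vif_two_predictors` helper are hypothetical, and the common rule of thumb that a VIF above roughly 5 to 10 signals trouble is a convention, not a hard limit.

```python
from statistics import fmean


def vif_two_predictors(x1, x2):
    """VIF for a pair of predictors: 1 / (1 - r^2)."""
    m1, m2 = fmean(x1), fmean(x2)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    r_squared = s12 ** 2 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)


# Hypothetical predictors that move almost in lockstep,
# e.g. estimated cost and expected duration.
estimated_cost = [1, 2, 3, 4, 5]
expected_duration = [2, 4, 5, 8, 10]

vif = vif_two_predictors(estimated_cost, expected_duration)
print(f"VIF = {vif:.1f}")
```

A high VIF means the betas on the two variables are unstable and hard to interpret separately, so a tool that silently drops or keeps one of them can make a real relationship look weak.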