Data Science Consulting
or
Science meets business, again.
Third time a charm?
David Johnston
ThoughtWorks
March 17, 2014
Postdocs drive the world’s economy –
Young scientists become…
Professors
ThoughtWorks
• Global software consulting company
• HQ in Chicago. Major offices in NY, San Fran, Dallas, India, Brazil,
Australia, China - over 30 worldwide.
• Privately owned by Roy Singham
• Flat hierarchy of passionate people
Agile Analytics at TW
• Practice started in 2011
• Led by Ken Collier and John Spens
• About a dozen people involved
Key theme of Ken’s book
• BI, data warehousing and analytics have largely
missed the revolution in agile methodologies. We
can do it differently.
What we do
• Probabilistic modeling
• Predictive analytics / machine learning
• Advanced BI, prescriptive analysis
• Big Data technologies
• Advanced algorithms and data structures, streaming
Case Studies:
Recommendation systems for a
retailer customer. Our Bayesian
model (blue)
Healthcare group purchasing
organization
• Problem: matching medical
products by text description (fuzzy
matching).
• Existing solution: a complicated
rules engine with a 60% match rate
that took one day to run.
• In 3 weeks we delivered a
lightweight solution in Python: >80%
match rate, runtime of a few
minutes (on a laptop).
• Later moved to Elasticsearch for
even better results.
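To make the fuzzy-matching idea concrete, here is a minimal sketch in Python. It is not the solution delivered to the client, whose details the slides do not give. It assumes a simple approach: normalize each description, then score pairs with the standard library's `difflib.SequenceMatcher`. The function names, the sample catalog and the 0.8 threshold are all hypothetical.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, turn punctuation into spaces, and sort the tokens
    so that word order does not affect the comparison."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(sorted(cleaned.split()))


def best_match(query: str, catalog: list[str], threshold: float = 0.8):
    """Return (catalog item, score) for the closest description,
    or (None, score) when nothing reaches the threshold.
    The catalog must not be empty."""
    q = normalize(query)
    scored = [
        (SequenceMatcher(None, q, normalize(item)).ratio(), item)
        for item in catalog
    ]
    score, item = max(scored, key=lambda pair: pair[0])
    return (item, score) if score >= threshold else (None, score)


# Hypothetical product descriptions, for illustration only.
catalog = [
    "Surgical Gloves, Latex, Large",
    "Syringe 10 mL Luer Lock",
    "Gauze Pad 4x4 Sterile",
]

match, score = best_match("latex surgical gloves large", catalog)
no_match, low_score = best_match("bicycle tire", catalog)
```

Comparing every query against every catalog entry is quadratic, which is workable at laptop scale; a search index such as Elasticsearch becomes attractive for larger catalogs.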
What exactly is data
science?
• Is this really new? - Not really
• Does the term “data science” make any sense? - Not really
but so what?
• Is it just a fad? Over-hyped? – No; sometimes.
• Why did this term just become popular a few years back? -
Productivity
• Where is this going?
• Should scientists/engineers/math-types really go and make
a career doing this? Yes for most
Is it new?
Of course not
Combination of many subjects:
• Mathematics and statistics – probability theory
• Machine learning
• Computer science – algorithms, data structures, databases
• Operations research - process optimization
• Business consulting
• Software development
Where have we seen this before?
Business: Finance, Insurance, Sports, Government accounting, Retail,
Google
Science: Physics, Astronomy, Biology
Why now? : Data scientist productivity growth
crosses critical threshold for new job creation
• Salary increase over postdoc requires
~2.5x
• Salaries in industry are set by
productivity and supply/demand
• Crossing the threshold in productivity
leads to new job creation
• Slowing productivity growth
and/or changes in supply/demand
will eventually end this burst in job
creation
• Nothing magical happened in 2005!
Productivity drivers for data
science
Long time scale
• Compute, Moore’s law
• The internet (duh!)
• HD and RAM price drops
• Science learns to deal with
Big Data
• Growing importance of
statistics
More recent
• Git, code sharing
• Machine learning libraries
• Python/R, open source
• Hadoop and its ecosystem
• The cloud, AWS
• NoSQL and in-memory databases
• Growing community in “data
science”: cohesion, feedback
effects of popularity
So what is data science now?
My definition of data science:
An interdisciplinary field utilizing statistics, computer
science and the methods of scientific research in
areas outside of science.
Misses only the first one
Are we there yet?
Overhyped, underhyped, mis-hyped?
• No, probably not
• Productivity growth is real
• We are solving important
problems. Plenty left.
• Big Data will probably
peak in the hype cycle
before data science
• Just watched my first
analytics commercial. IBM.
“Math is not a fad”
- Aaron Erickson, ThoughtWorks
Case study : Particle Physics
Data reduction par excellence
• 600 million collisions per second
• Most are boring events and are not saved
• Save ~ 100 petabytes per year
Determine existence of the Higgs boson – 1 bit
Measure its mass to 1% ~ 1 byte
Data = Exabytes
Information = 9 bits
Compression 10^18
Goal
$9 billion per byte!
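The compression figure can be checked with a short back-of-the-envelope calculation. This sketch assumes "Exabytes" means roughly one exabyte (10^18 bytes) of raw data and takes the slide's "1 bit" plus "1 byte" as the extracted information; the variable names are illustrative.

```python
import math

BITS_PER_BYTE = 8

# Assumption: about one exabyte of raw collision data, order of magnitude only.
raw_bytes = 10**18
raw_bits = raw_bytes * BITS_PER_BYTE

# Information extracted, per the slide: existence of the Higgs boson (1 bit)
# plus its mass measured to about 1% (about 1 byte).
info_bits = 1 + BITS_PER_BYTE

compression_ratio = raw_bits / info_bits
order_of_magnitude = round(math.log10(compression_ratio))

print(f"information: {info_bits} bits")
print(f"compression: about 10^{order_of_magnitude}")
```

Under these assumptions the ratio is about 8.9 × 10^17, which rounds to the slide's 10^18.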
Data science consulting
The good
• Always something new,
always learning.
• Exposed to many different
people.
• Get to see how everything
works on the inside.
• See the world!
• Low career risk but still
fun.
The bad
• Your clients choose you
• People problems often
more important than math
problems
• Travel can be extreme
• Your great ideas will rarely
be credited to you.
Challenges in data science
consulting
Common challenges
• Businesses don’t yet
understand the terminology,
process or techniques. Much
teaching involved
• Visionary CEO sends you into
a not-so-visionary environment
• Problems can be vague
• Communication with business
stakeholders takes much of
your time
• We are still developing an
effective model. More than just
agile techniques
Red flags
• “Build us a platform for analytics
so we can become a data-driven
company” (a non sequitur)
• Wanting prediction of the
unpredictable
• Attempting to use ML on noisy
data
• Incentives and opinions
are all over the map
• Conviction that the problem
was solved 20 years ago,
e.g. by linear regression, a
segmentation model, or SAS.
Keep offering up bold
ideas
• Look for ways for major
productivity enhancement
• Keep up on cutting-edge
literature in stats/ML
• All my best ideas for web apps
are now successful
companies.
• Everybody laughed at
them!
Data science is NOT going to be
productized.
FIN

Data Science Consulting at ThoughtWorks -- NYC Open Data Meetup
