Big Data Day LA 2016/ Data Science Track - The Evolving Data Science Landscape, Kyle Polich - Principal Consulting Engineer, Datascience Inc

The impact of data science on business is undeniable, and the value it provides is growing without signs of slowing. To keep up with this rapidly evolving technology landscape, data scientists must adapt and specialize through continuous learning. This talk focuses on how they can do that in a way that maximizes the positive impact data science will have on their organization.

Published in: Technology

  1. The Evolving Data Science Landscape. Kyle Polich, Data Science, Inc.
  2. LIGO. One of the most advanced metrology projects; one of the more precise instruments ever created. Measures changes 1/10,000th the width of a proton. 4 km interferometer to measure gravitational fluctuations from cosmic explosions.
  3. LIGO. According to Scientific American, cost $1.1 billion over the last 40 years. Construction took 8 years. Turned on in 2002. Managed by ~1k scientists. Gravitational-wave detection announced in 2016.
  4. Value Delivery. “Despite the hype of big data, a majority of the business value produced by data still happens in this more traditional setting, and we would like to support these communities.” Szilard Pafka (Dec 2014), announcing the new DW/BI/Analytics Meetup.
  5. Bias, Variance, Heterogeneity. “Up until late last year, tracking would be done unpredictably after almost every release.” “We changed the way we capture that last April and again this January.” “We have four divisions that all do their analytics differently.”
  6. Excitement Scale
  7. Excitement hierarchy. Report generation; ML on 10k observations, 20 features; ML on 1 billion observations, 1500 features; ML on 1 million observations, 100 features; 1000 node clustered computing; A/B testing; High performance computing; Econometric modeling for adtech; Deep learning; SQL queries; Logistic regression; Off the shelf OpenCV implementation; Online multi-armed bandit; Online streaming algorithms; Commercial opportunities for quantum computing
  8. Measures of effectiveness? F1-score; Accuracy and precision; Area under an ROC curve (AUC); Bias-variance tradeoff
  9. Measure of effectiveness. • Return on Investment (ROI) • Revenue savings from automation • Lift • Impact Factor* • Causal Impact • Value of information
  10. Goodhart’s Law. When a measure becomes a target, it ceases to be a good measure.
  11. Value of Information. Value(information) = Expected revenue if information known - Expected revenue if information NOT known - Cost of information
  12. Iteration and precision. Early objectives: • Maximize conversion rate • Send / don’t send offer • Raise / lower budget • Predict number of machine failures • Find available service provider. Late objectives: • Maximize lifetime value • Personalized offer • Real-time bid optimization • Optimize factory environmental controls • Global service pairing optimization
  13. Business Conversations. Optimize within the constraints of your product. Discuss opportunities with product owner.
  14. The Evolution of Software Engineering. Pre 1990s: “Computer Programmer”. 1990s: DBA, VLSI engineer, Embedded systems programmer, Front end, back end, QA. 2000s: Angular developer, UI/UX architect, AWS Infrastructure engineer, Spring integration manager, DevOps engineer, Change management / continuous integration specialist, Security engineer, Unity developer, Serverless evangelist, mobile developer, WordPress developer, Accessibility specialist.
  15. Scope of Data Science. Pre 2008: Statistician, ML researcher, etc. 2008-2016: Data scientist, data engineer. 2016 - Future: ???
  16. An arbitrary timeline. 1950s: Perceptron algorithm; 1993: R first appearance; 1993: C4.5 described; 1995: False discovery rates; 2001: Weka; 2004: MapReduce paper; 2007: scikit-learn initial release; 2010: Theano; 2011: H2O launched; 2014: Spark initial release, XGBoost on GitHub; 2015: TensorFlow
  17. [Image-only slide; no text]
  18. Data science community. • Meetups • Events • MOOCs • Bootcamps • Podcasts • Blogs • Books
  19. DataForward Event Series. DataForward is a gathering of professionals across industries who are passionate about data science, big data technologies, and data-driven businesses. The group meets once a month at keynote events featuring talks and presentations by industry leaders. The DataForward events are hosted and organized by DataScience Inc and livestreamed to audiences all over the world. The monthly events are dedicated to key topics facing data-driven organizations: disruptive technologies, data-driven culture, investment trends, and insights into how existing organizations can unlock the value of their data. To sign up for our first keynote event in August, please visit meetup.com/DataForward.
  20. DataScience. facebook.com/datascience, twitter.com/datascienceinc, linkedin.com/company/datascience-inc, (310) 579-6200
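
Slides 8 and 9 contrast model-centric measures (F1-score, accuracy, precision, AUC) with business-centric ones (ROI, lift). The slides do not define any of them, so the sketch below uses common textbook definitions as an assumption: lift is taken as precision among targeted cases divided by the overall positive rate, and ROI as (gain - cost) / cost. All function names and sample numbers are hypothetical and are not taken from the talk.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return _ratio(tp + tn, tp + fp + fn + tn)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return _ratio(tp, tp + fp)


def recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return _ratio(tp, tp + fn)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return _ratio(2 * p * r, p + r)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Assumed definition: positive rate among targeted cases / overall positive rate."""
    base_rate = _ratio(sum(y_true), len(y_true))
    return _ratio(precision(y_true, y_pred), base_rate)


def roi(gain: float, cost: float) -> float:
    """Assumed definition: net gain relative to cost."""
    return (gain - cost) / cost


if __name__ == "__main__":
    # Hypothetical data: 1 = customer converted, score = model output.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print("accuracy:", accuracy(y_true, y_pred))
    print("F1:", f1_score(y_true, y_pred))
    print("AUC:", auc(y_true, scores))
    print("lift:", lift(y_true, y_pred))
    print("ROI:", roi(gain=150_000.0, cost=100_000.0))
```

The first group of functions describes how well a model predicts; `lift` and `roi` describe what acting on those predictions is worth, which is the distinction slide 9 appears to draw.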
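
The value-of-information formula on slide 11 can be illustrated with a small worked example. The sketch below assumes a discrete set of states with known prior probabilities and a payoff table per action, and it treats the information as perfect (it reveals the true state before the decision). The send / don't send offer decision echoes slide 12, but the states, probabilities, payoffs, cost, and function names are all hypothetical.

```python
from typing import Dict

Prior = Dict[str, float]                 # state -> probability
Payoffs = Dict[str, Dict[str, float]]    # action -> state -> revenue


def expected_revenue_without_information(prior: Prior, payoffs: Payoffs) -> float:
    """Pick the single action with the best expected revenue under the prior."""
    return max(
        sum(prior[state] * revenue_by_state[state] for state in prior)
        for revenue_by_state in payoffs.values()
    )


def expected_revenue_with_information(prior: Prior, payoffs: Payoffs) -> float:
    """Pick the best action separately for each state, then average over states."""
    return sum(
        prior[state] * max(revenue_by_state[state] for revenue_by_state in payoffs.values())
        for state in prior
    )


def value_of_information(prior: Prior, payoffs: Payoffs, cost: float) -> float:
    """Value = revenue if known - revenue if NOT known - cost of information."""
    return (
        expected_revenue_with_information(prior, payoffs)
        - expected_revenue_without_information(prior, payoffs)
        - cost
    )


if __name__ == "__main__":
    # Hypothetical scenario: should we send an offer to a customer?
    prior = {"responsive": 0.3, "unresponsive": 0.7}
    payoffs = {
        "send": {"responsive": 100.0, "unresponsive": -20.0},
        "dont_send": {"responsive": 0.0, "unresponsive": 0.0},
    }
    print("without information:", expected_revenue_without_information(prior, payoffs))
    print("with information:", expected_revenue_with_information(prior, payoffs))
    print("net value:", value_of_information(prior, payoffs, cost=5.0))
```

A positive result suggests the information is worth buying at that cost; a negative one suggests deciding on the prior alone. Real information is rarely perfect, so this figure is best read as an upper bound on what to pay.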
