Applied Data Science
Making insights accessible and actionable
PRESENTED BY
Colin Ristig
Product Manager
colin@yhathq.com
Austin Ogilvie
Founder & CEO
a@yhathq.com
Agenda
Quick Intro to Data Science
Understanding the Value Chain
Designing Your Data Science Process
About Us
We help data scientists build & deploy apps
Founded 2013
Headquarters in NYC
You may know us from
Data Science in 30 Seconds
Broadly…
A multidisciplinary field concerning problem solving using data, statistics & software.
Data Science is not “Interesting Research”
“What distinguishes data science itself from the tools and techniques is the central goal of deploying effective decision-making models to a production environment.”
~ Nina Zumel & John Mount, Practical Data Science with R
It’s about day-to-day problems
Carl wants to watch a good movie.
And practical, real-world solutions
Hey, Carl. Check these out!
Explanation isn’t always important
Carl
Cindy
http://courses.washington.edu/css490/2012.Winter/lecture_slides/08b_collaborative_filtering_1_r1.pdf
Carl would like Frozen because Cindy liked it.
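The Carl and Cindy slide describes the idea behind user-based collaborative filtering: recommend what similar users liked, without explaining why. A minimal sketch of that idea follows. The ratings, the third user "Dana", the movie titles other than Frozen, and the helper names `cosine_similarity` and `recommend` are all assumptions made up for illustration; none of them come from the talk or the linked lecture slides.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale, invented for illustration only.
ratings = {
    "Carl":  {"Toy Story": 5, "Up": 4, "Alien": 1},
    "Cindy": {"Toy Story": 5, "Up": 5, "Alien": 1, "Frozen": 5},
    "Dana":  {"Toy Story": 1, "Up": 2, "Alien": 5, "Frozen": 1},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)


def recommend(user, ratings):
    """Rank movies the user has not seen by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], theirs)
        for movie, rating in theirs.items():
            if movie in ratings[user]:
                continue  # skip movies the user has already rated
            scores[movie] = scores.get(movie, 0.0) + sim * rating
            weights[movie] = weights.get(movie, 0.0) + sim
    predicted = {m: scores[m] / weights[m] for m in scores if weights[m] > 0}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for movie, score in recommend("Carl", ratings):
        print(f"{movie}: predicted rating {score:.2f}")
```

In this toy data Cindy's tastes track Carl's more closely than Dana's do, so her high rating for Frozen carries the most weight. The model never needs to know why either of them likes a movie, which is the point of the slide.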
Data Science Challenges
30%
Why?
Key obstacles data science teams face
Lack of Understanding
Key obstacles data science teams face
Difficulty of Experimentation
Trey is asked to “look into something”
Hey, Trey. Online sales are down. What can we do to keep users engaged and shopping carts full?
I’ll look into it.
Trey uncovers a bunch of things we didn’t know
Here’s what we should do:
Hm...cool. Can you talk to the dev team?
Trey hands his work to deployment engineers
“Throw it over the wall” projects
Execs, Data Science, Application Developers
Common reasons these types of projects stall
- Unclear benefits
- Skepticism about effectiveness
- Too complex to operationalize
- Too time-consuming
- Unclear how to measure ROI
Data Science Value Chain
Making data valuable
Extracting value from data is like any other value chain.
Levels of the chain, from most to least value:
- Actions: Drive value, change
- Predictions: Understand, infer, learn
- Reports: Structure, link, metadata, interact, share
- Charts: Clean, aggregate, visualize
- Records: Collect and display individual records
Like a raw material, data has no obvious utility to start out.
We make it valuable through sequential refinement.
Cost of Creating that Value
Building data products requires lots of work.
But most of the value is generated at the end.
Everyone (data teams, managers, customers) has to see past a lot of challenges.
Data Science Customers
Several types of customers
- Consumers: Carl wants to watch a good movie.
- App Developers: Cambria needs to call credit models from Salesforce.
- Infrastructure Admins: Douglas needs 3 AM server outages to stop.
- Sales & Marketing: Gordon wants sales reps calling the hottest leads.
Data Science: 5 Attributes for Success
5 Attributes of Successful Data Science Teams
1. Focus on the customer
2. Identify practical constraints
3. Start small but ship quickly
4. Measure the impact
5. Relentless iteration
Demo
Trey uncovers a bunch of things we didn’t know
Here’s what we should do:
Hm...cool. Can you talk to the dev team?
Trey hands his work to deployment engineers
“Throw it over the wall” projects
Data Science, Application Developers
Deploy Models Faster
Data Science, Application Developers
Yhat - Applied Data Science - Feb 2016
