Better Living Through Data Science

Scott Nicholson
@scootrous
snicholson@accretivehealth.com
lnkd.in/scott
Helping people and businesses make better decisions
Does big data help people make better decisions?
No, insights do.

Big data is as much a realization that we can do more with data than we
previously thought as it is about more data being available.

Companies in 2000 who didn’t know what to do with their “small”
data won’t be any better off with big/huge/fat data today.

It’s about insights, and data scientists are well-suited to create
them.

I’d take a brilliant Excel/SQL guru who asks the right questions over
a deeply technical ‘big data’ engineer who focuses on elegance
and algorithms.
Today
What is data science?
Project phases
Where do you find people who can do it?
“Data Scientist” means different things to different people

Credit: Drew Conway
Credit: Hilary Mason
My definition of a data scientist:
Someone who uses data to solve
problems end-to-end, from asking
  the right questions to making
       insights actionable.
End-to-end data science: five stages

1. Ask the right questions
2. Choose your approach
3. Extract & clean your data
4. Build a model
5. Deploy, learn, iterate
Phase 1: Ask the Right Questions
One of the hardest things to find in a data scientist
Phase 2: Choose an Approach
Do we always need to build a model?
Leverage other
disciplines and
   intuition
Is building a model the first thing you should do?

Credit: Sam Shah
Phase 3: Extract and Clean Data
The g(l)ory of data science: most of the work is here
In the trenches, dirty jobs, porta-potty
vs.
Luxury, rocket science, fast cars
Health Care
  EHR is not
 designed for
data extraction
LinkedIn
On the frontier, but still difficult to do agile data
Phase 4: Model Building
For most problems, a wheel has already been invented… just recognize the wheel!
Example: missing charges on bill
Always use workhorse models first

Online advertising: logistic regression in production at Yahoo for a long time
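
A minimal sketch of what "workhorse first" can look like in practice: a plain logistic regression fit by batch gradient descent, using only the Python standard library. The toy data, the function names (`fit_logistic`, `predict_proba`), and the learning-rate and epoch settings are all assumptions made for illustration; none of this comes from the talk or from any production system it mentions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this row: p(x) - y
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # Step against the average gradient
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that x belongs to the positive class."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Hypothetical toy data: one feature, positive label when the feature is large.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
```

A baseline like this is quick to fit and easy to explain, which gives a reference point before reaching for anything more elaborate.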
LinkedIn
Skills universe
Health Care
Networked data
 also common
Agile Data
Focus on quick solutions to identify bogeys and get feedback
Think like Eric Ries
Phase 5: Deploy, Learn, Iterate
Deployment and execution of predictive models is crucial
Iteration is key, especially in an agile analytics framework
LinkedIn
Subscriber churn
prevention emails
Health Care
Population health
 management &
 quality of care
LinkedIn
Build a viewer app
Well, that’s great, but who is going to do all of that work?

Who is good at this stuff?
Just as physicists moved to
Wall Street to be quants and
then on to online advertising
and consumer web, there will
   be a significant talent
migration into health care in
     the next few years.
But huge opportunities

One of the fundamental problems of our time

18% of GDP! Even 0.01% of that is giant revenue potential

Data availability and richness only increasing

The right people are realizing data and data science are core to the
solution.
Take-aways
Data science is
industry-agnostic
There are many challenges, but this is just the beginning.

EHR data extraction and updates difficult
Implementation barriers
Nothing scales
Privacy issues
Data aggregation difficult
Not all hospitals are Stanford, Vanderbilt, etc.
What can we do about these challenges?

Daily/hourly decision support?
Communicate value of data mining to patients
SMART, roll-your-own EHRs
Thank you!
(we’re hiring)

bit.ly/accretive-data-science-job

Accretive Health - Quality Management in Health Care
