A SWOT Analysis of Data Science @ NIH
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
PSB, Hawaii  January 07, 2016
First a Little Context
BD2K is Implementing the ACD Data & Informatics Recommendations*
DIWG Recommendations
1. Sharing data & software through
indexes
2. Advance big methods, tools &
applications
3. Expand data science training
4. Continued support throughout the
data & software lifecycle
BD2K Implementation
1. Implement the Commons (indices,
standards, etc.)
2. Data science research programs
(Centers, U01s, etc.)
3. Training and workforce development
programs
4. Addressing sustainability of science,
technology, and funding mechanisms
* http://acd.od.nih.gov/diwg.htm
The BD2K Program
[Figure: BD2K Budget, total available per fiscal year, FY14 to FY21; vertical axis from $0 to $120,000,000]
Opportunities & Threats:
Photography
Digitization
Deception
Disruption
Demonetization
Dematerialization
Democratization
Time
Volume, Velocity, Variety
Digital camera invented by
Kodak but shelved
Megapixels & quality improve slowly;
Kodak slow to react
Film market collapses;
Kodak goes bankrupt
Phones replace
cameras
Instagram,
Flickr become the
value proposition
Digital media becomes bona fide
form of communication
Opportunities & Threats:
Biomedical Research
Digitization of Basic &
Clinical Research & EHRs
Deception
We Are Here
Disruption
Demonetization
Dematerialization
Democratization
Open science
Patient centered health care
Opportunities & Threats
• O: “Disruption” - data & analytics will become
more central to the biomedical enterprise
• T: The time to this realization is much longer
than it need be
• T: The efficiency of the enterprise is not what
it should be
• T: We do too little to address existing & future
pain points
Weaknesses
• Access vs privacy of human subjects data
• Gender and race inequality
• Valuing scholarship / reward systems
• Appropriate review
• Sustainability
• Insufficient resources
Sustainability
• Revised governance
structure
• Inventory of NIH
data repositories and
costs
• The Commons
• Interoperability pilots
• Sustainability FOAs
• Policy
recommendations
ADDS Team
IC Representatives
Leadership
Insufficient Resources
Strengths
• 439 participants
• 167 remote viewers
• Breakout sessions
• 133 Posters
• 16 Demos
• 3 BOFs
http://www.scgcorp.com/bd2k2015/Default
Strengths
• Large datasets, e.g., 46M Aetna EHRs
• Data integration, e.g., Mobile health + Yelp
• Analysis, e.g., machine learning to predict
phenotype from EHRs
• Diverse data types, e.g., genomics, proteomics,
imaging, clinical trials, EHRs
• Collaboration, e.g., joint API development, use and
requests for metadata templates, data sharing
• Depth of training
• International
Additional Reading
• Strategic plan for 2016-17
• ADDS 2015 Blog
• The Office of Data Science
NIH…
Turning Discovery Into Health
philip.bourne@nih.gov
https://datascience.nih.gov/
