Summarizes the Associate Director for Data Science (ADDS) team's thinking on a strategy to positively influence how the NIH approaches data science
2. Accomplishments & Challenges to Date
• Accomplishments:
– Fulfilling all the recommendations of the 2012 ACD Working Group report
– Delivering a vibrant extramural program (BD2K) to establish the needed data science ecosystem
– Established a new office in support of trans-NIH data science
• Challenges:
– Managing short-term expectations against a predominantly outward-facing program with 3-5 year objectives
– Convincing the ICs that we are not the solution but part of the solution
3. What Drives Our Strategic Thinking
• Be Prepared: Responding to take advantage of the opportunities offered by a major disruption in the biomedical research enterprise arising through digitization and exponential growth
• Accelerating discovery during this time of disruptive development
• Continually catalyzing a cultural shift towards a more analytical enterprise while managing expectations
4. Mission Statement
To use data science to foster an open digital ecosystem that will accelerate efficient, cost-effective biomedical research to enhance health, lengthen life, and reduce illness and disability
5. Overall Goals by 2020
• Enable major scientific discovery through the BD2K initiative
• Establish and provide evidence of a more sustainable, efficient and productive data science ecosystem both internal and external to NIH
• Establish and provide evidence of a well-trained and diverse workforce able to use and develop biomedical data science tools and methods
• Build upon NIH’s leadership and reputation in data science
6. Data Science Strategy
• We have identified 5 actionable areas:
– Sustainability
– Workforce Development & Diversity
– Discovery & Innovation
– Policy & Process
– Leadership
• The relative emphasis and growth of these areas are being addressed as part of the strategic planning process
• Success cuts across all these areas …
7. Research Objects in the Commons
Voxel Wide Genome Scanning
MRI standardization
Over 100 Public Lectures
Collaboration with a Minority Institution
185 Institutions Involved
Genomic Data Sharing Policy
Cross-cutting Success - BD2K Center
Actionable Areas: Sustainability; Workforce Development & Diversity; Discovery & Innovation; Policy & Process; Leadership
8. Sustainability
• Propose a sustainability and preservation strategy in FY16-17
• Gathering and sharing information about existing data resources and best practices for supporting resources
• Measuring value and costs of data resources
o Commons to enable and measure usage, as well as to promote sharing and leverage new technologies
o Identifying and measuring contributors to value other than usage
o Cloud-broker credit model to measure costs
Goal: To foster a sustainable, efficient, and productive data science ecosystem
9. Workforce Development & Diversity
• Strengthening a diverse biomedical workforce to utilize data science
– BD2K funding of Short Courses and Open Educational Resources
• Building a diverse workforce in biomedical data science
– BD2K Training programs and Individual Career Awards
• Fostering Collaborations
– BD2K Training Coordination Center, NSF/NIH IDEAs Lab
• Expanding NIH Data Science Workforce Development Center
– Local courses, e.g. Software Carpentry
• Discovery of Educational Resources
– BD2K Training Coordination Center
Goal: To strengthen the ability of a diverse biomedical workforce to develop and benefit from data science
10. Discovery & Innovation
• Funded in FY14
– 12 BD2K Centers of Excellence
– Data Discovery Index Consortium
• Funding in FY15
– Centers Coordinating Center
– Targeted software awards
• FY16 or beyond
– Building Collaborations to harness new skill sets
– Standards Coordination Center
– Software index
Goal: To enable major scientific discovery and innovation through the BD2K Initiative
11. Policy & Process
• Improving processes to increase efficiency
– Example: revised BD2K governance model
• Updating policies to keep up with changing technologies
– Example: dbGaP in the cloud
• Increasing the value of data through sharing and citation
– Example: Instigating machine-readable data sharing plans on all grants
Goal: To contribute to policies & processes involving data that further the NIH mission
12. Leadership
• ADDS, ICs, and Community (Academia, Industry, and Gov’t)
• Network of data science expertise: consultation & coordination
• Communication: website, listserv, blogs, Twitter
• Community Building
• Data Science Seminar Series
• Pi Day
• Public Engagement
Goal: Further visibility of NIH leadership in data science by the public, DHHS, USG at large, and international funders
13. Next Steps
• If there is agreement on goals & approach, develop a strategic plan & timelines for further discussion (drafted)
• Short term early wins
– Hire clinical biomedical informatics specialist to interface with PMI & ICs
– Redirect existing funds for
o An IC & government relations liaison
o IRP Commons support
o IRP training support
• Long term
– Depends somewhat on the NLM vision
Editor's Notes
For discussion with Francis Collins & Larry Tabak
Bullet 2 (Internal Review):
Status: first pass at inventory complete (80 resources, $500K/yr); RFI about best practices
Next steps: Report to IC directors; work with other agencies; review Commons and cloud-broker model pilots; review findings from sustainability supplements; identify best practices for resources
Long-term: Propose sustainability & preservation strategy through new & revised funding models
Bullet 3: Commons:
Status: Cloud pilots on-going (NCI, HMP, others); BD2K pilot funding to begin Commons development
Next steps: Evaluate cost-benefit of using cloud resources over traditional approaches for analytical research; further Commons pilots and development
Long-term: Assuming Commons is a value proposition, transition to this model and continue to evaluate
Removed bullets:
Increasing value of data by increasing interoperability of data resources
Collaborating with other agencies to find shared solutions and to consider new funding models
Examples of ADDS-Community interactions: HIROs workshop, support for the Global Alliance for Genomics and Health and other community organizations.