PSB2014 A Vision for Biomedical Research

Some preliminary thoughts about my role as Associate Director for Data Science at the NIH so as to have a discussion with attendees at the Pacific Symposium on Biocomputing on Jan 4, 2014, The Big Island of Hawaii.

Published in: Education, Technology

  1. An Informal Discussion About Big Data, Better Stated as A Vision for Biomedical Research
     Digitally enabling the length and quality of life
     Philip E. Bourne, pbourne@ucsd.edu
     http://pebourne.wordpress.com/2013/12/21/taking-on-the-role-of-associate-director-for-data-science-at-the-nih-my-original-vision-statement/
  2. The Context for This Discussion
     • On March 3, 2014 I will begin as the first Associate Director of the NIH devoted to data science
     • I am giving up tenure and the sun because I believe this is the right time for change
     • The change that I will try to instill at NIH and beyond is that of a Digital Enterprise
     http://www.nih.gov/news/health/dec2013/od-09.htm
  3. What Do I Mean By the Digital Enterprise?
     An organization that succeeds by maximizing the use of its digital assets to achieve its goals
  4. Why the Digital Enterprise Now?
     • Biomedical research is increasingly digital – the talk of “Big Data” is one manifestation
     • Fulfillment of the NIH mission (among others) will increasingly be tied to actions taken on digital data across boundaries
     • History already has lessons to teach us to make the job easier
  5. Actions on Data Imply:
     • Ensuring data quality and hence trust
     • Making data sustainable
     • Making data open and accessible
     • Making data findable
     • Providing suitable metadata and annotation
     • Making data queryable
     • Making data analyzable
     • Presenting data so as to maximize its value
     • Rewarding good data practices
  6. Boundaries on Data Imply:
     • Working across biological scales
     • Working across biomedical disciplines
     • Working across basic and clinical research and practice
     • Working across institutional boundaries
     • Working across public and private sectors
     • Working across national and international borders
     • Working across funding agencies
  7. Where to Start?
     An external advisory group provided a valuable blueprint for what should be done
     http://acd.od.nih.gov/Data%20and%20Informatics%20Working%20Group%20Report.pdf
  8. Blueprint Recommendations
     • Promote central and federated catalogs
       – Establish minimal metadata framework
       – Tools to facilitate data sharing
       – Elaborate on existing data sharing policies
     • Support methods and applications
       – Fund all phases of software development
       – Leverage lessons from National Centers
     • Training
       – More funding
       – Enhance review of training apps
       – Quantitative component to all awards
     • On campus IT strategic plan
       – Catalog of existing tools
       – Informatics laboratory
       – Ditto big data
     • Sustainable funding commitment
  9. What is Under Way?
     • Now:
       – Data centers (under review)
       – Data science training grants (call Q1 14)
       – Pilot data catalog consortium (call out)
       – Genomic Research Data Alliance (being finalized)
       – Piloting “NIH-drive”
     • In Year One:
       – Extended public-private programs specifically for data science activities
       – Interagency activities
       – International exchange programs
       – Programs for better data descriptions
       – Reward institutions/communities
       – Policies to get clinical trial data into the public domain
  10. Longer Term Strategy: Support for The Research Lifecycle
      Authoring Tools, Data Capture, Lab Notebooks, Software Repositories, Analysis Tools, Scholarly Communication, Visualization
      IDEAS – HYPOTHESES – EXPERIMENTS – DATA – ANALYSIS – COMPREHENSION – DISSEMINATION
      Commercial & Public Tools, Discipline-Based Metadata Standards, Community Portals, Git-like Resources By Discipline, Training, Institutional Repositories, Commercial Repositories, Data Journals, New Reward Systems
  12. References
      • http://bd2k.nih.gov/
      • http://pebourne.wordpress.com/2013/12/21/taking-on-the-role-of-associate-director-for-data-science-at-the-nih-my-original-vision-statement/
      • http://rd-alliance.org/
      • http://www.genomeinformaticsalliance.org/
      • http://www.force11.org/
  13. Discussion
      pbourne@ucsd.edu
  14. Back Pocket Slides
  15. The Role of Associate Director for Data Science
      1. provide broad trans-NIH programmatic leadership in the area of data science;
      2. lead long-term NIH strategic planning in areas of data science;
      3. provide oversight of the BD2K Initiative;
      4. establish and nurture a trans-NIH intellectual and programmatic ‘hub’ for coordinating and enhancing data science activities;
      5. coordinate with data science activities beyond NIH (e.g., other government agencies, other funding agencies, and the private sector);
      6. play a major role in data sharing policy development and oversight at NIH; and
      7. interact with the Chief Information Officer, NIH to generate synergy between BD2K and the Infrastructure Plus program.
  16. Strategy
      • Use the Blueprint as a starting point
      • Work with ICs to determine science drivers
      • Define developments needed for these drivers
      • Look for commonalities across ICs – make those a priority
      • Manage and enable emergent developments
        – Data catalog: used to define the minimal data description and a home for domain definitions
        – Centers of excellence: test beds and exemplars for best practices
  17. Ways to Sell the NIH Data Science Vision
      • Developed in response to well recognized scientific needs
      • Support for the complete research lifecycle – this is more than just data
      • Simple and well understood by all stakeholders (i.e., branded)
      • A shared vision
      • As ubiquitous as TCP/IP is to the Internet – a backbone for the digital enterprise
      • To data what PLOS is to knowledge – a movement that people believe in and get behind
      • An app store for the research enterprise
  18. General Features of NIH Data Science
      • Lightweight metadata standards
      • Data & software registries
      • Expanded policies on data sharing, open source software
      • Training programs & reward systems
      • Institutional incentives
      • Private sector incentives
      • Data centers serving community needs