BD2K @ NIH - A Vision Through 2020

Opening remarks at the BD2K All-hands meeting in Bethesda, MD, USA on November 29, 2016.

Published in: Education

  1. 1. BD2K @ NIH – A Vision Through 2020 Philip E. Bourne, PhD, FACMI Associate Director for Data Science philip.bourne@nih.gov
  2. 2. First and foremost, you should see this meeting as a celebration of the hard work of the past two years. Yes, these are uncertain times, but … there is a commitment to the BD2K program through 2020.
  3. 3. BD2K cannot be viewed in isolation, but rather as part of a broader view of data science @ NIH … particularly as funding increasingly comes from the ICs
  4. 4. A View Which Includes: • A vibrant research program of: – Fundamental developments in data science – Application of those fundamental developments – Flagship projects to which developments are applied: • PMI, Brain, Moonshot, ECHO • A sustainable data ecosystem – Commons and the FAIR Principles adoption – Cross-cutting activities • Increased workforce training • A changing governance model
  5. 5. A Strategic Response can be Modeled on Three Axes: Research Resources Outcomes
  6. 6. A Strategic Response Research Resources Outcomes • Fundamental • Machine learning • Data mining • Indexing • Predictive modeling … • Applied • Sustainability, governance, economics of data • Privacy and security • Effective use of clouds …
  7. 7. A Strategic Response Research Resources Outcomes • Standards • Commons APIs Reference data sets Workflows Access & Authentication • Workforce • Fundamental • Machine learning • Data mining • Indexing • Predictive modeling … • Applied • Sustainability, governance, economics of data • Privacy and security • Effective use of clouds …
  8. 8. A Strategic Response Research Resources Outcomes • Standards • Commons APIs Reference data sets Workflows Access & Authentication • Workforce • Fundamental • Machine learning • Data mining • Indexing • Predictive modeling … • Applied • Sustainability, governance, economics of data • Privacy and security • Effective use of clouds … • Evaluated pilots • FAIR data • Trained workforce • Best practices • Policies • Effective use of clouds • On-ramps for all ICs
  9. 9. A View Which Includes: • A vibrant research program of: – Fundamental developments in data science – Application of those fundamental developments – Flagship projects to which developments are applied: • PMI, Brain, Moonshot, ECHO • A sustainable data ecosystem – Commons and the FAIR Principles adoption – Cross-cutting activities • Increased workforce training • A changing governance model
  10. 10. The Current Situation • NIH Funded Data – Total data from NIH-funded research currently estimated at 650 PB* – 20 PB of that is in NCBI/NLM (3%) and it is expected to grow by 10 PB this year • Dark Data – Only 12% of data described in published papers is in recognized archives – 88% is dark data^ • Cost – 2007-2014: NIH spent ~$1.2Bn extramurally on maintaining data archives * In 2012 Library of Congress was 3 PB ^ http://www.ncbi.nlm.nih.gov/pubmed/26207759
  11. 11. The Commons - Status • Commons and FAIR principles* adopted across NIH • Development and public release of a prototype Data Discovery Index – DataMed • Feb.: v1.0 • Nov.: v1.5 • Cloud credits being issued for work in the Commons • FOAs for Commons Framework being issued • Commons pilots under way * https://www.ncbi.nlm.nih.gov/pubmed/26978244
  12. 12. Sustainability – Sample Other Activities • Request for Information: Metrics to Assess Value of Biomedical Digital Repositories (NOT-OD-16-133) – To be discussed at Sustainability Session, Wed 1pm • RFA to support community based standards work was released in the fall for May 2017 award, session today 1pm • Funding opportunity announcement: (BD2K) Enhancing the Efficiency and Effectiveness of Digital Curation for Biomedical Big Data (RFA-LM-17-001) Applications due Dec 15
  13. 13. Sustainability – Looking Forward • International collaboration on business models for sustainable data repositories – Sustainable Business Models for Data Repositories (OECD Global Science Forum) – Future of Life Sciences and Biomedical Databases (International Human Frontier Science Program) • NIH long-term data repository support – Federal interagency Workshop on Measuring the Impact of Data Repositories, 2017 – Recommend mechanism(s), review criteria, implementation plan
  14. 14. Example Cross-cutting Activities • International partnerships • Count everything – Secure count query framework • California centers regional meetings • GA4GH – Beacon project
  15. 15. A View Which Includes: • A vibrant research program of: – Fundamental developments in data science – Application of those fundamental developments – Flagship projects to which developments are applied: • PMI, Brain, Moonshot, ECHO • A sustainable data ecosystem – Commons and the FAIR Principles adoption – Cross-cutting activities • Increased workforce training • A changing governance model
  16. 16. NLM • Working Group Report – http://acd.od.nih.gov/reports/Report-NLM-06112015-ACD.pdf – Recommendation – NLM should become the programmatic epicenter for data science at NIH … • Patti Brennan – New NLM director
  17. 17. What We Hope to See in 2020 • New innovations brought about by large and complex data • Evidence of translation, i.e., real application at the point of care • Broad Commons adoption leading to – Improved sharing, reuse and hence cost effectiveness and reproducibility • A balance between what is spent on data vs. what is gained from that data • Policies that are supportive of the above
  18. 18. … for your hard work and to the NIH staff from the ADDS office and from across the ICs who have toiled to make BD2K a success