One Funder’s View for Advancing Open Science (Philip Bourne)
Robert Wood Johnson Foundation & SPARC Workshop on October 19, 2015, intended to catalyze a dialogue about opportunities for philanthropy and other funders in open access.
From Research to Practice - New Models for Data-sharing and Collaboration to ... (Health Data Consortium)
Watch the webinar here: http://encore.meetingbridge.com/MB005418/140528/
Webinar transcript: http://hdc.membershipsoftware.org/Files/webinars/HDC-PwC%20NIH%20&%20PCORI%20Webinar%20Transcript%205_28_14.pdf
Patient-Centered Outcomes Research Institute (PCORI) Executive Director Joe Selby, MD, MPH; National Institutes of Health (NIH) Director and PCORI Board of Governors member Francis Collins, MD, PhD; and NIH Associate Director for Data Science Philip Bourne, PhD discussed new and emerging trends in big data for health, including:
- How researchers, patients, clinicians, and others are forging new models for data-sharing.
- Leveraging the quantity, variety, and analytic potential of health-related data for research and practice.
- Addressing patients’ perspectives, needs, and concerns in creating new opportunities for innovation and translational science.
- Exciting initiatives such as PCORnet, the National Patient-Centered Clinical Research Network initiative that PCORI is now helping to develop, and related open data and technology efforts such as the NIH Health Systems Collaboratory and Big Data to Knowledge (BD2K) initiative.
Discover more health data resources on our website at http://www.healthdataconsortium.org/
2016 Data Commons and Data Science Workshop, June 7 and 8, 2016. Covers the Genomic Data Commons, FAIR, NCI, and making data more findable, publicly accessible, interoperable (machine readable), and reusable, with support for recognition and attribution.
Digital transformation to enable a FAIR approach for health data science (Varsha Khodiyar)
Invited talk for ConTech Pharma on 1st March 2022
Abstract
Health Data Research UK is the UK’s national institute for health data science, with a mission to unite the UK’s health data to enable discoveries that improve people’s lives. In this talk, Dr Varsha Khodiyar will outline how HDR UK is bringing together disparate health data from all four countries of the United Kingdom, creating the infrastructure to enable discovery of and access to health data, and convening standards-making bodies to improve data linkage and data reuse. Varsha will also discuss how HDR UK is moving beyond the traditional confines of FAIR data to also ensure that data sharing and data use are transparent and ‘fair’ for the patients and lay public who are the subjects of these datasets.
DOE-NCI Pilots presentation at the Frederick National Laboratory Advisory Com... (Warren Kibbe)
May 2016 FNLAC presentation of the DOE-NCI partnership around three pilots focused on existing projects in NCI and existing NSCI directives and activities in DOE.
Quality analysis of NSF DMP plans - Wayne State University (rds-wayne-edu)
With assistance from WSU's Office of Sponsored Programs Administration, 119 data management plans (half funded and half unfunded) from NSF grants submitted between 2012 and 2014 were obtained for scrutiny. Each DMP was evaluated by two reviewers, who reached consensus on a rating of how well the plan integrated elements required by NSF guidance. Descriptive statistics of the ratings were computed, and the statistical association between ratings and funding status was tested.
Understanding users’ latent intents behind search queries is essential for satisfying a user’s search needs. Search intent mining can help search engines enhance their ranking of search results, enabling new search features like instant answers, personalization, search result diversification, and the recommendation of more relevant ads. Consequently, there has been increasing attention on studying how to effectively mine search intents by analyzing search engine query logs. While state-of-the-art techniques can identify the domain of the queries (e.g. sports, movies, health), identifying domain-specific intent is still an open problem. Among all the topics available on the Internet, health is one of the most important in terms of impact on the user and it is one of the most frequently searched areas. This dissertation presents a knowledge-driven approach for domain-specific search intent mining with a focus on health-related search queries.
First, we identified 14 consumer-oriented health search intent classes based on inputs from focus group studies and on analyses of popular health websites, literature surveys, and an empirical study of search queries. We defined the problem of classifying millions of health search queries into zero or more intent classes as a multi-label classification problem. Popular machine learning approaches for multi-label classification tasks (namely, problem transformation and algorithm adaptation methods) were not feasible because of the difficulty of creating labeled data and because of health domain constraints. Another challenge in solving the search intent identification problem was mapping terms used by laymen to medical terms. To address these challenges, we developed a semantics-driven, rule-based search intent mining approach leveraging rich background knowledge encoded in the Unified Medical Language System (UMLS) and a crowdsourced encyclopedia (Wikipedia). The approach can identify search intent in a disease-agnostic manner and has been evaluated on three major diseases.
While users often turn to search engines to learn about health conditions, a surprising amount of health information is also shared and consumed via social media, such as public social platforms like Twitter. Although Twitter is an excellent information source, identifying informative tweets in the deluge of tweets is the major challenge. We used a hybrid approach consisting of supervised machine learning, rule-based classifiers, and biomedical domain knowledge to facilitate the retrieval of relevant and reliable health information shared on Twitter in real time. Furthermore, we extended our search intent mining algorithm to classify health-related tweets into health categories. Finally, we performed a large-scale study to compare health search intents and the features that contribute to the expression of search intent across 100+ million search queries from smart devices (smartphones/tablets) and personal computers (desktops/laptops).
Why is the NIH investing $100M at the intersection of data science and health research? The NIH seeks to invest in ways to help researchers easily find, access, analyze, and curate research data. Researchers want visual analytics, and to build the database into a “social network” – being able to “friend” or “like” the data.
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09... (dkNET)
Abstract
In this presentation, Susan Gregurick, Ph.D., Associate Director of Data Science and Director, Office of Data Science Strategy at the National Institutes of Health, will share the NIH’s vision for a modernized, integrated FAIR biomedical data ecosystem and the strategic roadmap that NIH is following to achieve this vision. Dr. Gregurick will highlight projects being implemented by team members across the NIH’s 27 institutes and centers and will discuss ways that industry, academia, and other communities can help NIH enable a FAIR data ecosystem. Finally, she will weave in how this strategy is being leveraged to address the COVID-19 pandemic.
Presenter: Susan Gregurick, Ph.D., Associate Director of Data Science and Director, Office of Data Science Strategy at the National Institutes of Health
dkNET Webinar Information: https://dknet.org/about/webinar
Data sharing promotes many goals of the NIH research endeavor. It is particularly important for unique data that cannot be readily replicated. Data sharing allows scientists to expedite the translation of research results into knowledge, products, and procedures to improve human health. Do you know what a data sharing plan should include? Are you aware of common practices and standards for data sharing? Do you know what services are available to help share your data responsibly? This workshop will begin to address these questions. Q&A will follow the presentation. Anyone interested in or planning to apply for NIH funding should attend. Note: The NIH data-sharing policy applies to applicants seeking $500,000 or more in direct costs in any year of the proposed research.
Data Harmonization for a Molecularly Driven Health System (Warren Kibbe)
Maximizing the value of data, computing, and data science in an academic medical center, or 'towards a molecularly informed Learning Health System'. Given in October at the University of Florida in Gainesville.
Clinical trial data wants to be free: Lessons from the ImmPort Immunology Dat... (Barry Smith)
Presentation to the Clinical and Research Ethics Seminar, Clinical and Translational Science Center, Buffalo, January 21, 2014
https://immport.niaid.nih.gov/
http://youtu.be/booqxkpvJMg
Slides from a webinar held ahead of the annual Health Datapalooza in Washington, DC. PCORI was pleased to participate in the latest installment in the Health Data Consortium and PricewaterhouseCoopers (PwC) Innovators in Health Data Series, a webinar featuring PCORI Executive Director Joe Selby, MD, MPH; NIH Director and PCORI Board of Governors member Francis Collins, MD, PhD; and Philip Bourne, PhD, NIH’s Associate Director for Data Science.
Methodologies for Addressing Privacy and Social Issues in Health Data: A Case... (Trilateral Research)
Huge quantities of complex and diverse data are generated every day in healthcare institutions, including clinical documentation (diagnostics, lab data, imaging data, etc.), administrative data, activities and cost data, and R&D data from clinical trials.
Improving health care outcomes with responsible data science (Wessel Kraaij)
Keynote presentation by Wessel Kraaij at the Dutch pattern recognition and image processing society (NVPBV) 29/5/2018, Eindhoven.
This talk discusses:
1. Trends in health care and responsible data science and their intersection
2. Secure federated analytics on distributed data repositories
3. Generating clinically relevant hypotheses from patient forum discussions
This presentation outlines a mechanism for using the power of "Big Data", social networking and technology infrastructure to speed the process of curing a horrible disease.
CINECA webinar slides: Ethics/ELSI considerations - From FAIR to fair data sh... (CINECAProject)
The FAIR principles – standing for Findability, Accessibility, Interoperability, and Reusability – have become the guiding principles for the wider sharing of research data in the life sciences. While FAIR provides guidance for the management of data as well as tools and workflows, the institutional conditions and organizational challenges associated with data sharing need to be taken into account to ensure responsible and fair data practices. This requires considering the context of legal requirements, for instance the principle of fairness and transparency in GDPR, expectations of research participants/data subjects, societal aspects and the “ethics work” that is an integral part of data flows, as well as fairness, equity and benefit sharing within transnational collaborations, which is of utmost importance. This webinar will, from the perspective of ethical, legal and societal implications (ELSI), discuss this broader context of responsible and fair data sharing associated with FAIR.
The “How FAIR are you” webinar series and hackathon aim at increasing and facilitating the uptake of FAIR approaches into software, training materials and cohort data, to facilitate responsible and ethical data and resource sharing and implementation of federated applications for data analysis.
The CINECA webinar series aims to discuss ways to address common challenges and share best practices in the field of cohort data analysis, as well as distribute CINECA project results. All CINECA webinars include an audience Q&A session during which attendees can ask questions and make suggestions. Please note that all webinars are recorded and available for later viewing.
This webinar took place on 15th April 2021 and is part of the CINECA webinar series.
For previous and upcoming CINECA webinars see:
https://www.cineca-project.eu/webinars
Wake up Pharma and look into your Big data (Yigal Aviv)
The vast volumes of medical data collected offer pharma the opportunity to harness the information in big data sets.
Unlocking the potential in these data sources can ultimately lead to improved patient outcomes.
This presentation describes considerations for maximizing the impact of Big Data: its methodology, practical challenges, and implications.
Lessons from the UK: Data access, patient trust & real-world impact with heal... (Varsha Khodiyar)
Slides supporting presentation given at the virtual Beilstein Open Science Symposium in October 2021.
Abstract:
Health Data Research UK’s mission is to unite the UK’s health data to enable discoveries that improve people’s lives. Our 20-year vision is for large scale data and advanced analytics to benefit every patient interaction, clinical trial, biomedical discovery and enhance public health. A key part of HDR UK’s vision is our data portal, the Innovation Gateway. The Gateway facilitates discovery of healthcare data and simplifies data request procedures across multiple data custodians. The Gateway contains metadata on a variety of datasets, including those related to COVID-19, cardiovascular, maternal health, emergency care, primary care, secondary care, acute care, palliative care, biobanks, research cohorts and deeply phenotyped patient cohorts.
From the outset HDR UK has sought the voices, views and experiences of patient and lay-public groups to ensure there is transparency and clear public benefit in the use of the UK’s health data. Patient and public involvement is key to making the Gateway accessible and transparent and to ensuring public confidence in research access to health data. The importance of public outreach combined with providing research access to data is illustrated by HDR UK’s contribution to the UK’s coronavirus pandemic response. HDR UK was tasked by the UK’s Chief Scientific Office to build and facilitate the infrastructure to support the National Core Studies, providing key insights on the evolving situation to UK policy makers during the course of the pandemic.
In this talk, I will show how HDR UK is enabling open science by facilitating the discovery of health data, and simplifying the process of requesting access to multiple datasets. I’ll discuss HDR UK’s approach to embedding transparency on research data usage for patients and public, and summarise some of the key ways in which HDR UK has contributed to the coronavirus pandemic.
The State of Open Data Report by @figshare.
A selection of analyses and articles about open data, curated by Figshare
Foreword by Professor Sir Nigel Shadbolt
OCTOBER 2016
From personal health data to a personalized advice (Wessel Kraaij)
Invited talk at the health track of ICT.OPEN 2018, 20-3-2018
1. Data science challenges related to digital health trends
2. Designing an infrastructure to support secure learning from distributed health data repositories, for personalized health advice
3. Supporting patients with rare diseases with patient driven research and the generation of new hypotheses based on patient experiences.
May 2016 NCI Cancer Center Directors meeting. Data Sharing and the Cancer Genomic Data Commons (GDC). Focus is on cancer genomic and clinical phenotype data.
ODF III - 3.15.16 - Day Two Morning Sessions (Michael Kerr)
Slide presentations delivered during morning sessions of Day Two of the California Statewide Health and Human Services Open DataFest - March 14 - 15, 2016, Sacramento, CA
Trusted! Quest for data-driven and fair health solutions (Sitra / Hyvinvointi)
An inspiring online event on 3 February 2021. We are discussing the future of data-driven health solutions that focus on fairness for all stakeholders: people, business and the public sector. We are asking questions such as: What is fairness in health? What role does trust play in data-driven health services? What needs to change and who needs to act? Most of all, we are launching “The Fair Health Data Challenge“.
Event speakers:
- Jaana Sinipuro, Project Director, IHAN – Human-driven data economy, Sitra
- Dipak Kalra, President, The European Institute for Innovation through Health Data (i~HD)
- Pekka Kahri, Technology Officer, HUS Helsinki University Hospital
- Markus Kalliola, Project Director, Health data 2030, Sitra
- Tiina Härkönen, Leading Specialist, Sitra
Presented online as part of the NASM series in Advancing Drug Discovery; see https://www.nationalacademies.org/event/40883_09-2023_advancing-drug-discovery-data-science-meets-drug-discovery
For a panel discussion at the Association of Research Libraries Spring meeting, April 27, 2022, Montreal: https://www.arl.org/schedule-for-spring-2022-association-meeting/
Frontiers of Computing at the Cellular and Molecular Scales (Philip Bourne)
3 basic points when establishing a new biomedical initiative. Presented at Frontiers of Computing in Health and Society, George Mason University, September 21, 2021.
NITRD Big Data Interagency Working Group Workshop: Pioneering the Future of Federally Supported Data Repositories Jan 13, 2021 - Opening comments on where we are and one suggestion of where we might go with an International Data Science Institute (IDSI) - A blue sky view.
Secure Data Sharing and Related Matters – An NIH View
1. Secure Data Sharing and Related
Matters – An NIH View
Philip E. Bourne, PhD, FACMI
Associate Director for Data Science
National Institutes of Health
October 26, 2015
2. Disclaimer…
I am not a cybersecurity expert and, as an informatician who previously worked primarily in the pre-clinical space, not an expert in security associated with human subjects.
3. Conversation Cards
What is happening that makes a discussion of security important?
What is secure anyway?
How is the NIH responding in this changing landscape?
4. Conversation Cards
What is happening that makes a discussion of security important?
What is secure anyway?
How is the NIH responding in this changing landscape?
5. Big Data in the Life Sciences …
This speaks to something more fundamental than more data …
It speaks to new methodologies, new skills, new emphasis, new cultures, new modes of discovery …
7. The History of Computational Biomedicine According to Bourne
Timeline: 1980s -> 1990s -> 2000s -> 2010s -> 2020
Discipline: Unknown -> Expt. Driven -> Emergent -> Over-sold -> A Service -> A Partner -> A Driver
The Raw Material: Non-existent -> Limited/Poor -> More/Ontologies -> Big Data/Siloed -> Open/Integrated
The People: No name -> Technicians -> Industry recognition -> data scientists -> Academics
Searls (ed.), The Roots of Bioinformatics Series, PLOS Comp Biol
9. We are at a Point of Deception …
Evidence:
– Google car
– 3D printers
– Waze
– Robotics
– Sensors
From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies, by Erik Brynjolfsson & Andrew McAfee
11. We Are At a Point of Deception
The 6D Exponential Framework
Digitization of Basic & Clinical Research & EHRs
Deception
We Are Here
Disruption
Demonetization
Dematerialization
Democratization
Open science
Patient centered health care
12. What Are Some General Implications of Such a Future?
– Open collaborative science becomes increasingly important
– Data and associated analytics become increasingly valuable to scholarship
– Opportunities exist to improve the efficiency of the research enterprise and hence fund more research
– Cooperation between funders will be needed to sustain the emergent digital enterprise
– Current training content and modalities will not match supply to demand
– Balancing accessibility vs security becomes more important yet more complex
13. An Example of That Promise:
Comorbidity Network for 6.2M Danes
Over 14.9 Years
Jensen et al 2014 Nat Comm 5:4022
14. “And that’s why we’re here today. Because something
called precision medicine … gives us one of the greatest
opportunities for new medical breakthroughs that we
have ever seen.”
President Barack Obama
January 30, 2015
15. Precision Medicine Initiative
National Research Cohort
– >1 million U.S. volunteers
– Numerous existing cohorts (many funded by NIH)
– New volunteers
Participants will be centrally involved in design and
implementation of the cohort
They will be able to share genomic data, lifestyle
information, biological samples – all linked to their
electronic health records
17. For the Purposes of this NIH Centric
Digital Discussion:
What is Secure Anyway?
Digital research objects are accessed
only when, how, and by whom
authorized, in accordance with the
wishes of the owner and/or the laws
and policies that define accessibility
18. Some of the Complexities
Research objects
– Narrative
– Data – preclinical and
clinical
– Software
– Publications
Owner
– Individual
– Institution
– Funding agency
– Third party
Governance
– Federal
– Funding agency
– Institutional
– Third party
23. “The HGP changed the norms around data sharing
in biomedical research.”
24. Data Sharing Goes Global: GA4GH
Global Alliance for Genomics and
Health
Accelerating the potential of genomic medicine to
advance human health, by:
– Establishing common framework of approaches to enable
effective, responsible sharing of genomic and clinical data
– Catalyzing data sharing projects that drive and demonstrate
value of data sharing
Alliance*: >350 leading institutions (healthcare, research,
advocacy, life science, IT) representing 35 countries
Working groups (Clinical, Data, Security, Regulatory &
Ethics) assess, prioritize needs
– Form task teams to produce tools, solutions, demonstration
projects
*Statistics as of October 5, 2015
26. A Culture of Sharing
[Timeline figure; years shown: 1999, 2003, 2004, 2007, 2008, 2012, 2014]
Research Tools Policy
NIH Data Sharing Policy
Model Organism Policy
Genome-wide Association (GWAS) Policy
NIH Public Access Policy (Publications)
Big Data to Knowledge (BD2K) Initiative
Genomic Data Sharing (GDS) Policy
Modernization of NIH Clinical Trials
White House Initiative (2013 “Holdren Memo”)
27. Guiding Principle of NIH GWAS Policy
The greatest public benefit will be
realized if data from GWAS are made
available, under terms and conditions
consistent with the informed consent
provided by individual participants, in a
timely manner to the largest possible
number of investigators.
NIH expectation that data would be shared in the
NIH database of Genotype and Phenotype (dbGaP)
30. NIH Public Access Policy for Publications
Ensures public access to published results of all
research funded by NIH since 2008
– Recipients of NIH funds required to submit final peer-
reviewed journal manuscripts to PubMed Central (PMC)
upon acceptance for publication
– Papers must be accessible to the public on PMC no later
than 12 months after publication
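The 12-month requirement above can be expressed as a simple date check. The following is a minimal sketch, not an NIH tool: the function names (`add_months`, `pmc_deadline`, `is_compliant`) are hypothetical, and reading "12 months" as 12 calendar months from the publication date is an assumption made for illustration.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    If the target month is shorter (e.g. Feb 29 -> Feb 28), the day is
    clamped to the last day of that month.
    """
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def pmc_deadline(publication_date, embargo_months=12):
    """Latest date by which the paper should be public on PMC (assumed reading)."""
    return add_months(publication_date, embargo_months)


def is_compliant(publication_date, pmc_release_date):
    """True if the paper became public on PMC on or before the deadline."""
    return pmc_release_date <= pmc_deadline(publication_date)


if __name__ == "__main__":
    published = date(2015, 10, 26)
    print(pmc_deadline(published))
    print(is_compliant(published, date(2016, 9, 1)))
```

How the policy counts the 12 months in edge cases is not stated on the slide, so the clamping rule here is only one plausible choice.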
32. Harnessing Data to Improve Health:
BD2K (Big Data to Knowledge)
NIH’s 6-year initiative to use data science to foster an
open digital ecosystem that will accelerate efficient,
cost-effective biomedical research to enhance health,
lengthen life, and reduce illness and disability
Programs and activities:
Advance discovery for biomedical research
Facilitate use and re-use of biomedical data
Develop analytical methods and software
Enhance biomedical data science training
34. NIH Genomic Data Sharing (GDS)
Policy
Purpose
– Sets forth expectations, responsibilities that ensure broad,
responsible sharing of genomic research data in a timely
manner
Scope
– All NIH-funded research generating large-scale human or
non-human genomic data – and their use for subsequent
research
• Data to be submitted to NIH-designated data repositories
(e.g., dbGaP, GEO, GenBank, WormBase, FlyBase, Rat
Genome Database)
– Applies to all funding mechanisms (grants, contracts,
intramural support) with no minimum threshold for cost
Released August 2014; effective January 25, 2015
gds.nih.gov
36. Modernizing NIH Clinical Trials Activities: The Need
[Figure: NIH-funded trials published within 100 months of completion]
Less than 50% published within 30 months of completion
BMJ 2012;344:d7292
38. Increasing Clinical Trial Transparency
Proposed November 2014; Final Spring 2016 (est.)
Notice of Proposed Rulemaking: Clinical Trials
Registration and Results Submission (FDAAA, Section
801)
– Further implements statutory requirements on private and
public sponsors to register; report results on phase 2, 3,
and 4 trials
– Includes drugs, biologics, and devices (except small
feasibility)
Draft NIH Policy on Clinical Trial Information
Dissemination
– Extends Section 801 requirements to all NIH-funded clinical
trials
– Includes phase 1 trials and trials of non-FDA regulated
interventions such as behavioral trials
42. The Commons
Digital Object Compliance: FAIR
Attributes of digital objects in the Commons
Initial Phase
• Unique digital object identifiers of some type
• A minimal set of searchable metadata
• Physically available in a cloud based Commons provider
• Clear access rules (especially important for human subjects data)
• An entry (with metadata) in one or more indices
Future Phases
• Standard, community based unique digital object identifiers
• Conform to community approved standard metadata for enhanced searching
• Digital objects accessible via open standard APIs
• Physically and logically available to the Commons
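The initial-phase attributes listed above lend themselves to a simple record plus a compliance check. The sketch below is illustrative only: the `DigitalObject` class, its field names, and the example values are all hypothetical and do not describe any actual Commons schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record for a digital object in the Commons."""
    identifier: str = ""        # unique digital object identifier of some type
    metadata: dict = field(default_factory=dict)  # minimal searchable metadata
    cloud_location: str = ""    # location at a cloud based Commons provider
    access_rules: str = ""      # e.g. "open" or "controlled" (assumed labels)
    indices: list = field(default_factory=list)   # indices holding an entry


def missing_initial_phase_attributes(obj):
    """Return the names of initial-phase attributes the object lacks."""
    checks = {
        "unique identifier": bool(obj.identifier),
        "searchable metadata": bool(obj.metadata),
        "cloud availability": bool(obj.cloud_location),
        "clear access rules": bool(obj.access_rules),
        "index entry": bool(obj.indices),
    }
    return [name for name, present in checks.items() if not present]


# Made-up example: everything is present except an index entry.
example = DigitalObject(
    identifier="example:0001",
    metadata={"title": "Example dataset"},
    cloud_location="provider-a/bucket/example-0001",
    access_rules="controlled",
)
print(missing_initial_phase_attributes(example))  # ['index entry']
```

Treating each attribute as merely present or absent keeps the sketch small; the future-phase items (community standard identifiers and metadata, open standard APIs) would need validation against whatever standards the community adopts.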
43. BD2K Targeted Software Topics
Supports innovative analytical methods and software tools
that address critical current and emerging needs of the
biomedical research community
2015 Topics (18 awards, U01s)
– Data Compression
– Data Provenance
– Data Visualization
– Data Wrangling
2016 Topics (U01s, under review)
– Data Privacy
– Data Repurposing
– Applying Metadata
– 2016: Crowdsourcing and interactive Digital Media
(UH2)
44. I not only use all the brains
I have, but all I can borrow.
– Woodrow Wilson
46. NIH…
Turning Discovery Into Health
philip.bourne@nih.gov
https://datascience.nih.gov/
http://www.ncbi.nlm.nih.gov/research/staff/bourne/
Editor's Notes
16 million hospital inpatient events (24.5% of total), 35 million outpatient clinic events (53.6% of total) and 14 million emergency department events (21.9% of total)
Photos: FC tweet; RK screen grab
Images of people from Infographic (NOTE: Image is just a placeholder—Jill will tweak)
Detailed Notes:
National Research Cohort <<OR name of study>>
>1 million U.S. volunteers committed to participating in research
Will combine a number of existing cohorts
Will include Dept of Veterans Affairs Million Veteran Program—note Veteran is singular per http://www.research.va.gov/MVP/
“As biology’s first large-scale project, the HGP paved the way for numerous consortium-based research ventures. The NHGRI alone has been involved in launching more than 25 such projects since 2000. These have presented new challenges to biomedical research — demanding, for instance, that diverse groups from different countries and disciplines come together to share and analyse vast data sets.”
“The HGP changed the norms around data sharing in biomedical research.”
2013 White House Initiative: “Increasing Access to the Results of Federally Funded Scientific Research”
Updated to include numbers through September 2015.
From Dina Paltoo [10/6/15]: “The data in the first slide is for all of dbGaP 2007-2014. The information came from a version of what is on the GDS website (https://gds.nih.gov/19dataaccesscommitteereview_dbGaP.html) and in a Nature Genetics paper (http://www.nature.com/ng/journal/v46/n9/full/ng.3062.html), but results from information that we receive from NCBI.”
The NIH Public Access Policy implements Division F Section 217 of PL 111-8 (Omnibus Appropriations Act, 2009).
http://publicaccess.nih.gov/policy.htm
OSP’s summary:
The NIH Public Access Policy for publications has been a requirement for all recipients of NIH funds since 2008. It implements Division G, Title II, Section 218 of PL 110-161 (Consolidated Appropriations Act, 2008). The NIH Public Access Policy ensures that the public has access to the published results of NIH-funded research. It requires scientists to submit final peer-reviewed journal manuscripts that arise from NIH funds to the digital archive PubMed Central (PMC) upon acceptance for publication. Scientists can also deposit papers through partnerships NIH has established with publishers. To help advance science and improve human health, the Policy requires that NIH-supported papers are accessible to the public on PMC no later than 12 months after publication.
Updated by ADDS group 8/25/15
Figure 2. Cumulative percentage of studies published in a peer reviewed biomedical journal indexed by Medline during 100 months after trial completion among all NIH funded clinical trials registered within ClinicalTrials.gov
Public benefits to clinical trials data-sharing (OSP):
Inform future research and research funding decisions
Mitigate bias (e.g., non publication of results, especially negative results)
Prevent duplication of unsafe trials
Meet ethical obligation to human subjects (i.e., that results inform science)
Increase access to data about marketed products
All contribute to public trust in clinical research
Source: Ross JS, Tse T, Zarin DA, Xu H, Zhou L, Krumholz HM. Publication of NIH funded trials registered in ClinicalTrials.gov: cross-sectional analysis. BMJ 2012;344:d7292.
Text updated by Sarah Carr [10/7/2015] – also changed order to feature NPRM before Draft NIH Policy.
Nearly 900 comments received on NPRM, many simply stating broad support
Final Rule expected Spring 2016
Section 801 of the Food and Drug Administration Amendments Act (FDAAA)