SlideShare a Scribd company logo

Human Genome and Big Data Challenges

Presented to the Big Data in Symptoms Research Boot Camp, National Institute of Nursing Research, July 24, 2014

1 of 39
Download to read offline
Human Genome and Big Data Challenges
Philip E. Bourne Ph.D.
Associate Director for Data Science
National Institutes of Health
http://www.slideshare.net/pebourne
The History of Analytics in Biomedical
Research
1980s 1990s 2000s 2010s 2020
Discipline:
Unknown Expt. Driven Emergent Over-sold A Service A Partner A Driver
The Raw Material:
Non-existent Limited /Poor More/Ontologies Big Data/Siloed Open/Integrated
The People:
No name Technicians Industry recognition data scientists Academics
Searls (ed) The Roots in Bioinformatics Series PLOS Comp Biol
Data Science Timeline
6/12
• U54 Centers of Excellence - under review
• U54 BD2K-LINCS– under review
• U24 Data Discovery Index– under review
• R01, R41, R42, R43, R44, U01 software and
analysis methods grants – on-going
• T32, T15, K01, R25 and R26 training awards
– under review
2/14 3/14
“It was the best of times, it was the
worst of times, it was the age of
wisdom, it was the age of foolishness,
it was the epoch of belief, it was the
epoch of incredulity, it was the season
of Light, it was the season of
Darkness, it was the spring of hope, it
was the winter of despair …”
A Tale of Two Numbers
Source Michael Bell http://homepages.cs.ncl.ac.uk/m.j.bell1/blog/?p=830
Myriad Data Types
Other ‘Omic
Imaging Phenotypic
Clinical
Genomic
Exposure
Ad

Recommended

Big Data in Genomics: Opportunities and Challenges
Big Data in Genomics: Opportunities and ChallengesBig Data in Genomics: Opportunities and Challenges
Big Data in Genomics: Opportunities and ChallengesMatthieu Schapranow
 
A Federated In-Memory Database System for Life Sciences
A Federated In-Memory Database System for Life SciencesA Federated In-Memory Database System for Life Sciences
A Federated In-Memory Database System for Life SciencesMatthieu Schapranow
 
Analyze Genomes: A Federated In-Memory Database System For Life Sciences
Analyze Genomes: A Federated In-Memory Database System For Life SciencesAnalyze Genomes: A Federated In-Memory Database System For Life Sciences
Analyze Genomes: A Federated In-Memory Database System For Life SciencesMatthieu Schapranow
 
A Platform for Integrated Genome Data Analysis
A Platform for Integrated Genome Data AnalysisA Platform for Integrated Genome Data Analysis
A Platform for Integrated Genome Data AnalysisMatthieu Schapranow
 
BioNRW: Big Medical Data: Challenge or Potential
BioNRW: Big Medical Data: Challenge or PotentialBioNRW: Big Medical Data: Challenge or Potential
BioNRW: Big Medical Data: Challenge or PotentialMatthieu Schapranow
 
Processing of Big Medical Data in Personalized Medicine: Challenge or Potential
Processing of Big Medical Data in Personalized Medicine: Challenge or PotentialProcessing of Big Medical Data in Personalized Medicine: Challenge or Potential
Processing of Big Medical Data in Personalized Medicine: Challenge or PotentialMatthieu Schapranow
 
Analyze Genomes Services for Precision Medicine
Analyze Genomes Services for Precision MedicineAnalyze Genomes Services for Precision Medicine
Analyze Genomes Services for Precision MedicineMatthieu Schapranow
 

More Related Content

What's hot

Big Medical Data – Challenge or Potential?
Big Medical Data – Challenge or Potential?Big Medical Data – Challenge or Potential?
Big Medical Data – Challenge or Potential?Matthieu Schapranow
 
In-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems MedicineIn-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems MedicineMatthieu Schapranow
 
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...Matthieu Schapranow
 
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...Matthieu Schapranow
 
AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health
AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital HealthAnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health
AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital HealthMatthieu Schapranow
 
Open PHACTS : Linked Data Future Challenges
Open PHACTS : Linked Data Future ChallengesOpen PHACTS : Linked Data Future Challenges
Open PHACTS : Linked Data Future ChallengesSciBite Limited
 
Patient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in PracticePatient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in PracticeMatthieu Schapranow
 
Analyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response AnalysisAnalyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response AnalysisMatthieu Schapranow
 
Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?Matthieu Schapranow
 
In-Memory Apps for Precision Medicine
In-Memory Apps for Precision MedicineIn-Memory Apps for Precision Medicine
In-Memory Apps for Precision MedicineMatthieu Schapranow
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceDr. Haxel Consult
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsManuel Corpas
 
Mashing Up Drug Discovery
Mashing Up Drug DiscoveryMashing Up Drug Discovery
Mashing Up Drug DiscoverySciBite Limited
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Tom Plasterer
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...GigaScience, BGI Hong Kong
 
How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?Matthieu Schapranow
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data managementcunera
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Tom Plasterer
 
TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...keesvb
 

What's hot (20)

"When time matters..."
"When time matters...""When time matters..."
"When time matters..."
 
Big Medical Data – Challenge or Potential?
Big Medical Data – Challenge or Potential?Big Medical Data – Challenge or Potential?
Big Medical Data – Challenge or Potential?
 
In-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems MedicineIn-Memory Data Management for Systems Medicine
In-Memory Data Management for Systems Medicine
 
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
Gesundheit geht uns alle an: Smart Data ermöglicht passendere Entscheidungen...
 
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
A Federated In-Memory Database Computing Platform Enabling Real-Time Analysis...
 
AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health
AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital HealthAnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health
AnalyzeGenomes.com: A Federated In-Memory Database Platform for Digital Health
 
Open PHACTS : Linked Data Future Challenges
Open PHACTS : Linked Data Future ChallengesOpen PHACTS : Linked Data Future Challenges
Open PHACTS : Linked Data Future Challenges
 
Patient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in PracticePatient Journey in Oncology 2025: Molecular Tumour Boards in Practice
Patient Journey in Oncology 2025: Molecular Tumour Boards in Practice
 
Analyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response AnalysisAnalyze Genomes: Drug Response Analysis
Analyze Genomes: Drug Response Analysis
 
Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?Festival of Genomics 2016 London: What to take home?
Festival of Genomics 2016 London: What to take home?
 
In-Memory Apps for Precision Medicine
In-Memory Apps for Precision MedicineIn-Memory Apps for Precision Medicine
In-Memory Apps for Precision Medicine
 
II-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in NiceII-SDV 2015, 20 - 21 April, in Nice
II-SDV 2015, 20 - 21 April, in Nice
 
Finding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics DatasetsFinding and Accessing Human Genomics Datasets
Finding and Accessing Human Genomics Datasets
 
Mashing Up Drug Discovery
Mashing Up Drug DiscoveryMashing Up Drug Discovery
Mashing Up Drug Discovery
 
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
Harnessing Edge Informatics to Accelerate Collaboration in BioPharma (Bio-IT ...
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
 
How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?How will AI affect the patient journey of the future?
How will AI affect the patient journey of the future?
 
Introduction to data management
Introduction to data managementIntroduction to data management
Introduction to data management
 
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
Edge Informatics and FAIR (Findable, Accessible, Interoperable and Reusable) ...
 
TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...TranSMART: How open source software revolutionizes drug discovery through cro...
TranSMART: How open source software revolutionizes drug discovery through cro...
 

Viewers also liked

Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...Matthieu Schapranow
 
Bioinformatics & Genomics December Newsletter
Bioinformatics & Genomics December NewsletterBioinformatics & Genomics December Newsletter
Bioinformatics & Genomics December NewsletterKelly Wickham
 
Big Data & the networked future of Science (at Ignite Seattle 7)
Big Data & the networked future of Science (at Ignite Seattle 7)Big Data & the networked future of Science (at Ignite Seattle 7)
Big Data & the networked future of Science (at Ignite Seattle 7)Deepak Singh
 
Visualizing the genome: Techniques for presenting genome data and annotations
Visualizing the genome: Techniques for presenting genome data and annotationsVisualizing the genome: Techniques for presenting genome data and annotations
Visualizing the genome: Techniques for presenting genome data and annotationsAnn Loraine
 
Interviewing - why some questions are off limits
Interviewing - why some questions are off limitsInterviewing - why some questions are off limits
Interviewing - why some questions are off limitsAnn Loraine
 
Daily Snapshot - 2nd March 2017
Daily Snapshot - 2nd March 2017Daily Snapshot - 2nd March 2017
Daily Snapshot - 2nd March 2017Tracxn
 
Hannes Smarason: Genomics: Forging Patient-Centric Communities
Hannes Smarason: Genomics: Forging Patient-Centric CommunitiesHannes Smarason: Genomics: Forging Patient-Centric Communities
Hannes Smarason: Genomics: Forging Patient-Centric CommunitiesHannes Smárason
 
Hannes Smarason: Progress & Prospects in Genomics
Hannes Smarason: Progress & Prospects in GenomicsHannes Smarason: Progress & Prospects in Genomics
Hannes Smarason: Progress & Prospects in GenomicsHannes Smárason
 
Genomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive BiologyGenomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive BiologyUri Laserson
 
Hannes Smarason: 2015 = An Inflection Point in Genomics
Hannes Smarason: 2015 = An Inflection Point in GenomicsHannes Smarason: 2015 = An Inflection Point in Genomics
Hannes Smarason: 2015 = An Inflection Point in GenomicsHannes Smárason
 
Strata Big Data Science Talk on ADAM
Strata Big Data Science Talk on ADAMStrata Big Data Science Talk on ADAM
Strata Big Data Science Talk on ADAMMatt Massie
 
Genome Analysis Pipelines with Spark and ADAM
Genome Analysis Pipelines with Spark and ADAMGenome Analysis Pipelines with Spark and ADAM
Genome Analysis Pipelines with Spark and ADAMAllen Day, PhD
 
Tracxn Research — Home Improvements Landscape, December 2016
Tracxn Research — Home Improvements Landscape, December 2016Tracxn Research — Home Improvements Landscape, December 2016
Tracxn Research — Home Improvements Landscape, December 2016Tracxn
 
Tracxn Research — New Space Landscape, December 2016
Tracxn Research — New Space Landscape, December 2016Tracxn Research — New Space Landscape, December 2016
Tracxn Research — New Space Landscape, December 2016Tracxn
 
Tracxn Research - PR Tech Landscape, January 2017
Tracxn Research - PR Tech Landscape, January 2017Tracxn Research - PR Tech Landscape, January 2017
Tracxn Research - PR Tech Landscape, January 2017Tracxn
 
Tracxn Research — Legal Tech Landscape, December 2016
Tracxn Research — Legal Tech Landscape, December 2016Tracxn Research — Legal Tech Landscape, December 2016
Tracxn Research — Legal Tech Landscape, December 2016Tracxn
 
Tracxn Research - Smart Cars Landscape, January 2017
Tracxn Research - Smart Cars Landscape, January 2017Tracxn Research - Smart Cars Landscape, January 2017
Tracxn Research - Smart Cars Landscape, January 2017Tracxn
 
Genomics: Big Data Leading to Big Opportunities
Genomics: Big Data Leading to Big OpportunitiesGenomics: Big Data Leading to Big Opportunities
Genomics: Big Data Leading to Big OpportunitiesHannes Smárason
 
Tracxn Research - Wind Energy Landscape, January 2017
Tracxn Research - Wind Energy Landscape, January 2017Tracxn Research - Wind Energy Landscape, January 2017
Tracxn Research - Wind Energy Landscape, January 2017Tracxn
 

Viewers also liked (20)

Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
Festival of Genomics 2016 London: Analyze Genomes: Modeling and Executing Gen...
 
Bioinformatics & Genomics December Newsletter
Bioinformatics & Genomics December NewsletterBioinformatics & Genomics December Newsletter
Bioinformatics & Genomics December Newsletter
 
Big Data & the networked future of Science (at Ignite Seattle 7)
Big Data & the networked future of Science (at Ignite Seattle 7)Big Data & the networked future of Science (at Ignite Seattle 7)
Big Data & the networked future of Science (at Ignite Seattle 7)
 
Visualizing the genome: Techniques for presenting genome data and annotations
Visualizing the genome: Techniques for presenting genome data and annotationsVisualizing the genome: Techniques for presenting genome data and annotations
Visualizing the genome: Techniques for presenting genome data and annotations
 
Interviewing - why some questions are off limits
Interviewing - why some questions are off limitsInterviewing - why some questions are off limits
Interviewing - why some questions are off limits
 
Daily Snapshot - 2nd March 2017
Daily Snapshot - 2nd March 2017Daily Snapshot - 2nd March 2017
Daily Snapshot - 2nd March 2017
 
Hannes Smarason: Genomics: Forging Patient-Centric Communities
Hannes Smarason: Genomics: Forging Patient-Centric CommunitiesHannes Smarason: Genomics: Forging Patient-Centric Communities
Hannes Smarason: Genomics: Forging Patient-Centric Communities
 
Hannes Smarason: Progress & Prospects in Genomics
Hannes Smarason: Progress & Prospects in GenomicsHannes Smarason: Progress & Prospects in Genomics
Hannes Smarason: Progress & Prospects in Genomics
 
Genomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive BiologyGenomics Is Not Special: Towards Data Intensive Biology
Genomics Is Not Special: Towards Data Intensive Biology
 
Hannes Smarason: 2015 = An Inflection Point in Genomics
Hannes Smarason: 2015 = An Inflection Point in GenomicsHannes Smarason: 2015 = An Inflection Point in Genomics
Hannes Smarason: 2015 = An Inflection Point in Genomics
 
Strata Big Data Science Talk on ADAM
Strata Big Data Science Talk on ADAMStrata Big Data Science Talk on ADAM
Strata Big Data Science Talk on ADAM
 
Genome Analysis Pipelines with Spark and ADAM
Genome Analysis Pipelines with Spark and ADAMGenome Analysis Pipelines with Spark and ADAM
Genome Analysis Pipelines with Spark and ADAM
 
Genome Big Data
Genome Big DataGenome Big Data
Genome Big Data
 
Tracxn Research — Home Improvements Landscape, December 2016
Tracxn Research — Home Improvements Landscape, December 2016Tracxn Research — Home Improvements Landscape, December 2016
Tracxn Research — Home Improvements Landscape, December 2016
 
Tracxn Research — New Space Landscape, December 2016
Tracxn Research — New Space Landscape, December 2016Tracxn Research — New Space Landscape, December 2016
Tracxn Research — New Space Landscape, December 2016
 
Tracxn Research - PR Tech Landscape, January 2017
Tracxn Research - PR Tech Landscape, January 2017Tracxn Research - PR Tech Landscape, January 2017
Tracxn Research - PR Tech Landscape, January 2017
 
Tracxn Research — Legal Tech Landscape, December 2016
Tracxn Research — Legal Tech Landscape, December 2016Tracxn Research — Legal Tech Landscape, December 2016
Tracxn Research — Legal Tech Landscape, December 2016
 
Tracxn Research - Smart Cars Landscape, January 2017
Tracxn Research - Smart Cars Landscape, January 2017Tracxn Research - Smart Cars Landscape, January 2017
Tracxn Research - Smart Cars Landscape, January 2017
 
Genomics: Big Data Leading to Big Opportunities
Genomics: Big Data Leading to Big OpportunitiesGenomics: Big Data Leading to Big Opportunities
Genomics: Big Data Leading to Big Opportunities
 
Tracxn Research - Wind Energy Landscape, January 2017
Tracxn Research - Wind Energy Landscape, January 2017Tracxn Research - Wind Energy Landscape, January 2017
Tracxn Research - Wind Energy Landscape, January 2017
 

Similar to Human Genome and Big Data Challenges

The Role of Automated Function Prediction in the Era of Big Data and Small Bu...
The Role of Automated Function Prediction in the Era of Big Data and Small Bu...The Role of Automated Function Prediction in the Era of Big Data and Small Bu...
The Role of Automated Function Prediction in the Era of Big Data and Small Bu...Philip Bourne
 
Biomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital EnterpriseBiomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital EnterprisePhilip Bourne
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data ManagementCarole Goble
 
Workshop intro090314
Workshop intro090314Workshop intro090314
Workshop intro090314Philip Bourne
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHPhilip Bourne
 
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH     Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH Philip Bourne
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangePhilip Bourne
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environmentphilipdurbin
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataSusanna-Assunta Sansone
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
Alain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersAlain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersIncisive_Events
 
Records professionals and Research Data - a new role?
Records professionals and Research Data - a new role?Records professionals and Research Data - a new role?
Records professionals and Research Data - a new role?Rebecca Grant
 

Similar to Human Genome and Big Data Challenges (20)

The Role of Automated Function Prediction in the Era of Big Data and Small Bu...
The Role of Automated Function Prediction in the Era of Big Data and Small Bu...The Role of Automated Function Prediction in the Era of Big Data and Small Bu...
The Role of Automated Function Prediction in the Era of Big Data and Small Bu...
 
Biomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital EnterpriseBiomedical Research as an Open Digital Enterprise
Biomedical Research as an Open Digital Enterprise
 
Yale Day of Data
Yale Day of Data Yale Day of Data
Yale Day of Data
 
A Big Picture in Research Data Management
A Big Picture in Research Data ManagementA Big Picture in Research Data Management
A Big Picture in Research Data Management
 
Workshop intro090314
Workshop intro090314Workshop intro090314
Workshop intro090314
 
The Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIHThe Thinking Behind Big Data at the NIH
The Thinking Behind Big Data at the NIH
 
Data!
Data!Data!
Data!
 
Research data life cycle
Research data life cycleResearch data life cycle
Research data life cycle
 
Critical infrastructure to promote data synthesis
Critical infrastructure to promote data synthesis Critical infrastructure to promote data synthesis
Critical infrastructure to promote data synthesis
 
AMIA 2014
AMIA 2014AMIA 2014
AMIA 2014
 
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH     Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
Ask Not What the NIH Can Do For You; Ask What You Can Do For the NIH
 
Data Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything ChangeData Science Meets Biomedicine, Does Anything Change
Data Science Meets Biomedicine, Does Anything Change
 
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLANINCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
INCLUSION OF DATA ARCHIVES IN DATA MANAGEMENT PLAN
 
Managing, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital EnvironmentManaging, Sharing and Curating Your Research Data in a Digital Environment
Managing, Sharing and Curating Your Research Data in a Digital Environment
 
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific DataNIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
NIH iDASH meeting on data sharing - BioSharing, ISA and Scientific Data
 
Research-Data-Management-and-your-PhD
Research-Data-Management-and-your-PhDResearch-Data-Management-and-your-PhD
Research-Data-Management-and-your-PhD
 
Intro to RDM
Intro to RDMIntro to RDM
Intro to RDM
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
Alain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producersAlain Frey Research Data for universities and information producers
Alain Frey Research Data for universities and information producers
 
Records professionals and Research Data - a new role?
Records professionals and Research Data - a new role?Records professionals and Research Data - a new role?
Records professionals and Research Data - a new role?
 

More from Philip Bourne

AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingPhilip Bourne
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityPhilip Bourne
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?Philip Bourne
 
Data Science Meets Drug Discovery
Data Science Meets Drug DiscoveryData Science Meets Drug Discovery
Data Science Meets Drug DiscoveryPhilip Bourne
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AlonePhilip Bourne
 
BIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchBIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchPhilip Bourne
 
AI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data ScienceAI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data SciencePhilip Bourne
 
What Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewWhat Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewPhilip Bourne
 
Novo Nordisk 080522.pptx
Novo Nordisk 080522.pptxNovo Nordisk 080522.pptx
Novo Nordisk 080522.pptxPhilip Bourne
 
Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Philip Bourne
 
COVID and Precision Education
COVID and Precision EducationCOVID and Precision Education
COVID and Precision EducationPhilip Bourne
 
One View of Data Science
One View of Data ScienceOne View of Data Science
One View of Data SciencePhilip Bourne
 
Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Philip Bourne
 
Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Philip Bourne
 
Data to Advance Sustainability
Data to Advance SustainabilityData to Advance Sustainability
Data to Advance SustainabilityPhilip Bourne
 
Frontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesFrontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesPhilip Bourne
 
Social Responsibility in Research
Social Responsibility in ResearchSocial Responsibility in Research
Social Responsibility in ResearchPhilip Bourne
 
SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?Philip Bourne
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science LandscapePhilip Bourne
 
The UVA School of Data Science
The UVA School of Data ScienceThe UVA School of Data Science
The UVA School of Data SciencePhilip Bourne
 

More from Philip Bourne (20)

AI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We GoingAI+ Now and Then How Did We Get Here And Where Are We Going
AI+ Now and Then How Did We Get Here And Where Are We Going
 
Thoughts on Biological Data Sustainability
Thoughts on Biological Data SustainabilityThoughts on Biological Data Sustainability
Thoughts on Biological Data Sustainability
 
What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?What is FAIR Data and Who Needs It?
What is FAIR Data and Who Needs It?
 
Data Science Meets Drug Discovery
Data Science Meets Drug DiscoveryData Science Meets Drug Discovery
Data Science Meets Drug Discovery
 
Biomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not AloneBiomedical Data Science: We Are Not Alone
Biomedical Data Science: We Are Not Alone
 
BIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in ResearchBIMS7100-2023. Social Responsibility in Research
BIMS7100-2023. Social Responsibility in Research
 
AI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data ScienceAI from the Perspective of a School of Data Science
AI from the Perspective of a School of Data Science
 
What Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's ViewWhat Data Science Will Mean to You - One Person's View
What Data Science Will Mean to You - One Person's View
 
Novo Nordisk 080522.pptx
Novo Nordisk 080522.pptxNovo Nordisk 080522.pptx
Novo Nordisk 080522.pptx
 
Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)Towards a US Open research Commons (ORC)
Towards a US Open research Commons (ORC)
 
COVID and Precision Education
COVID and Precision EducationCOVID and Precision Education
COVID and Precision Education
 
One View of Data Science
One View of Data ScienceOne View of Data Science
One View of Data Science
 
Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?Cancer Research Meets Data Science — What Can We Do Together?
Cancer Research Meets Data Science — What Can We Do Together?
 
Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?Data Science Meets Open Scholarship – What Comes Next?
Data Science Meets Open Scholarship – What Comes Next?
 
Data to Advance Sustainability
Data to Advance SustainabilityData to Advance Sustainability
Data to Advance Sustainability
 
Frontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular ScalesFrontiers of Computing at the Cellular and Molecular Scales
Frontiers of Computing at the Cellular and Molecular Scales
 
Social Responsibility in Research
Social Responsibility in ResearchSocial Responsibility in Research
Social Responsibility in Research
 
SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?SWOT Analysis - What Does it Tell Us?
SWOT Analysis - What Does it Tell Us?
 
The Analytics and Data Science Landscape
The Analytics and Data Science LandscapeThe Analytics and Data Science Landscape
The Analytics and Data Science Landscape
 
The UVA School of Data Science
The UVA School of Data ScienceThe UVA School of Data Science
The UVA School of Data Science
 

Recently uploaded

UniSC Fraser Coast library self-guided tour
UniSC Fraser Coast library self-guided tourUniSC Fraser Coast library self-guided tour
UniSC Fraser Coast library self-guided tourUSC_Library
 
DISCOURSE: TEXT AS CONNECTED DISCOURSE
DISCOURSE:   TEXT AS CONNECTED DISCOURSEDISCOURSE:   TEXT AS CONNECTED DISCOURSE
DISCOURSE: TEXT AS CONNECTED DISCOURSEMYDA ANGELICA SUAN
 
catch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTS
catch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTScatch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTS
catch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTSCarlaNicolas7
 
Diploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdf
Diploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdfDiploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdf
Diploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdfSUMIT TIWARI
 
11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf
11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf
11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdfAynouraHamidova
 
Shapley Tech Talk - SHAP and Shapley Discussion
Shapley Tech Talk - SHAP and Shapley DiscussionShapley Tech Talk - SHAP and Shapley Discussion
Shapley Tech Talk - SHAP and Shapley DiscussionTushar Tank
 
GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...
GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...
GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...Nguyen Thanh Tu Collection
 
Overview of Databases and Data Modelling-1.pdf
Overview of Databases and Data Modelling-1.pdfOverview of Databases and Data Modelling-1.pdf
Overview of Databases and Data Modelling-1.pdfChristalin Nelson
 
BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...
BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...
BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...MohonDas
 
Practical Research 1: Nature of Inquiry and Research.pptx
Practical Research 1: Nature of Inquiry and Research.pptxPractical Research 1: Nature of Inquiry and Research.pptx
Practical Research 1: Nature of Inquiry and Research.pptxKatherine Villaluna
 
ACTIVIDAD DE CLASE No 1 sopa de letras.docx
ACTIVIDAD DE CLASE No 1 sopa de letras.docxACTIVIDAD DE CLASE No 1 sopa de letras.docx
ACTIVIDAD DE CLASE No 1 sopa de letras.docxMaria Lucia Céspedes
 
Intuition behind Monte Carlo Markov Chains
Intuition behind Monte Carlo Markov ChainsIntuition behind Monte Carlo Markov Chains
Intuition behind Monte Carlo Markov ChainsTushar Tank
 
Writing Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdf
Writing Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdfWriting Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdf
Writing Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdfMr Bounab Samir
 
Odontogenesis and its related anomiles.pptx
Odontogenesis and its related anomiles.pptxOdontogenesis and its related anomiles.pptx
Odontogenesis and its related anomiles.pptxMennat Allah Alkaram
 
SOCIAL JUSTICE LESSON ON CATCH UP FRIDAY
SOCIAL JUSTICE LESSON ON CATCH UP FRIDAYSOCIAL JUSTICE LESSON ON CATCH UP FRIDAY
SOCIAL JUSTICE LESSON ON CATCH UP FRIDAYGloriaRamos83
 
Plant Genetic Resources, Germplasm, gene pool - Copy.pptx
Plant Genetic Resources, Germplasm, gene pool - Copy.pptxPlant Genetic Resources, Germplasm, gene pool - Copy.pptx
Plant Genetic Resources, Germplasm, gene pool - Copy.pptxAKSHAYMAGAR17
 
New Features in the Odoo 17 Sales Module
New Features in  the Odoo 17 Sales ModuleNew Features in  the Odoo 17 Sales Module
New Features in the Odoo 17 Sales ModuleCeline George
 
Bayesian Analysis Fundamentals with Examples
Bayesian Analysis Fundamentals with ExamplesBayesian Analysis Fundamentals with Examples
Bayesian Analysis Fundamentals with ExamplesTushar Tank
 

Recently uploaded (20)

UniSC Fraser Coast library self-guided tour
UniSC Fraser Coast library self-guided tourUniSC Fraser Coast library self-guided tour
UniSC Fraser Coast library self-guided tour
 
DISCOURSE: TEXT AS CONNECTED DISCOURSE
DISCOURSE:   TEXT AS CONNECTED DISCOURSEDISCOURSE:   TEXT AS CONNECTED DISCOURSE
DISCOURSE: TEXT AS CONNECTED DISCOURSE
 
catch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTS
catch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTScatch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTS
catch-up-friday-ARALING PNLIPUNAN SOCIAL JUSTICE AND HUMAN RIGHTS
 
Diploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdf
Diploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdfDiploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdf
Diploma 2nd yr PHARMACOLOGY chapter 5 part 1.pdf
 
11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf
11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf
11 CI SINIF SINAQLARI - 1-2023-Aynura-Hamidova.pdf
 
Shapley Tech Talk - SHAP and Shapley Discussion
Shapley Tech Talk - SHAP and Shapley DiscussionShapley Tech Talk - SHAP and Shapley Discussion
Shapley Tech Talk - SHAP and Shapley Discussion
 
GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...
GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...
GIÁO ÁN TIẾNG ANH GLOBAL SUCCESS LỚP 11 (CẢ NĂM) THEO CÔNG VĂN 5512 (2 CỘT) N...
 
Overview of Databases and Data Modelling-1.pdf
Overview of Databases and Data Modelling-1.pdfOverview of Databases and Data Modelling-1.pdf
Overview of Databases and Data Modelling-1.pdf
 
BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...
BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...
BEZA or Bangladesh Economic Zone Authority recruitment exam question solution...
 
Practical Research 1: Nature of Inquiry and Research.pptx
Practical Research 1: Nature of Inquiry and Research.pptxPractical Research 1: Nature of Inquiry and Research.pptx
Practical Research 1: Nature of Inquiry and Research.pptx
 
ACTIVIDAD DE CLASE No 1 sopa de letras.docx
ACTIVIDAD DE CLASE No 1 sopa de letras.docxACTIVIDAD DE CLASE No 1 sopa de letras.docx
ACTIVIDAD DE CLASE No 1 sopa de letras.docx
 
Capter 5 Climate of Ethiopia and the Horn GeES 1011.pdf
Capter 5 Climate of Ethiopia and the Horn GeES 1011.pdfCapter 5 Climate of Ethiopia and the Horn GeES 1011.pdf
Capter 5 Climate of Ethiopia and the Horn GeES 1011.pdf
 
Intuition behind Monte Carlo Markov Chains
Intuition behind Monte Carlo Markov ChainsIntuition behind Monte Carlo Markov Chains
Intuition behind Monte Carlo Markov Chains
 
Lipids as Biopolymer
Lipids as Biopolymer Lipids as Biopolymer
Lipids as Biopolymer
 
Writing Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdf
Writing Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdfWriting Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdf
Writing Agony Letter & If type O+1 & Diphthongs + Text “Arab Science”.pdf
 
Odontogenesis and its related anomiles.pptx
Odontogenesis and its related anomiles.pptxOdontogenesis and its related anomiles.pptx
Odontogenesis and its related anomiles.pptx
 
SOCIAL JUSTICE LESSON ON CATCH UP FRIDAY
SOCIAL JUSTICE LESSON ON CATCH UP FRIDAYSOCIAL JUSTICE LESSON ON CATCH UP FRIDAY
SOCIAL JUSTICE LESSON ON CATCH UP FRIDAY
 
Plant Genetic Resources, Germplasm, gene pool - Copy.pptx
Plant Genetic Resources, Germplasm, gene pool - Copy.pptxPlant Genetic Resources, Germplasm, gene pool - Copy.pptx
Plant Genetic Resources, Germplasm, gene pool - Copy.pptx
 
New Features in the Odoo 17 Sales Module
New Features in  the Odoo 17 Sales ModuleNew Features in  the Odoo 17 Sales Module
New Features in the Odoo 17 Sales Module
 
Bayesian Analysis Fundamentals with Examples
Bayesian Analysis Fundamentals with ExamplesBayesian Analysis Fundamentals with Examples
Bayesian Analysis Fundamentals with Examples
 

Human Genome and Big Data Challenges

  • 1. Human Genome and Big Data Challenges Philip E. Bourne Ph.D. Associate Director for Data Science National Institutes of Health http://www.slideshare.net/pebourne
  • 2. The History of Analytics in Biomedical Research 1980s 1990s 2000s 2010s 2020 Discipline: Unknown Expt. Driven Emergent Over-sold A Service A Partner A Driver The Raw Material: Non-existent Limited /Poor More/Ontologies Big Data/Siloed Open/Integrated The People: No name Technicians Industry recognition data scientists Academics Searls (ed) The Roots in Bioinformatics Series PLOS Comp Biol
  • 3. Data Science Timeline 6/12 • U54 Centers of Excellence - under review • U54 BD2K-LINCS– under review • U24 Data Discovery Index– under review • R01, R41, R42, R43, R44, U01 software and analysis methods grants – on-going • T32, T15, K01, R25 and R26 training awards – under review 2/14 3/14
  • 4. “It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair …”
  • 5. A Tale of Two Numbers Source Michael Bell http://homepages.cs.ncl.ac.uk/m.j.bell1/blog/?p=830
  • 6. Myriad Data Types Other ‘Omic Imaging Phenotypic Clinical Genomic Exposure
  • 7. Growth May Just be Beginning  Evidence: – Google car – 3D printers – Waze – Robotics From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies by Erik Brynjolfsson & Andrew McAfee
  • 8. There are other drivers of change out there besides economics and an increasing emphasis on data and analytics
  • 9. Politicians Demand It: G8 open data charter 9http://opensource.com/government/13/7/open-data-charter-g8
  • 11. The Story of Meredith http://fora.tv/2012/04/20/Congress_Unplugged_ Phil_Bourne Stephen Friend
  • 12. We have Entered An Era of Deinstitutionalize & Democratization of Science Daniel Hulshizer/Associated Press
  • 13. We have Entered An Era of Deinstitutionalize & Democratization of Science – NIH Should Support This Daniel Hulshizer/Associated Press
  • 14. 47/53 “landmark” publications could not be replicated [Begley, Ellis Nature, 483, 2012] [Carole Goble]
  • 15. Scholarship is broken  I have a paper with 16,000 citations that no one has ever read  I have papers in PLOS ONE that have more citations than ones in PNAS  I have data sets I am proud of few places to put them  I edited a journal but it did not count for much
  • 16. It was the age when software developers are in the greatest demand for science.. It was the age when the rewards outside academia are greater than the rewards inside
  • 17. It was a time when patient data are becoming more available It is a time when the ability to maintain the anonymity of a patient gets harder and harder
  • 18. To Summarize Thus Far … A time of great (unprecedented?) scientific development but limited funding A time of upheaval in the way we do science
  • 19. From a funders perspective… A time to squeeze every cent/penny to maximize the amount of research that can be done A time when top down approaches meet bottom up approaches
  • 20. Top Down vs Bottom Up  Top Down – Regulations e.g. US: Common Rule, FISMA, HIPPA – Data sharing policies • OSTP • GWAS • Genome data • Clinical trials – Digital enablement – Moves towards reproducibility  Bottom Up – Communities emerge and crowd source • Collaboration • Data shared • Open source software • Common principles • Standards
  • 21. Okay so what are we doing about it?
  • 22. To start with we are thinking about the complete research lifecycle
  • 23. The Research Life Cycle IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION
  • 24. Tools and Resources Will Continue To Be Developed IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION Authoring Tools Lab Notebooks Data Capture Software Analysis Tools Visualization Scholarly Communication
  • 25. Those Elements of the Research Life Cycle Need to Become More Interconnected Around a Common Framework IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION Authoring Tools Lab Notebooks Data Capture Software Analysis Tools Visualization Scholarly Communication
  • 26. Those Elements of the Research Life Cycle Need to Become More Interconnected Around a Common Framework IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION Authoring Tools Lab Notebooks Data Capture Software Analysis Tools Visualization Scholarly Communication Commercial & Public Tools Git-like Resources By Discipline Data Journals Discipline- Based Metadata Standards Community Portals Institutional Repositories New Reward Systems Commercial Repositories Training
  • 27. Those Elements of the Research Life Cycle Need to Become More Interconnected Around a Common Framework IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION Authoring Tools Lab Notebooks Data Capture Software Analysis Tools Visualization Scholarly Communication Commercial & Public Tools Git-like Resources By Discipline Data Journals Discipline- Based Metadata Standards Community Portals Institutional Repositories New Reward Systems Commercial Repositories Training
  • 28. Associate Director for Data Science Commons Training Center BD2K Modified Review Sustainability* Education* Innovation* Process • Cloud – Data & Compute • Search • Security • Reproducibility Standards • App Store • Coordinate • Hands-on • Syllabus • MOOCs • Community • Centers • Training Grants • Catalogs • Standards • Analysis • Data Resource Support • Metrics • Best Practices • Evaluation • Portfolio Analysis The Biomedical Research Digital Enterprise Communication Collaboration rogrammatic Theme Deliverable Example Features • IC’s • Researchers • Federal Agencies • International Partners • Computer Scientists Scientific Data Council External Advisory Board * Hires made
  • 29. I. Facilitating Broad Use of Biomedical Big Data II. Developing and Disseminating Analysis Methods and Software for Biomedical Big Data III. Enhancing Training for Biomedical Big Data IV. Establishing Centers of Excellence for Biomedical Big Data BD2K: Four Programmatic Areas
  • 30. What are we proposing as that common framework?
  • 31. The Commons Is …  A public/private partnership  An agile development starting with the evaluation of a few pilots  An example: porting DbGAP to the cloud  An experiment with new funding strategies
  • 32. What The Commons Is and Is Not  Is Not: – A database – Confined to one physical location – A new large infrastructure – Owned by any one group  Is: – A conceptual framework – Analogous to the Internet – A collaboratory – A few shared rules • All research objects have unique identifiers • All research objects have limited provenance
  • 33. Sustainability and Sharing: The Commons Data The Long Tail Core Facilities/HS Centers Clinical /Patient The Why: Data Sharing Plans The Commons Government The How: Data Discovery Index Sustainable Storage Quality Scientific Discovery Usability Security/ Privacy Commons == Extramural NCBI == Research Object Sandbox == Collaborative Environment The End Game: KnowledgeNIH Awardees Private Sector Metrics/ Standards Rest of Academia Software Standards Index BD2K Centers Cloud, Research Objects,
  • 34. What Does the Commons Enable?  Dropbox like storage  The opportunity to apply quality metrics  Bring compute to the data  A place to collaborate  A place to discover http://100plus.com/wp-content/uploads/Data-Commons-3- 1024x825.png
  • 35. [Adapted from George Komatsoulis] One Possible Commons Business Model HPC, Institution …
  • 36. Commons Pilots  Define a set of use cases emphasizing: – Openness of the system – Support for basic statistical analysis – Embedding of existing applications – API support into existing resources  Evaluate against the use cases  Review results & business model with NIH leadership  Design a pilot phase with various groups  Conduct pilot for 6-12 months  Evaluate outcomes and determine whether a wider deployment makes sense  Report to NIH leadership summer 2015
  • 37. One Possible End Product 1. User clicks on thumbnail 2. Metadata and a webservices call provide a renderable image that can be annotated 3. Selecting a features provides a database/literature mashup 4. That leads to new papers 1. A link brings up figures from the paper 0. Full text of PLoS papers stored in a database 2. Clicking the paper figure retrieves data from the PDB which is analyzed 3. A composite view of journal and database content results 4. The composite view has links to pertinent blocks of literature text and back to the PDB 1. 2. 3. 4. PLoS Comp. Biol. 2005 1(3) e34
  • 38. Mission Statement To foster an ecosystem that enables biomedical research to be conducted as a digital enterprise that enhances health, lengthens life and reduces illness and disability
  • 39. Some Acknowledgements  Eric Green & Mark Guyer (NHGRI)  Jennie Larkin (NHLBI)  Leigh Finnegan (NHGRI)  Vivien Bonazzi (NHGRI)  Michelle Dunn (NCI)  Mike Huerta (NLM)  David Lipman (NLM)  Jim Ostell (NLM)  Andrea Norris (CIT)  Peter Lyster (NIGMS)  All the over 100 folks on the BD2K team

Editor's Notes

  1. 1 hr
  2. Within biomedical research, many data types Victims of our own success Data production outstrips data handling and analysis Major long-term changes are needed
  3. Ioannidis JPA (2005) Why Most Published Research Findings Are False. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124 http://www.reuters.com/article/2012/03/28/us-science-cancer-idUSBRE82R12P20120328
  4. Federal Information Security Management Act of 2002 The Health Insurance Portability and Accountability Act of 1996