National Cancer Institute 
U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES 
National Institutes of Health 
NCI Genomics 
Data Commons 
and cloud pilots 
September 2014
Overview 
• National Challenges in Cancer Data 
• Disruptive Technologies 
• NCI Genomics Data Commons 
• NCI Cloud Pilots 
• Building a national learning health system 
for cancer clinical genomics
National Challenges in Cancer 
Informatics 
• Lowering barriers to data access, 
analysis and modeling for cancer 
research 
• Integration of data and learning from 
basic and clinical research with 
cancer care that enables prediction 
and improved outcomes
We need: 
• Open Science (Open Access, Open Data, 
Open Source) and Data Liquidity for the 
cancer community 
• Semantic interoperability through Common Data Elements (CDEs) 
and Case Report Forms mapped to 
standards 
• Sustainable models for informatics 
infrastructure, services, data
Where we are 
Disruptive technologies 
Getting social 
Open access to data
Disruptive Technologies 
• Printing 
• Steam power 
• Transportation 
• Electricity 
• Antibiotics 
• Semiconductors & VLSI design 
• HTTP 
• High throughput biology 
Systems view - end of reductionism?
Disruptive Technologies 
• Printing 
• Steam power 
• Transportation 
• Electricity 
• Antibiotics 
• Semiconductors & VLSI design 
• HTTP 
• High throughput biology 
• Ubiquitous computing 
Everyone is a data provider 
Data immersion 
World: 
6.6B active mobile contracts 
1.9B smart phone contracts 
1.1B land lines 
World population 7.1B 
US: 
345M active mobile contracts 
287M smart phone contracts 
US population 313M
What about social media? 
• Social media may be one avenue for 
modifying behaviors that result in cancer 
• Properly orchestrated, social media can 
have dramatic impact on quality of life 
for patients and survivors 
• It can reach into all segments of our 
society, including underserved populations
Public Health 
• These three modifiable factors (infectious 
disease, smoking, and poor nutrition 
combined with lack of exercise) contribute 
to at least 50% of our current cancer 
burden. The cost from loss of quality of 
life, pain and suffering is incalculable.
Some NCI Big Data activities 
• TCGA, TARGET and ICGC 
– Cancer Genomics Data Commons 
– NCI Cloud Pilots 
• Molecular Clinical Trials: 
– MPACT, MATCH, Exceptional Responders
Data are accumulating!
From the Second Machine Age 
From: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant 
Technologies by Erik Brynjolfsson & Andrew McAfee
Molecular data is Big Data 
• Brief trip down memory lane 
• Sequencing and the Human Genome 
Project
GenBank High Throughput 
Genome Sequence (HTGS)
February 12, 2001
HGP outcomes
Assays and Data Types 
TCGA history 
• Initiated in 2005 
• Collaboration of NHGRI and NCI to 
examine glioblastoma (GBM), lung and ovarian cancer 
using genomic techniques in 2006. 
• Expanded to 20+ tumor types.
TCGA drivers 
• Providing high quality reference sets for 
20+ tissue types 
• Providing a platform for systems biology 
and hypothesis generation 
• Providing a test bed for understanding the 
real world implications of consent and data 
access policies on genomic and clinical 
data.
Focus on TCGA 
• TCGA consortium slides 
• Thanks to Lou Staudt and Jean Claude 
Zenklusen
TCGA – 
Lessons from 
structural 
genomics 
Jean Claude Zenklusen, 
Ph.D. 
Director 
TCGA Program Office 
National Cancer Institute
The Mutational Burden of Human Cancer 
Mike Lawrence and Gaddy Getz 
Increasing genomic 
complexity 
Childhood 
cancers 
Carcinogens
Molecular Subgroups Refine Histological Diagnosis 
of Endometrial Carcinoma 
TCGA, Nature 497:67 (2013) 
Subgroups: 
• POLE (ultra-mutated) 
• MSI (hypermutated) 
• Copy-number low (endometrioid) 
• Copy-number high (serous-like) 
Data tracks: mutations per Mb, PolE, MSI / MSH2, copy number, PTEN, p53, histology (endometrioid, serous) 
Serous misdiagnosed as endometrioid?
Molecular Diagnosis of Endometrial Cancer May 
Influence Choice of Therapy 
TCGA, Nature 497:67 (2013) 
Subgroups: 
• POLE (ultra-mutated) 
• MSI (hypermutated) 
• Copy-number low (endometrioid) 
• Copy-number high (serous-like) 
Data tracks: mutations per Mb, PolE, MSI / MSH2, copy number, PTEN, p53, histology 
Therapy questions: Surgery only? Adjuvant radiotherapy? Adjuvant chemotherapy?
NCI Cancer Genomics Data Commons 
GDC 
NCI Genomics 
Data Commons 
Genomic + 
clinical data 
. . .
NCI Cancer Genomics Data Commons 
GDC 
NCI Genomics 
Data Commons 
Genomic + 
clinical data 
. . . 
Cancer 
information 
donor
Utility of a Cancer Knowledge Base 
GDC 
Identify 
low-frequency 
cancer drivers 
Define genomic 
determinants of response 
to therapy 
Compose clinical trial 
cohorts sharing 
targeted genetic lesions 
Cancer 
information 
donor
Diagram labels: DACO, ICGC, dbGaP, EGA, TCGA, ERA, BAM, Open, Germ Line + EGA id
ICGC 
BAM/FASTQ 
TCGA 
BAM/FASTQ 
ICGC 
Open 
Data 
(includes 
TCGA 
Open Data) 
COSMIC 
Open 
Data
Driver for the Cloud Pilots 
• An inflection point for TCGA is looming 
Chart: TCGA data volume in gigabytes (GB), y-axis from 0 to 2,500,000 GB, x-axis from 7/1/09 to 7/1/14
Local copies and computation 
• Assuming the 2.5 PB TCGA data set 
• Storage and backups ~ $1M US 
• Downloading TCGA data at 10 Gb/sec = 
23 days 
• Size + high dimensionality = high 
computational requirements that grow 
quickly
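The download and storage figures on this slide can be checked with a short back-of-envelope calculation. The sketch below is illustrative only and rests on assumptions the slide does not state: decimal units (1 PB = 10^15 bytes), a fully saturated 10 Gb/sec link, and no protocol overhead or retries. The function and variable names are hypothetical.

```python
# Back-of-envelope check of the figures quoted on the slide.
# Assumptions (not from the talk): decimal units, a fully saturated link,
# and no protocol overhead, retries, or checksum verification time.

SECONDS_PER_DAY = 86_400


def transfer_days(size_bytes: float, link_bits_per_sec: float) -> float:
    """Return the days needed to move size_bytes over a link of the given speed."""
    seconds = size_bytes * 8 / link_bits_per_sec
    return seconds / SECONDS_PER_DAY


TCGA_BYTES = 2.5e15      # 2.5 PB, from the slide
LINK_BPS = 10e9          # 10 Gb/sec, from the slide
STORAGE_USD = 1_000_000  # ~$1M for storage and backups, from the slide

days = transfer_days(TCGA_BYTES, LINK_BPS)
cost_per_gb = STORAGE_USD / (TCGA_BYTES / 1e9)

print(f"Transfer time: {days:.1f} days")          # about 23 days
print(f"Storage cost: ${cost_per_gb:.2f} per GB")
```

Under these idealized assumptions the transfer takes about 23 days, which matches the slide. Real transfers would take longer, which strengthens the case for moving computation to the data.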
Relationship of the Cancer Genomics 
Data Commons and NCI Cloud Pilots 
GDC 
NCI Cloud 
Computational Centers 
Periodic 
Data Freezes 
Search / 
retrieve 
Analysis 
NCI Genomics 
Data Commons
Cancer Genomics Cloud Pilots
NCI Cloud Pilots 
• Funding for up to 3 cloud pilots: 24-month 
pilots that are meant to inform the 
Cancer Genomics Data Commons 
– Explore models for cancer genomics APIs 
– Explore cloud models for data + analysis
NCI Cloud Pilots 
• A way to move computation to the data 
• Sustainable models for providing access 
to data 
• Reproducible pipelines for QA, variant 
calling, knowledge sharing 
• Define genomics/phenomics APIs for 
discovering new variants contributing to 
cancer, enhancing response, modulating 
risk
The future 
• Elastic computing ‘clouds’ 
• Social networks 
• Big Data analytics 
• Precision medicine 
• Measuring health 
• Practicing protective medicine 
Semantic and 
synoptic data 
Intervening 
before health is 
compromised 
Learning systems that enable learning 
from every cancer patient
Thank you 
Warren A. Kibbe 
warren.kibbe@nih.gov
FDA NGS and Big Data Conference September 2014
