The NCI Cancer Genomics
Cloud (CGC) Pilots
NIH IC Show and Tell
Tuesday, November 8, 2016, from 10:00am – 12:00pm
Steve Tsang
Durga Addepalli
Sean Davis
Cancer Genomic Data Challenges
● > 2.5 PB of TCGA data (WXS, RNA-Seq, WGS)
● Fragmented repositories of cancer genomic data
○ TCGA, TARGET and CGCI each have their own data repositories (DCCs)
○ Sequencing data: BAM files are at CGHub, while VCF/MAF files are at the DCCs
● Assuming the 2.5 PB TCGA data set
○ Storage and data protection cost approximately $2,000,000 per year
○ Downloading the TCGA data at 10 Gb/sec takes about 23 days
○ Only large institutions have the ability to use this data
○ These data types will continue to grow
Slide Courtesy of Tanja Davidsen, NCI
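The download and storage figures on this slide can be checked with simple arithmetic. The sketch below assumes decimal units (1 PB = 10^15 bytes, 10 Gb/sec = 10^10 bits per second) and a link that sustains its full rate for the whole transfer; the function names are illustrative only.

```python
# Back-of-the-envelope check of the slide's figures.
# Assumptions: decimal units and a fully sustained link rate.
PETABYTE = 10**15  # bytes
TERABYTE = 10**12  # bytes
SECONDS_PER_DAY = 86_400


def download_days(size_bytes: float, link_bits_per_sec: float) -> float:
    """Days needed to move size_bytes over a link of the given bit rate."""
    seconds = size_bytes * 8 / link_bits_per_sec
    return seconds / SECONDS_PER_DAY


def cost_per_tb_year(total_cost_per_year: float, size_bytes: float) -> float:
    """Yearly cost per terabyte implied by a total yearly cost."""
    return total_cost_per_year / (size_bytes / TERABYTE)


tcga_bytes = 2.5 * PETABYTE
print(f"{download_days(tcga_bytes, 10e9):.1f} days")                     # 23.1 days
print(f"${cost_per_tb_year(2_000_000, tcga_bytes):,.0f} per TB per year")  # $800 per TB per year
```

Under these assumptions the transfer takes about 2,000,000 seconds, which matches the 23 days quoted above, and the quoted yearly cost works out to roughly $800 per terabyte.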
Cloud Pilots Concept: Co-located Compute & Data
Three Cancer Genomics Cloud Pilot Awardees
http://firecloud.org
FireCloud Concepts
● Data files reside in Google Cloud Storage
● Workspaces
● Tasks and Workflows
● Method Repositories
● Provenance captured for every analysis run (i.e., which version of which method was run on which data at what time)
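The provenance bullet above names four things to capture per run: the method, its version, the data, and the time. A minimal sketch of such a record follows. It is not FireCloud's actual schema; the class, field names, and example values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record: which version of which method ran on which data, and when."""
    method: str
    method_version: str
    inputs: Tuple[str, ...]
    run_at: datetime


def record_run(method: str, method_version: str, inputs: Iterable[str],
               run_at: Optional[datetime] = None) -> ProvenanceRecord:
    """Build an immutable record, stamping the current UTC time if none is given."""
    return ProvenanceRecord(
        method=method,
        method_version=method_version,
        inputs=tuple(inputs),
        run_at=run_at or datetime.now(timezone.utc),
    )


# Placeholder method name and input path, for illustration only.
rec = record_run("example-aligner", "1.0", ["gs://example-bucket/sample_1.bam"])
print(rec.method, rec.method_version, rec.inputs)
```

Making the record frozen reflects the intent that provenance is written once and never edited afterwards.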
FireCloud Overview
● The Workspace is the organizing principle for FireCloud
○ When a workspace is created, a Google bucket is automatically attached to that workspace
● The Data Model is the backbone within the workspace
○ Holds metadata and bucket pointers to input and output
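The relationship described above (one bucket per workspace, and a data model holding metadata plus pointers into that bucket) can be sketched as a small data structure. This is an illustration under stated assumptions, not FireCloud's implementation: the class, the bucket naming scheme, and the entity attributes are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


def _new_bucket() -> str:
    # Hypothetical naming scheme; real bucket names are assigned by the platform.
    return f"gs://example-workspace-{uuid.uuid4().hex[:8]}"


@dataclass
class Workspace:
    name: str
    # A bucket is attached automatically when the workspace is created.
    bucket: str = field(default_factory=_new_bucket)
    # Data model: entity id -> attributes (metadata values and bucket pointers).
    data_model: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def add_entity(self, entity_id: str, **attributes: str) -> None:
        """Store metadata and file pointers for one entity."""
        self.data_model[entity_id] = dict(attributes)

    def file_pointers(self, entity_id: str) -> Dict[str, str]:
        """Return only the attributes that point into cloud storage."""
        attrs = self.data_model[entity_id]
        return {k: v for k, v in attrs.items() if v.startswith("gs://")}


ws = Workspace("demo")
ws.add_entity("sample_1", tissue="tumor", bam=f"{ws.bucket}/sample_1.bam")
print(ws.file_pointers("sample_1"))
```

Keeping pointers rather than file contents in the data model means the large files stay in the bucket, next to the compute that reads them.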
http://cgc.systemsbiology.net/
… is to make TCGA data, together with tools and compute power, available and accessible to a broad range of users through multiple access modes:
❏ Interactive web application
❏ Scripting languages: R, Python, SQL
❏ Direct programmatic access
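To give a feel for the SQL access mode listed above, the sketch below runs an aggregate query from Python. It deliberately uses an in-memory SQLite table as a stand-in for a hosted table: the table name, columns, and rows are made-up placeholders, not real TCGA records, and the platform's own query service and schema will differ.

```python
import sqlite3

# Toy stand-in for a hosted clinical table; every value here is a placeholder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clinical (case_id TEXT, project TEXT)")
conn.executemany(
    "INSERT INTO clinical VALUES (?, ?)",
    [("case-a", "BRCA"), ("case-b", "BRCA"), ("case-c", "LUAD")],
)


def cases_per_project(connection: sqlite3.Connection) -> dict:
    """Count cases per project with a plain SQL GROUP BY."""
    cursor = connection.execute(
        "SELECT project, COUNT(*) FROM clinical GROUP BY project ORDER BY project"
    )
    return dict(cursor.fetchall())


print(cases_per_project(conn))  # {'BRCA': 2, 'LUAD': 1}
```

The point of the example is the shape of the workflow: the query runs where the data lives, and only the small summary comes back to the script.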
❏ Build an open platform that can grow and evolve to satisfy a broad range of users and use-cases
❏ Leverage the best existing tools and technologies, as they are released
❏ Collaborate with the research community in areas of data standards, containers, workflows, etc.
❏ Provide a range of examples and tutorials to get newcomers up and running quickly
http://www.cancergenomicscloud.org/
❖The CGC aims to provide a collaborative environment where researchers can take advantage of co-localized public data (like TCGA) and public tools, and also combine these with their private data and tools.
❖Guiding Principles
➢ Making data available isn’t enough to make it usable.
➢ The best science happens in teams.
➢ Reproducibility shouldn’t be hard.
➢ The impact of TCGA is extended by new data & tools
Seven Bridges Genomics CGC Objectives
❖Explore processed TCGA data for mutations, copy number variations and expression levels.
❖Analyze data from their private cohorts alongside TCGA data.
❖Use standard bioinformatics pipelines to perform analyses.
❖Bring their own analysis tools directly to the TCGA dataset.
❖Collaborate with researchers around the world.
❖Access storage and compute resources on the cloud on demand.
❖Access the CGC using the API as
Seven Bridges Genomics CGC Features
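The last feature above mentions API access. A minimal sketch of preparing an authenticated request follows. The base URL and the X-SBG-Auth-Token header reflect the Seven Bridges CGC API as commonly documented, but both should be treated as assumptions and checked against the platform documentation; the helper names are hypothetical, and nothing is sent over the network unless fetch_json is called.

```python
import json
import urllib.request
from typing import Dict, Tuple

# Assumption: base URL of the CGC API; verify against the platform docs.
API_BASE = "https://cgc-api.sbgenomics.com/v2"


def build_request(path: str, token: str) -> Tuple[str, Dict[str, str]]:
    """Return the URL and headers for an authenticated GET (no network access)."""
    url = f"{API_BASE}/{path.lstrip('/')}"
    headers = {
        "X-SBG-Auth-Token": token,  # assumption: token-based auth header
        "Accept": "application/json",
    }
    return url, headers


def fetch_json(path: str, token: str) -> dict:
    """Send the GET request and decode the JSON response body."""
    url, headers = build_request(path, token)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a valid token, a call such as fetch_json("projects", token) would be expected to list the projects visible to that user; the token is best read from an environment variable rather than stored in the script.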
Acknowledgement
Team CGC - https://goo.gl/f21Lqq
National Cancer Institute CBIIT
CGC Fact sheet - https://cbiit.nci.nih.gov/sites/nci-cbiit/files/Cloud_Pilot_Handout.pdf
Access Cloud Pilots - https://cbiit.nci.nih.gov/ncip/nci-cancer-genomics-cloud-pilots/access-the-cloud-pilot-platforms
Broad Institute - FireCloud - http://firecloud.org
Institute for Systems Biology - Cancer Genomics Cloud - http://cgc.systemsbiology.net/
Seven Bridges Genomics - Cancer Genomics Cloud - http://www.cancergenomicscloud.org/
Attain, LLC - http://www.attain.com/

Editor's Notes

  1. This is good, but I would focus on how the native Google platform has been fully exploited: BigQuery and Google Genomics in addition to Google Cloud Storage.
  2. It would be nice to have a visual of the case explorer or something similar. Do you plan to explain why there are three pilots and what was uniquely evaluated in each of them? Also, do you plan a concluding slide covering (a) next steps from the program's perspective and how these would become part of the Commons vision, or something like that, and (b) a call to action for those who want to use the platforms to access cancer data, the availability of free credits, and/or the option to mimic them for their ICs using the platforms' open-source code?