IntOGen, Integrative OncoGenomics
     for personal cancer genomes

                                        Christian Pérez-Llamas
                                        Biomedical Genomics Lab
                                        Pompeu Fabra University




Biomedical Research Park at Barcelona
Overview

DATA
  Oncogenomics data: transcriptomic alterations, copy number alterations, mutations, ...
  Clinical annotations: International Classification of Diseases for Oncology
  Biological modules: functional, regulatory, cancer related, ...

STATISTICS
  Integrative methodologies
    Cancer related genes identification
    Cancer related modules identification
    Combinations of experiments by ICD-O
    Generation of cancer specific modules
  Data management

EXPLORATION
  Web discovery tool: www.intogen.org
  Biomart services: biomart.intogen.org
  Gitools: www.gitools.org
Data

  Types of alterations
     Transcriptomic alterations
     Copy number alterations
        Copy Number Analysis from Sanger Institute
     Mutations

  Selection of experiments
     Public data
     Experiment design: cancer vs normal
     At least 20 samples

  Annotation of tumour type
     International Classification of Diseases for Oncology (ICD-O)
     Manual curation from publication or description
     Progenetix already annotated with ICD-O

  More than 800 experiments
  More than 25000 samples
  Almost 150 ICD-O tumor types
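Statistics: Cancer related genes identification

  STEP 1: identification of driver alterations
     [Figure: for each experiment, a genes x samples matrix (altered / not altered) is reduced to one corrected p-value per gene (scale 0, 0.05, 1).]

  STEP 2: combination of experiments
     [Figure: the per-gene columns of exp. 1, exp. 2, exp. 3, ..., exp. n are combined into a single corrected p-value column for Cancer type A (scale 0, 0.05, 1).]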
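
The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the IntOGen implementation: the slide does not name the statistical tests, so Step 1 is stood in for by a one-sided binomial test against an assumed background alteration rate followed by Benjamini-Hochberg correction, and Step 2 by Stouffer's weighted Z-method with square-root-of-sample-size weights. Gene names, counts, p-values and the background rate are made up for the example.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def stouffer(pvalues, weights):
    """Combine one-sided p-values with the weighted Z-method."""
    norm = NormalDist()
    eps = 1e-12  # keep inv_cdf away from 0 and 1
    z = [norm.inv_cdf(1 - min(max(p, eps), 1 - eps)) for p in pvalues]
    combined_z = sum(w * zi for w, zi in zip(weights, z)) / sqrt(sum(w * w for w in weights))
    return 1 - norm.cdf(combined_z)


# STEP 1 (assumed test): is a gene altered in more samples than expected?
N_SAMPLES = 20
BACKGROUND_RATE = 0.1  # assumed per-sample alteration probability
altered_counts = {"GENE_A": 9, "GENE_B": 2, "GENE_C": 1}  # placeholder data

genes = list(altered_counts)
raw = [binomial_tail(altered_counts[g], N_SAMPLES, BACKGROUND_RATE) for g in genes]
corrected = dict(zip(genes, benjamini_hochberg(raw)))
significant = [g for g in genes if corrected[g] < 0.05]

# STEP 2 (assumed method): combine one gene's p-values across experiments
# of the same cancer type, weighting each experiment by sqrt(sample count).
experiment_pvalues = [0.01, 0.04, 0.20]  # placeholder data
experiment_sizes = [20, 40, 25]
combined = stouffer(experiment_pvalues, [sqrt(n) for n in experiment_sizes])

print(significant, round(combined, 4))
```

Weighting by sample size lets larger experiments contribute more to the combined value; other combination schemes would fit the same slot.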
Statistics: Cancer related modules identification
Exploration
 Web discovery tool    Biomart services         Gitools
  www.intogen.org     biomart.intogen.org   www.gitools.org
Cancer gene prioritization with personal genomes

  [Figure: TUMOUR SAMPLE -> READS -> Mutations, INDELS, Dif. Expr. (differential expression) -> LONG LIST OF ALTERED GENES]
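
One way to read this slide is that the long list of altered genes from a single tumour gets ranked against evidence already collected across many tumours. The sketch below assumes a hypothetical lookup table of per-gene corrected p-values; the gene names and numbers are placeholders, not IntOGen results, and the ranking rule is an illustration rather than the method used in the talk.

```python
def prioritize(altered_genes, corrected_pvalues, threshold=0.05):
    """Split a patient's altered genes into candidates and the rest.

    Genes with a reference corrected p-value below the threshold come
    first, sorted by p-value. Genes missing from the reference are kept
    at the end: no evidence is not the same as evidence of irrelevance.
    """
    known = [(g, corrected_pvalues[g]) for g in altered_genes if g in corrected_pvalues]
    unknown = [g for g in altered_genes if g not in corrected_pvalues]
    ranked = sorted(known, key=lambda item: item[1])
    candidates = [g for g, p in ranked if p < threshold]
    rest = [g for g, p in ranked if p >= threshold] + unknown
    return candidates, rest


# Placeholder reference values and patient gene list.
reference = {"GENE_A": 0.001, "GENE_B": 0.2, "GENE_C": 0.03}
patient_genes = ["GENE_B", "GENE_D", "GENE_C", "GENE_A"]

candidates, rest = prioritize(patient_genes, reference)
print(candidates, rest)
```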
Exploration
 Web discovery tool: www.intogen.org
 Biomart services: biomart.intogen.org
    MartView: biomart.intogen.org
    RESTful Web service: biomart.intogen.org/martservice
    Clients: biomaRt, perl, python, curl
 Gitools: www.gitools.org
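
As a rough illustration of the python route to the RESTful web service, the sketch below builds a query in the XML format that BioMart martservice endpoints accept and turns it into a request URL. The host and path come from the slide; the `http` scheme, the dataset name, and the attribute and filter names are all hypothetical placeholders, since the slide does not list them. Real names would have to be looked up in MartView.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Host and path as shown on the slide; the scheme is an assumption.
MARTSERVICE = "http://biomart.intogen.org/martservice"


def build_query_xml(dataset, attributes, filters=None):
    """Build a BioMart XML query document as a string."""
    query = ET.Element(
        "Query",
        {"virtualSchemaName": "default", "formatter": "TSV", "header": "1", "uniqueRows": "1"},
    )
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE Query>' + body


def query_url(xml, base=MARTSERVICE):
    """URL that passes the XML document in the `query` parameter."""
    return base + "?" + urllib.parse.urlencode({"query": xml})


def run_query(xml, base=MARTSERVICE, timeout=30):
    """Send the query and return the TSV response as text (needs network access)."""
    with urllib.request.urlopen(query_url(xml, base), timeout=timeout) as response:
        return response.read().decode("utf-8")


# All names below are hypothetical placeholders.
xml = build_query_xml(
    dataset="hypothetical_dataset",
    attributes=["hypothetical_gene_id", "hypothetical_corrected_pvalue"],
    filters={"hypothetical_topography": "colon"},
)
url = query_url(xml)
print(url)
```

`run_query` is defined but not called here, so the example runs without network access; the same XML document can be sent from the other clients named on the slide.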
More details...

  IntOGen: Integration and data-mining of multidimensional oncogenomic data
  Gundem G, Perez-Llamas C, Jene-Sanz A, Kedzierska A, Islam A,
  Deu-Pons J, Furney S and Lopez-Bigas N.

  Nature Methods, 7, 92-93 (2010)

  www.intogen.org
  biomart.intogen.org
  www.gitools.org
International Cancer Genome Consortium




   50 cancer types

   500 samples each cancer type

   About 25000 genomes in total
   Data Storage, Analysis & Management
Cancer genomes in the context of IntOGen

  ICGC-CLL genome project (Roderic Guigo lab)
     Samples: 7 CLL, 7 normal
     Technology: RNA-seq
     Alteration: Dif. Expression (Upregulated, Downregulated)

  [Figure: the ICGC-CLL genes x samples matrix (altered / not altered) is shown next to the IntOGen genes x tumours/experiments matrix of corrected p-values (scale 0, 0.05, 1).]

  Enrichment analysis
  [Figure: both matrices are turned into pathways x samples (ICGC-CLL) and pathways x tumours (IntOGen) matrices of corrected p-values (scale 0, 0.05, 1).]
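
The enrichment step, going from altered genes to altered pathways, can be sketched as follows. The slide does not name the test, so a one-sided hypergeometric test is assumed here; pathway names, sizes and overlaps are placeholders. The slide reports corrected p-values, so a multiple-testing correction over all pathways would follow in practice; it is left out to keep the sketch short.

```python
from math import comb


def hypergeom_tail(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K are in the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    # Exact integer arithmetic first, one float division at the end.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


N_GENES = 1000      # genes measured in the sample (placeholder)
N_ALTERED = 100     # genes called altered in the sample (placeholder)

# pathway -> (pathway size, number of its genes among the altered genes)
pathways = {"PATHWAY_X": (50, 15), "PATHWAY_Y": (40, 4)}

pvalues = {
    name: hypergeom_tail(overlap, N_GENES, size, N_ALTERED)
    for name, (size, overlap) in pathways.items()
}

for name, p in sorted(pvalues.items(), key=lambda item: item[1]):
    print(name, p)
```

Repeating this per sample (or per tumour type) and per pathway yields a pathways x samples matrix like the one in the figure.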
Considerations for the next version



    Ethical
    Technological
Ethical considerations

  Open access
     Data that cannot be used to identify individuals:
     age, normalized gene expression, ...

  Controlled access
     Germline genomic data and detailed clinical information
     associated with a unique individual
Technical considerations

  User interfaces (side label: Genome view)
     Management, Gitools, Browser, Biomart, Web services

  IntOGen core (side label: NGS workflows)
     Experiments management, Analysis management, Analysis workflows,
     Data management, Data models, Data importers

  Infrastructure (side label: Web management)
     Hadoop Map-Reduce, Hadoop DFS, Cascading, PIG, Amazon / Eucalyptus,
     Grid Engine, Plain files, MySQL, MongoDB, Bioinformatics software

  Flexibility
● Different ways to access the data
● Methods constantly evolving
● Methods implemented in different languages, with different infrastructure requirements

  Scalability
● Quantity of data increases
● And also the number and complexity of calculations
Summary

  ● IntOGen is a novel framework for oncogenomics data integration
    and analysis.
  ● It integrates many tumor types and different types of alterations in
    a common framework.
  ● It explores the data at different levels, from individual experiments
    to combinations of experiments, and from individual genes to
    biological modules.
  ● It incorporates an intuitive web system designed to be a
    discovery tool for cancer researchers.
  ● I have presented some examples of how to use IntOGen and Gitools
    to prioritize and compare personal genome data.
  ● We are adapting IntOGen to store, analyze and visualize
    next generation sequencing data, which will allow us to incorporate
    data from the ICGC, starting with the Chronic Lymphocytic Leukemia
    data.
  ● Ethical and technological considerations have to be addressed.
Acknowledgements

Biomedical Genomics

  Nuria López-Bigas

  Gunes Gundem
  Jordi Deu-Pons
  Khademul Islam
  Alba Jené-Sanz
  Michael Schroeder
  Xavier Rafael
  Sophia Derdak
  Abel Gonzalez-Pérez
  Armand Gutierrez

IntOGen, Integrative Oncogenomics for Personal Cancer Genomes

  • 1. IntOGen, Integrative OncoGenomics for personal cancer genomes Christian Pérez-Llamas Biomedical Genomics Lab Pompeu Fabra University Biomedical Research Park at Barcelona
  • 2. IntOGen, Integrative OncoGenomics for personal cancer genomes Christian Pérez-Llamas Biomedical Genomics Lab Pompeu Fabra University Biomedical Research Park at Barcelona
  • 3.
  • 4.
  • 5. Overview Oncogenomics data Clinical annotations Biological modules Transcriptomic alterations International Functional DATA Copy Number alterations Classification Regulatory Mutations of Diseases Cancer related ... for Oncology ... Integrative methodologies STATISTICS Cancer related genes identification Data management Cancer related modules identification Combinations of experiments by ICDO Generation of cancer specific modules Web discovery tool Biomart services Gitools EXPLORATION www.intogen.org biomart.intogen.org www.gitools.org
  • 6. Data Transcriptomic alterations Copy number alterations Mutations Copy Number Analysis from Sanger Institute Selection of experiments Public data Experiment design: cancer vs normal At least 20 samples Annotation of tumour type International Classification of Diseases for Oncology (ICD-O) Manual curation from publication or description Progenetix already annotated with ICD-O More than 800 experiments More than 25000 samples Almost 150 ICD-O tumor types
  • 7. Statistics Cancer related genes identification exp. 1 experiment 1 samples STEP 1 identification of genes driver alterations genes altered 0 0.05 1 not altered corrected p-value
  • 8. Statistics Cancer related genes identification Cancer type A exp. 1 exp. 2 exp. 3 exp. n experiment 1 samples STEP 1 STEP 2 identification of combination of genes driver alterations experiments + ... genes altered 0 0.05 1 not altered corrected p-value
  • 9. Statistics Cancer related modules identification
  • 10. Exploration Web discovery tool Biomart services Gitools www.intogen.org biomart.intogen.org www.gitools.org
  • 11.
  • 12.
  • 13.
  • 14.
  • 15. Cancer gene prioritization with personal genomes TUMOUR SAMPLE READS Mutations INDELS Dif. Expr. LONG LIST OF ALTERED GENES
  • 16. Exploration Web discovery tool Biomart services Gitools www.intogen.org biomart.intogen.org www.gitools.org MartView RESTful Web service biomart.intogen.org biomart.intogen.org/martservice biomaRt perl python curl
  • 17. Exploration Web discovery tool Biomart services Gitools www.intogen.org biomart.intogen.org www.gitools.org
  • 18. Exploration Web discovery tool Biomart services Gitools www.intogen.org biomart.intogen.org www.gitools.org
  • 19. Exploration Web discovery tool Biomart services Gitools www.intogen.org biomart.intogen.org www.gitools.org
  • 20. More details... IntOGen: Integration and data-mining of multidimensional oncogenomic data Gundem G, Perez-Llamas C, Jene-Sanz A, Kedzierska A,Islam A, Deu-Pons J, Furney S and Lopez-Bigas N. Nature Methods, 7, 92-93 (2010) www.intogen.org www.gitools.org biomart.intogen.org
  • 21. International Cancer Genome Consortium 50 cancer types 500 samples each cancer type About 25000 genomes in total
  • 22. International Cancer Genome Consortium: 50 cancer types, 500 samples for each cancer type, about 25000 genomes in total. Data storage, analysis and management
  • 23. Cancer genomes in the context of IntOGen. ICGC-CLL genome project: samples: 7 CLL, 7 normal; technology: RNA-seq; alteration: differential expression, upregulated or downregulated (Roderic Guigo lab). [Figure: genes × samples matrix, altered / not altered.]
  • 24. Cancer genomes in the context of IntOGen. ICGC-CLL genome project: samples: 7 CLL, 7 normal; technology: RNA-seq; alteration: differential expression, upregulated or downregulated (Roderic Guigo lab). [Figure: genes × samples matrix for the ICGC-CLL project (altered / not altered) next to the IntOGen genes × tumours / experiments matrix with a corrected p-value scale (0, 0.05, 1).]
  • 25. Cancer genomes in the context of IntOGen. ICGC-CLL genome project: samples: 7 CLL, 7 normal; technology: RNA-seq; alteration: differential expression, upregulated or downregulated (Roderic Guigo lab). [Figure: genes × samples matrix for the ICGC-CLL project (altered / not altered) next to the IntOGen genes × tumours matrix with a corrected p-value scale (0, 0.05, 1).]
  • 26. Cancer genomes in the context of IntOGen. ICGC-CLL genome project: samples: 7 CLL, 7 normal; technology: RNA-seq; alteration: differential expression, upregulated or downregulated (Roderic Guigo lab). Enrichment analysis turns both gene-level matrices into pathway-level matrices. [Figure: genes × samples (ICGC-CLL) and genes × tumours (IntOGen) matrices above; pathways × samples and pathways × tumours matrices below, each with a corrected p-value scale (0, 0.05, 1).]
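The slide names enrichment analysis without stating the test. As a sketch under that caveat, the block below assumes a one-sided hypergeometric test of whether a pathway contains more altered genes than expected by chance; all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size, pathway_size, altered_size, overlap):
    """P(X >= overlap) when altered_size genes are drawn without replacement
    from a universe that contains pathway_size pathway genes."""
    max_overlap = min(pathway_size, altered_size)
    total = comb(universe_size, altered_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, altered_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical sample: 200 altered genes out of 20000, of which 12 fall
# in a pathway of 100 genes (about 1 would be expected by chance).
p_pathway = enrichment_p_value(20000, 100, 200, 12)
```

Running this for every pathway and every sample (or tumour type), followed by multiple-testing correction, would give one row per pathway and one column per sample, matching the layout of the pathway matrices in the figure.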
  • 28. Considerations for the next version: ethical and technological
  • 29. Ethical considerations. Open access: data that cannot be used to identify individuals (age, normalized gene expression, ...). Controlled access: germline genomic data and detailed clinical information associated to a unique individual.
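The two access tiers can be pictured as a per-field policy. This is only a sketch: the field names, the `Access` enum, and the `visible_fields` helper are hypothetical, and unknown fields are assumed to default to controlled access.

```python
from enum import Enum


class Access(Enum):
    OPEN = "open"
    CONTROLLED = "controlled"


# Hypothetical field names, tiered as on the slide.
FIELD_ACCESS = {
    "age": Access.OPEN,
    "normalized_gene_expression": Access.OPEN,
    "germline_genomic_data": Access.CONTROLLED,
    "detailed_clinical_information": Access.CONTROLLED,
}


def visible_fields(record, authorized=False):
    """Return the fields a caller may see; unlisted fields count as controlled."""
    return {
        name: value
        for name, value in record.items()
        if authorized or FIELD_ACCESS.get(name, Access.CONTROLLED) is Access.OPEN
    }


record = {
    "age": 61,
    "normalized_gene_expression": [0.2, 1.7],
    "germline_genomic_data": "placeholder",
    "sample_barcode": "placeholder",
}
public_view = visible_fields(record)
```

Defaulting unlisted fields to controlled access means that a new field added by an importer is not exposed until someone classifies it.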
  • 31. Technical considerations. User interfaces: Management, Gitools, Browser, Biomart, Web services. IntOGen core: experiments management, analysis management, analysis workflows, data management, data models, data importers. Infrastructure: Hadoop Map-Reduce, Hadoop DFS, Cascading, PIG, Amazon / Eucalyptus, bioinformatics software, Grid Engine, plain files, MySQL, MongoDB.
  • 32. Technical considerations. User interfaces: Management, Gitools, Browser, Biomart, Web services. IntOGen core: experiments management, analysis management, analysis workflows, data management, data models, data importers. Infrastructure: Hadoop Map-Reduce, Hadoop DFS, Cascading, PIG, Amazon / Eucalyptus, bioinformatics software, Grid Engine, plain files, MySQL, MongoDB. Added components: Genome view, NGS workflows, Web management.
  • 33. Technical considerations. User interfaces: Management, Gitools, Browser, Biomart, Web services. IntOGen core: experiments management, analysis management, analysis workflows, data management, data models, data importers. Infrastructure: Hadoop Map-Reduce, Hadoop DFS, Cascading, PIG, Amazon / Eucalyptus, bioinformatics software, Grid Engine, plain files, MySQL, MongoDB. Added components: Genome view, NGS workflows, Web management. Flexibility: different ways to access the data; methods constantly evolving; methods implemented in different languages and with different infrastructure requirements. Scalability: the quantity of data increases, and so do the number and complexity of calculations.
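Map-Reduce appears in the infrastructure layer without an example. The following single-process sketch only illustrates the programming model (it does not use Hadoop), counting in how many samples each gene is altered; sample and gene names are hypothetical.

```python
from collections import defaultdict


def map_sample(sample_id, altered_genes):
    """Map step: emit one (gene, 1) pair per altered gene in a sample."""
    for gene in altered_genes:
        yield gene, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each gene key."""
    counts = defaultdict(int)
    for gene, n in pairs:
        counts[gene] += n
    return dict(counts)


# Hypothetical per-sample lists of altered genes.
samples = {
    "sample_1": ["GENE_A", "GENE_B"],
    "sample_2": ["GENE_A"],
    "sample_3": ["GENE_A", "GENE_C"],
}

pairs = (
    pair
    for sample_id, genes in samples.items()
    for pair in map_sample(sample_id, genes)
)
alteration_counts = reduce_counts(pairs)
```

Because each sample is mapped independently and the reduce step only needs the pairs grouped by gene, the same logic can be spread over many machines as the number of samples grows, which is the scalability concern raised on the slide.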
  • 34. Summary. IntOGen is a novel framework for oncogenomics data integration and analysis. It integrates many tumor types and different types of alterations in a common framework. It explores the data at different levels, from individual experiments to combinations of experiments, and from individual genes to biological modules. It incorporates an intuitive web system designed to be a discovery tool for cancer researchers. I have presented some examples of how to use IntOGen and Gitools to prioritize and compare personal genome data. We are adapting IntOGen to store, analyze and visualize next generation sequencing data, which will allow us to incorporate data from the ICGC, starting with the Chronic Lymphocytic Leukemia data. Ethical and technological considerations have to be addressed.
  • 35. Acknowledgements. Biomedical Genomics. Nuria López-Bigas, Gunes Gundem, Jordi Deu-Pons, Khademul Islam, Alba Jené-Sanz, Michael Schroeder, Xavier Rafael, Sophia Derdak, Abel Gonzalez-Pérez, Armand Gutierrez