Emerging strategies for computational ADC target selection and prioritization
François Fauteux, Ph.D.
National Research Council Canada
Information and Communication Technologies
World ADC, San Diego
September 20, 2017
Outline
• Data warehousing
• Tumor classification
• ADC target selection
• Alternative splicing
• Genomic alterations
• Business intelligence
Molecular data
• NGS
• Illumina HiSeq
• STAR 2-pass (RNA-seq)
• BWA-MEM (WXS)
• GENCODE 22
• GRCh38.p2
• Microarray
• HG-U133 Plus 2.0
• BrainArray 20
Tissue        Microarray        RNA-seq           Whole exome
              Cancer  Normal    Cancer  Normal    Paired
Blood         1490    2212      134     507       54
Bone marrow   4678    274       151     0         149
Brain         2753    1735      667     1261      396
Breast        6173    584       1097    327       1044
Colon         3858    669       635     395       590
Heart         0       117       0       408       0
Kidney        1185    392       887     160       405
Liver         506     503       371     168       375
Lung          2391    501       1026    428       1066
Muscle        0       381       0       427       0
Ovary         1351    135       374     97        443
Pancreas      286     113       177     174       183
Prostate      356     157       499     158       498
Skin          84      834       103     891       470
Stomach       955     97        375     224       441
Uterus        293     124       603     118       542
Data sources
• Microarray data & metadata
• GEO https://www.ncbi.nlm.nih.gov/geo
• GEOmetadb https://gbnci-abcc.ncifcrf.gov/geo
• NGS data & metadata
• GDC https://gdc.cancer.gov
• dbGaP https://www.ncbi.nlm.nih.gov/gap
• SRA https://www.ncbi.nlm.nih.gov/sra
• Gene/protein annotations
• HGNC http://www.genenames.org
• Entrez Gene https://www.ncbi.nlm.nih.gov/gene
• GENCODE https://www.gencodegenes.org
• UniProt http://www.uniprot.org
• GOA http://www.ebi.ac.uk/GOA
• Projects
• TCGA https://cancergenome.nih.gov
• GTEx https://www.gtexportal.org
• TARGET https://ocg.cancer.gov/programs/target
• Publications
• PubMed https://www.ncbi.nlm.nih.gov/pubmed
• Patents
• USPTO https://www.uspto.gov
• EBI http://www.ebi.ac.uk/patentdata
• Clinical trials
• ClinicalTrials.gov https://clinicaltrials.gov
Cancer mortality
World Health Organization. 2012. GLOBOCAN.
American Cancer Society. 2015. Cancer Facts.
Canadian Cancer Society. 2015. Cancer Statistics.
Site                   World       USA       Canada
Brain                  189,000     15,320    2,081
Breast                 522,000     40,730    5,073
Cervix uteri           266,000     4,100     370
Colorectum             694,000     49,700    9,339
Corpus uteri           76,000      10,170    1,036
Gallbladder            142,000     3,700     265
Kidney                 144,000     14,080    1,773
Leukemia               265,000     24,450    2,705
Liver                  745,000     24,550    1,120
Lung                   1,590,000   158,040   20,896
Melanoma               55,000      9,940     1,145
Multiple myeloma       80,000      11,240    1,367
Non-Hodgkin lymphoma   200,000     19,790    2,656
Oesophagus             400,000     15,590    2,043
Oral                   145,000     8,650     1,227
Ovary                  152,000     14,180    1,739
Pancreas               331,000     40,560    4,590
Prostate               307,000     27,540    4,141
Stomach                723,000     10,720    2,048
Urinary bladder        165,000     16,000    2,306
Lung cancer
• Major histological subtypes
• Adenocarcinoma (ADC), squamous cell carcinoma (SQC), large cell carcinoma (LCC), small cell carcinoma (SCC)
• Heterogeneity within and between subtypes
• Opportunities for target discovery
• Molecular classification
• Clean signal for target selection
Herbst et al. 2008. N Engl J Med.
Travis et al. 2015. J Thorac Oncol.
Class discovery
• Batch effect correction
• Unsupervised feature selection
• Consensus clustering
• Relevant number of clusters
Leek et al. 2012. Bioinformatics.
Wilkerson et al. 2012. Bioinformatics.
Charrad et al. 2014. J Stat Softw.
[Figure: consensus clustering; panels ADC (k = 3) and SQC (k = 2)]
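The unsupervised feature selection step listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the slide does not name the selection criterion, so ranking genes by median absolute deviation (MAD) is an assumption here, and the function names `mad` and `top_variable_features`, the gene labels, and the expression values are all hypothetical.

```python
from statistics import median

def mad(values):
    """Median absolute deviation of a list of numbers."""
    m = median(values)
    return median([abs(v - m) for v in values])

def top_variable_features(expr, n):
    """Return the n feature names with the largest MAD across samples.

    expr maps a feature name to its expression values (one per sample).
    No class labels are used, so the selection is unsupervised.
    """
    ranked = sorted(expr, key=lambda g: mad(expr[g]), reverse=True)
    return ranked[:n]

# Hypothetical expression values for three genes across four samples.
expr = {
    "GENE_A": [5.0, 5.1, 4.9, 5.0],
    "GENE_B": [2.0, 8.0, 3.0, 9.0],
    "GENE_C": [6.0, 6.5, 7.0, 6.2],
}
selected = top_variable_features(expr, 2)
```

The selected features would then feed the clustering step; batch effect correction and consensus clustering themselves are not shown.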
Class prediction
Fauteux et al. 2016. Oncotarget.
ADC target selection
Protein annotation
Alternative splicing
TCGA Research Network. 2014. Nature.
Ritchie et al. 2015. Nucleic Acids Res.
• The majority of multi-exon genes are alternatively spliced
• µ = 7.25 transcripts/protein-coding gene
• Exon-based method
• edgeR/limma: $\log FC_{\text{exon}} - \mu\left(\log FC_{\text{other exons}}\right)$
• Exon to TM/EC domain mapping
• 2,800 TM proteins
• 6,800 EC domains
• 28,500 exons
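The exon-level contrast above (an exon's log fold change minus the mean log fold change of the other exons in the same gene) can be sketched in plain Python. This is only an illustration of the arithmetic, not the edgeR/limma implementation, which fits linear models and computes test statistics. The function name `exon_usage_scores`, the gene and exon labels, and the values are hypothetical.

```python
from statistics import mean

def exon_usage_scores(exon_logfc):
    """For each exon, return its logFC minus the mean logFC of the
    other exons in the same gene.

    exon_logfc maps gene -> {exon_id: logFC}.
    """
    scores = {}
    for gene, exons in exon_logfc.items():
        for exon_id, lfc in exons.items():
            others = [v for k, v in exons.items() if k != exon_id]
            if not others:
                # Single-exon gene: there is nothing to contrast against.
                continue
            scores[(gene, exon_id)] = lfc - mean(others)
    return scores

# Hypothetical per-exon log2 fold changes (tumor vs. normal) for one gene.
example = {"GENE_A": {"e1": 1.0, "e2": 1.0, "e3": 4.0}}
scores = exon_usage_scores(example)
```

In this toy input, exon e3 stands out from the rest of the gene, which is the kind of signal that would then be mapped to a transmembrane or extracellular domain.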
Differential exon usage
Genomic alterations
• Somatic variant calls
• VarScan2 MAF
• LUAD: 334,879
• LUSC: 307,378
• VEP: moderate/high impact
Koboldt et al. 2012. Genome Res.
McLaren et al. 2016. Genome Biol.
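The impact filter above can be sketched as a simple selection on VEP's impact rating, which takes the values HIGH, MODERATE, LOW and MODIFIER. This is a minimal sketch: it assumes each variant record carries an `IMPACT` field, as annotated MAF files typically do. The function name `filter_by_impact` and the example records are hypothetical.

```python
KEEP_IMPACTS = {"MODERATE", "HIGH"}

def filter_by_impact(variants, keep=KEEP_IMPACTS):
    """Keep variant records whose VEP impact rating is in `keep`.

    Each record is a dict; the "IMPACT" key is assumed to hold the
    VEP rating. Records without the key are dropped.
    """
    return [v for v in variants if v.get("IMPACT", "").upper() in keep]

# Hypothetical records, not taken from the LUAD or LUSC call sets.
variants = [
    {"Hugo_Symbol": "GENE_A", "IMPACT": "MODERATE"},
    {"Hugo_Symbol": "GENE_B", "IMPACT": "LOW"},
    {"Hugo_Symbol": "GENE_C", "IMPACT": "HIGH"},
    {"Hugo_Symbol": "GENE_D", "IMPACT": "MODIFIER"},
]
kept = filter_by_impact(variants)
```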
Business intelligence
• Database
• Patents (class 424)
• 130.1 (antibodies): 3,700
• 178.1 (conjugates): 1,400
• Publications: 530,000
• Clinical trials (mAb, ADC): 13,000
• Gene & protein synonyms, aliases
• HGNC, Entrez, UniProt
• Automated document retrieval
• String cleaning, matching
• English dictionary used as decoy
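The string cleaning and matching steps above can be sketched as follows. The slide does not say how the English dictionary is used as a decoy; one possible reading, assumed here, is that aliases which are also ordinary English words are flagged as ambiguous rather than counted as confident matches. The function names `clean` and `match_aliases`, the alias table, and the word list are hypothetical stand-ins for the HGNC/Entrez/UniProt synonym tables and a full dictionary.

```python
import re

def clean(term):
    """Lowercase a term and strip everything except letters and digits."""
    return re.sub(r"[^a-z0-9]+", "", term.lower())

def match_aliases(text, alias_to_symbol, decoy_words):
    """Find gene aliases in a document and flag dictionary collisions.

    Returns {symbol: "match" | "ambiguous"}; an alias is "ambiguous"
    when its cleaned form is also in the decoy (English) word set.
    """
    tokens = {clean(t) for t in re.split(r"\s+", text)}
    tokens.discard("")
    hits = {}
    for alias, symbol in alias_to_symbol.items():
        key = clean(alias)
        if key in tokens:
            hits[symbol] = "ambiguous" if key in decoy_words else "match"
    return hits

# Hypothetical inputs for illustration only.
aliases = {"HER2": "ERBB2", "MET": "MET", "SET": "SET"}
english = {"met", "set", "the", "was"}
text = "Trastuzumab targets HER-2; the MET inhibitor was set aside."
hits = match_aliases(text, aliases, english)
```

Here "HER-2" is normalized to the same key as the alias "HER2", while "MET" and "set" collide with dictionary words and would need further context before being counted.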
Conclusion
• We work with industry
• Focus on Canadian companies
• Co-development, services
• NRC ADC pipeline
• 60 ADC targets, 2,000 mAbs
• Breast, lung, colon, pancreas, ovary
• Current work
• Isoform-level targets
• Antigen sequence optimization
• Immuno-oncology
Acknowledgements
• Ottawa (HHT)
• Jennifer Hill
• Jianjun Li
• Maria Moreno
• Ottawa (ICT)
• Anu Surendra
• Mira Cuperlovic-Culf
• Youlian Pan
• Montreal (HHT)
• André Nantel
• Anne Marcil
• Maria Jaramillo
• Richard Marcotte
• Jean Labrecque
• Vincent Dodelet
Thank you
François Fauteux
Research Officer, Scientific Data Mining
Information and Communication Technologies
Tel: 613-993-0875
Francois.Fauteux@nrc-cnrc.gc.ca
www.nrc-cnrc.gc.ca
