Genomics, Bioinformatics, and Pathology
DR. DAN GASTON
BEDARD LAB
DEPARTMENT OF PATHOLOGY
DALHOUSIE UNIVERSITY
MAY 13TH, 2015
Genomic Pathology
Healthcare
Research
Innovation
Why Genomics?
Cost
Knowledge Utility
$398,000 -> $0.40
A Short Primer on Common Terms
• NGS: Next-Generation Sequencing. A group of different sequencing technologies defined by high throughput and low cost
• Short Read: The output from most NGS technologies. Reads range from 30 bp to 300 bp
• Mapping: Placing sequencing reads onto a reference genome
• Variant Calling: Identifying sites of genetic variation between a sample and a reference genome
• Paired End: Two short reads from the same fragment of the genome, one from each end
The Players
Next-Gen Sequencing Overview
Illumina Sequencing: The Basics
Targeted Sequencing
The Data
The Data: FASTQ Format
Read ID
Sequence
Quality line
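The three labels above can be illustrated with a small parser. This is a minimal sketch, assuming the common four-line FASTQ layout (an `@` read ID line, the sequence, a `+` separator line, and the quality line) and Phred+33 quality encoding. The record in `EXAMPLE` and the names `FastqRecord`, `parse_fastq`, and `phred_scores` are invented for illustration and are not from the talk.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield one record per four-line FASTQ entry."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of input (or a trailing blank line)
        sequence = handle.readline().strip()
        separator = handle.readline().strip()
        quality = handle.readline().strip()
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header.strip())
        if len(sequence) != len(quality):
            raise ValueError("sequence/quality length mismatch: " + header.strip())
        yield FastqRecord(header.strip()[1:], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality line into per-base Phred scores (assumes Phred+33)."""
    return [ord(char) - offset for char in quality]


# Invented example record, not data from the talk.
EXAMPLE = "@read_1\nAGCTGGGAT\n+\nIIIIIIII#\n"
records = list(parse_fastq(io.StringIO(EXAMPLE)))
print(records[0].read_id, records[0].sequence, phred_scores(records[0].quality))
```

Each quality character encodes the confidence of the base at the same position, which is why the sketch rejects records whose sequence and quality lines differ in length.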
NGS Bioinformatics Workflow
Unpacking the Black Box
• Quality assurance of primary data
• Map short reads to a reference
• Quality assurance of mapping process
• Identify genetic variation (mutations, translocations, etc.)
• Quality assurance of variants
• Variant annotation
• Variant filtering
• Reporting (Text, Visualization)
Genomic Oncology
Tumour Sample DNA
Non-Tumour Sample DNA
Databases and Annotations
Sequence
Tumour-Specific Mutations
Tumour Classification
Drugs
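One way to read the labels above is that variants found in the tumour sample but not in the matched non-tumour sample are treated as tumour-specific. That reading is an assumption, and the sketch below only illustrates it as a set difference. The `Variant` type, the function name, and all coordinates are invented placeholders. Production somatic callers typically model both samples jointly rather than subtracting call sets.

```python
from typing import NamedTuple, Set


class Variant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def tumour_specific(tumour: Set[Variant], normal: Set[Variant]) -> Set[Variant]:
    """Keep variants seen in the tumour sample but absent from the normal sample."""
    return tumour - normal


# Invented placeholder calls, not real data.
tumour_calls = {
    Variant("chr1", 100, "A", "G"),  # also in normal: likely inherited
    Variant("chr2", 250, "C", "T"),  # tumour only
}
normal_calls = {Variant("chr1", 100, "A", "G")}

somatic = tumour_specific(tumour_calls, normal_calls)
print(sorted(somatic))
```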
Short Read Mapping
AGCTGGGATTTCGGAAAAGTCCGATCCCTTTAAGCGAA
AGCTGGGAT
      GATTTCGGAAAA
          TCGGAAAAGTC       TTTAAGCGAA
                        TCCCTTTAAG
                  GTCCGATCCC
             GAAAAGTCCGATCC
   TGGGATTTCGG
         TTCGGAAAAG  CGATCCCTTTAAGC
               AAAGTCCGATCC
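The slide's example can be reproduced with a deliberately naive mapper. This is a sketch only: it places each read by exact substring search, which works here because every read matches the reference perfectly and only once. Real aligners use indexed data structures and tolerate mismatches and gaps. The function name `map_exact` is made up for illustration; the reference and reads are the ones shown above.

```python
from typing import List

REFERENCE = "AGCTGGGATTTCGGAAAAGTCCGATCCCTTTAAGCGAA"
READS = [
    "AGCTGGGAT", "GATTTCGGAAAA", "TCGGAAAAGTC", "TTTAAGCGAA",
    "TCCCTTTAAG", "GTCCGATCCC", "GAAAAGTCCGATCC", "TGGGATTTCGG",
    "TTCGGAAAAG", "CGATCCCTTTAAGC", "AAAGTCCGATCC",
]


def map_exact(read: str, reference: str) -> List[int]:
    """Return every 0-based offset where the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)
    return positions


# Print each read indented to its mapped position under the reference.
print(REFERENCE)
for read in READS:
    for offset in map_exact(read, REFERENCE):
        print(" " * offset + read)
```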
Variant Calling
AGCTGGGATTTCGGAAAAGTCCGATCCCTTTAAGCGAA
AGCTGGGAT
      GATTTCGGAAAA
          TCGGAAAAGTC       TTTAAGCGAA
                        TCCCTTTAAG
                  GTCCGATCCC
             GAAAAGTCCGAGCC
   TGGGATTTCGG
         TTCGGAAAAG  CGAGCCCTTTAAGC
               AAAGTCCGAGCC
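The example above can be worked through with a toy pileup. This is a minimal sketch, not a real variant caller: each read is placed at the ungapped offset with the fewest mismatches, bases are counted per reference position, and any non-reference base above a threshold is reported. The thresholds `min_depth` and `min_alt_fraction` and all function names are assumptions chosen for illustration; production callers use statistical models that account for base and mapping quality.

```python
from collections import Counter
from typing import List, Tuple

REFERENCE = "AGCTGGGATTTCGGAAAAGTCCGATCCCTTTAAGCGAA"
READS = [
    "AGCTGGGAT", "GATTTCGGAAAA", "TCGGAAAAGTC", "TTTAAGCGAA",
    "TCCCTTTAAG", "GTCCGATCCC", "GAAAAGTCCGAGCC", "TGGGATTTCGG",
    "TTCGGAAAAG", "CGAGCCCTTTAAGC", "AAAGTCCGAGCC",
]


def best_placement(read: str, reference: str) -> Tuple[int, int]:
    """Return (offset, mismatches) for the lowest-mismatch ungapped placement."""
    best = (0, len(read) + 1)
    for offset in range(len(reference) - len(read) + 1):
        window = reference[offset:offset + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches < best[1]:
            best = (offset, mismatches)
    return best


def pileup(reads: List[str], reference: str) -> List[Counter]:
    """Count the bases observed at each reference position."""
    columns = [Counter() for _ in reference]
    for read in reads:
        offset, _ = best_placement(read, reference)
        for i, base in enumerate(read):
            columns[offset + i][base] += 1
    return columns


def call_variants(reads: List[str], reference: str, min_depth: int = 3,
                  min_alt_fraction: float = 0.2) -> List[Tuple[int, str, str, int, int]]:
    """Return (1-based position, ref base, alt base, alt count, depth) tuples."""
    calls = []
    for pos, counts in enumerate(pileup(reads, reference)):
        depth = sum(counts.values())
        if depth < min_depth:
            continue
        for base, count in sorted(counts.items()):
            if base != reference[pos] and count / depth >= min_alt_fraction:
                calls.append((pos + 1, reference[pos], base, count, depth))
    return calls


calls = call_variants(READS, REFERENCE)
print(calls)  # [(25, 'T', 'G', 3, 5)]
```

Three of the five reads covering position 25 carry G instead of the reference T, so the site is reported. Two reads still show T, which is the pattern expected for a heterozygous or subclonal variant.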
Variant Annotation
Variant Databases
Public Databases
Clinical Testing Databases
Pharmaceutical Company Databases
Important Considerations
Timeline to Action
Day 0 -> Day 30?
Day 0 -> Day 14?
Day 0 -> Day 7?
Data Storage
Data Storage: Sequencing Centres
Data Storage: Smaller Sequencing Centres
Data Storage: Focused Sequencing
Data Sharing
Clinical Bioinformatics
Validate, validate, validate!
Clinical Reporting
Genetic Variant Reporting
Levels of Evidence/Support
Algorithmic Prediction
Algorithmic Prediction + Gene of Clinical Significance
Clinical Variant in Other Malignancy
Clinical Variant in Malignancy
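The four levels can be encoded as an ordered scale. The sketch below assumes that the order in which the levels were introduced runs from weakest to strongest support, and that "Clinical Variant in Malignancy" means the malignancy under test. Both are interpretations rather than statements from the talk. The enum, the function, and its flags are hypothetical names.

```python
from enum import IntEnum
from typing import Optional


class EvidenceLevel(IntEnum):
    """Assumed ordering: a higher value means stronger support."""
    ALGORITHMIC_PREDICTION = 1
    ALGORITHMIC_PREDICTION_IN_CLINICAL_GENE = 2
    CLINICAL_VARIANT_IN_OTHER_MALIGNANCY = 3
    CLINICAL_VARIANT_IN_MALIGNANCY = 4


def evidence_level(*, predicted_damaging: bool,
                   gene_clinically_significant: bool,
                   clinical_in_other_malignancy: bool,
                   clinical_in_this_malignancy: bool) -> Optional[EvidenceLevel]:
    """Return the strongest level a variant qualifies for, or None."""
    if clinical_in_this_malignancy:
        return EvidenceLevel.CLINICAL_VARIANT_IN_MALIGNANCY
    if clinical_in_other_malignancy:
        return EvidenceLevel.CLINICAL_VARIANT_IN_OTHER_MALIGNANCY
    if predicted_damaging and gene_clinically_significant:
        return EvidenceLevel.ALGORITHMIC_PREDICTION_IN_CLINICAL_GENE
    if predicted_damaging:
        return EvidenceLevel.ALGORITHMIC_PREDICTION
    return None


level = evidence_level(predicted_damaging=True,
                       gene_clinically_significant=True,
                       clinical_in_other_malignancy=False,
                       clinical_in_this_malignancy=False)
print(level.name)
```

Using an `IntEnum` lets variants be sorted or compared by support, for example when ordering a report.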
Variants of Unknown Significance
Algorithmic Prediction
Algorithmic Prediction + Gene of Clinical Significance
Incidental Findings
• ACMG Recommendations
• 56 Genes
• Report on known pathogenic mutations for all
• Report on suspected (predicted) pathogenic for some
• Based on actionability
• Allow for patient opt-out?
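The reporting rules listed above can be sketched as a single decision function. This is an illustration under stated assumptions, not the ACMG policy itself: the gene sets are placeholders (the real list and the subset for which predicted pathogenic variants are reported must come from the published recommendations), and the opt-out flag reflects the open question on the slide rather than a settled rule. All names are hypothetical.

```python
from typing import Set

# Placeholder gene sets; the actual lists come from the ACMG recommendations.
REPORTABLE_GENES: Set[str] = {"GENE_A", "GENE_B", "GENE_C"}
EXPECTED_PATHOGENIC_GENES: Set[str] = {"GENE_A"}  # the "some" on the slide


def should_report(gene: str, classification: str,
                  patient_opted_out: bool = False) -> bool:
    """Decide whether an incidental finding is returned to the clinician."""
    if patient_opted_out:
        return False  # only applies if opt-out is permitted
    if gene not in REPORTABLE_GENES:
        return False
    if classification == "known_pathogenic":
        return True  # reported for every gene on the list
    if classification == "expected_pathogenic":
        return gene in EXPECTED_PATHOGENIC_GENES  # reported for a subset only
    return False


print(should_report("GENE_B", "known_pathogenic"))
print(should_report("GENE_B", "expected_pathogenic"))
```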
The Future
Monitoring For Cancer
Chemotherapy Resistance
Field Sequencing and Real-Time Analysis
Takeaways and Key Points
Conclusions
• Clinical sequencing is here
• There is a learning curve, but the pay-off is potentially huge
• Future proofing
• Be comfortable with genetics
• Make friends with your friendly local bioinformatician
• Leveraging 'Big Data' to make big decisions
• Future: Clinical trials of size 1
Cost
Knowledge Utility
Pathologists
Bioinformaticians Geneticists
