IDT and PacBio joint presentation—Characterizing
Alzheimer’s Disease candidate genes and transcripts
with targeted, long-read, single-molecule sequencing
Jenny Gu, PhD
Strategic Business Development Manager, PacBio
1
For Research Use Only. Not for use in diagnostics procedures. © Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved.
Characterizing Alzheimer’s Disease candidate genes and
transcripts with targeted, long-read, single-molecule
sequencing
September 27, 2017 / Jenny Gu, Ph.D.
AGENDA
-SMRT Sequencing technology overview
-Recommended IDT capture workflow for
SMRT Sequencing
-Case Study: Alzheimer’s Disease panel
ALZHEIMER’S DISEASE (AD)
Alzheimer’s disease is the most common form of neurodegenerative dementia.
https://www.alz.co.uk/research/WorldAlzheimerReport2015.pdf
Clinical characterization:
Progressive loss of memory and
deficits in thinking, problem
solving, and language
46.8M people living with dementia worldwide (2015), projected to reach 131.5M by 2050
Neuropathological characterization:
Progressive cortical atrophy due to neuronal loss and
characteristic intracellular and extracellular deposits
of insoluble tau and amyloid β proteins
http://www.reverseagingcentre.com/media/links/signs-of-alzheimers/
4
ALZHEIMER’S DISEASE (AD)
-Genetically divided into two different groups: early-onset and late-onset
-Relative risk for first-degree relatives is 3.5 – 7.5
-30 – 48% of AD patients have an affected first-degree relative
Late-onset AD:
- Manifests after 65 years
- Multifactorial with strong genetic
predisposition
- GWAS have identified 20+ genetic risk
loci with small Odds Ratios (1.1 – 2.0
per risk allele) including both common
functional variants and rare and
structural variants
Early-onset AD:
- For 2 – 10% of patients first symptoms
occur in their 20s or 30s.
- Four genes account for 5 – 10% of
early-onset AD:
-APP
-PSEN1
-PSEN2
-APOE
The complex genetic makeup of AD
5
CANDIDATE DISEASE GENES IN ALZHEIMER’S DISEASE (AD)
Many associated genetic
loci contain several genes
Which candidates are involved
in disease risk remains
unclear (20+ genes)
Strategies for assessing
GWAS candidate genes:
-DNA sequencing
-Transcriptome
sequencing
-Proteome studies
-Methylome studies
Cuyvers E. et al. (2016) Genetic variations underlying Alzheimer's disease: evidence from genome-wide association studies and beyond. Lancet Neurol. 15(8),857-68.
Several-decade-long search for risk genes in Alzheimer’s disease
6
SEQUEL SYSTEM
Typical Performance
-Average read length: 10 – 18 kb
-Consensus accuracy: Achieves QV50
-Throughput per cell: 5 – 8 Gb
-SMRT Cells per run: 1 – 16
-Movie lengths: 30 minutes – 10 hours
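The "QV50" consensus accuracy figure can be read through the standard Phred scale, where QV = -10 · log10(error probability). The short sketch below only illustrates that conversion; the function names are hypothetical and are not part of any PacBio software.

```python
import math


def qv_to_error_probability(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_probability_to_qv(p: float) -> float:
    """Convert a per-base error probability back to a Phred-scaled quality value."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


if __name__ == "__main__":
    for qv in (20, 30, 50):
        p = qv_to_error_probability(qv)
        print(f"QV{qv}: about 1 error per {round(1 / p):,} bases")
```

On this scale, QV50 corresponds to roughly one error per 100,000 consensus bases.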
7
TYPICAL DATA
Read lengths >20 kb
Data per SMRT Cell: 5 – 8 Gb
Half of data in reads >20 kb
Top 5% of reads >35 kb
Maximum read lengths >60 kb
Read length data shown from 30 kb size-selected human library on the Sequel System (10-hour movie, 2.0
chemistry) with a total output of 7.6 Gb. Each Sequel System SMRT Cell 1M generates ~365,000 reads.
[Figure: read-length histogram; x-axis: read length (bp), y-axis: reads (#)]
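A statement such as "half of data in reads >20 kb" describes the read-length N50. As a minimal sketch (the function name and the example lengths are hypothetical, not data from this run), the N50 can be computed from a list of read lengths like this:

```python
def read_length_n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one read length")
    total_bases = sum(lengths)
    running = 0
    # Walk from the longest read down until half of the bases are accounted for.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total_bases:
            return length


if __name__ == "__main__":
    # Hypothetical read lengths in bp, for illustration only.
    example = [5_000, 10_000, 20_000, 25_000, 40_000]
    print(read_length_n50(example))
```

An N50 above 20 kb would match the "half of data in reads >20 kb" description on this slide.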
8
BENEFITS OF LONG-READ SEQUENCING FOR
CHARACTERIZING GENOMIC STRUCTURAL VARIATION
Mechanisms underlying structural variant formation in genomic disorders. Carvalho CM et al. Nat Rev Genet. (2016)
Structural variation (SV) is an important
contributor to human diversity and disease
SV is also difficult to characterize
Example SV Types and Mechanisms
Targeted SMRT Sequencing allows scientists to
directly characterize:
• Complete Genes (introns & exons)
• Phased Variants (allelic haplotypes)
• Repetitive Regions
• Regulatory Regions (upstream/downstream)
• Insertions & Deletions
• Copy Number Variations
At high coverage for specific genes or regions of
interest across multiple samples.
9
GENETIC VARIATION SEQUENCING WITH SMRT SEQUENCING
[Figure: variant types detectable with SMRT Sequencing, plotted by size of variant (1 bp to 100 Mb). Variant types: SNPs; small indels; STRs and VNTRs (repeat expansions); large insertions and deletions; mobile elements (L1, Alu, SVA); copy number variation; inversions and translocations; complex variants and large structural rearrangements; phasing of SVs and SNVs; haplotype reconstruction. Annotations: "One PacBio Read Spans Most Variants"; "Assembled PacBio Reads Span Euchromatic Genome Variation".]
10
ADDITIONALLY CHARACTERIZE TRANSCRIPTOME SPLICE
VARIATION WITH LONG-READ SEQUENCING
National Human Genome Research Institute. Bioinformatics: Finding genes. (2013) http://www.genome.gov/25020001
- Proteins and their functions are not only impacted by variants in exonic regions
- Variants in regulatory regions (enhancers/promoters, including methylation) and
intronic regions can also play an important role
- High transcript isoform diversity from alternative splicing
- Obtain full-length transcript sequences with Iso-Seq analysis
11
TRACE VARIANTS TO SPECIFIC ALLELES WITH PHASED
HETEROZYGOUS SNPS
12
CASE STUDY: VARIANT SCREENING IN ALZHEIMER’S DISEASE
WITH LONG-READ SEQUENCING
-Genomic and transcriptomic (cDNA) capture experiment
-Combined data provide better insight on variant-affected gene expression
-Gene panel applied to two AD patients (35 candidate genes):
• Average gDNA fragment size: ~6 kb
• Full-length transcripts ranging from <1 kb – ~10 kb
13
PACBIO TARGETED PROBE-BASED CAPTURE WORKFLOW
(GENOMIC DNA CAPTURE)
EXPERIMENTAL PIPELINE
[Workflow diagram, steps 1 – 8. Stages shown: genomic DNA; shear to 7 kb (6 kb for multiplex); ligate barcoded adapters; amplification; size selection (5-9 kb); probe hybridization, bead capture, wash; amplification and SMRTbell prep. + size selection (5-9 kb); sequencing; analysis]
INFORMATICS PIPELINE
[Workflow diagram, steps 9 – 13: map reads of insert to reference; phasing with SAMtools; bin reads by haplotype; phased allelic consensus sequence; tertiary analysis]
14
BEST PRACTICE SUMMARY: GENOMIC CAPTURE
-Save on project costs by multiplexing and spacing probes up to 1 kb.
-Multiplex up to 12 samples.
-Use PacBio linear barcoded adapters.
-High molecular weight DNA required.
-Size selection is highly recommended to maximize long-read recovery.
-Aim for 100-fold coverage of targeted panel size (full-length gene coverage).
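The multiplexing and coverage guidance above can be turned into a rough yield estimate. The sketch below is only a planning aid under stated assumptions: the panel size and the on-target fraction are hypothetical placeholders (the slides do not give them), and the per-cell yield default uses the low end of the 5 – 8 Gb range quoted for the Sequel System.

```python
import math


def required_total_yield_gb(panel_size_bp, fold_coverage=100, n_samples=1,
                            on_target_fraction=0.5):
    """Estimate total sequencing yield (Gb) needed to reach the target coverage.

    on_target_fraction is an assumed capture efficiency, not a value from the slides.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    on_target_bases = panel_size_bp * fold_coverage * n_samples
    return on_target_bases / on_target_fraction / 1e9


def cells_needed(total_yield_gb, yield_per_cell_gb=5.0):
    """Number of SMRT Cells needed, rounding up to whole cells."""
    return math.ceil(total_yield_gb / yield_per_cell_gb)


if __name__ == "__main__":
    # Hypothetical 2 Mb panel, 12 multiplexed samples, 100-fold coverage each.
    total = required_total_yield_gb(2_000_000, fold_coverage=100, n_samples=12)
    print(f"Estimated yield needed: {total:.1f} Gb, SMRT Cells: {cells_needed(total)}")
```

Replacing the assumed on-target fraction with a measured value from a pilot capture would make the estimate more realistic.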
15
10 kb shear
AD SAMPLES: SHEARED GDNA QC
Recommend starting with HMW gDNA (2 µg)
16
Final library size selected
SMRTBELL LIBRARY QC (SIZE-SELECTED)
17
GRCH38 SUBREAD MAPPING RESULTS
Skeletal muscle: 7.4 GB, 2.2 M reads
Brain: 8.4 GB, 2.5 M reads
18
PACBIO TARGETED PROBE-BASED CAPTURE WORKFLOW
(TRANSCRIPTOME WITH SIZE SELECTION)
EXPERIMENTAL PIPELINE
[Workflow diagram, steps 1 – 8. Stages shown: mRNA; cDNA library + barcodes; amplification; probe hybridization, bead capture, wash; size selection (optional, 5-9 kb); amplification and SMRTbell prep.; sequencing; analysis]
INFORMATICS PIPELINE
[Workflow diagram, steps 9 – 10: Iso-Seq analysis; tertiary analysis]
19
BEST PRACTICE SUMMARY: CDNA CAPTURE
-Recover high-quality RNA transcripts
-Size-selection is optional, but helpful for specific fractions.
-Targeted capture Iso-Seq analysis is recommended to characterize splice
isoforms
-Not recommended for characterizing gene expression levels
-Aim for a minimum of 30-fold coverage per anticipated splice isoform in each sample
-Probes can be designed to exons only and/or including introns
20
AD SAMPLES: MRNA QC
Temporal lobe 1 RNA: RIN = 8.0
Temporal lobe 2 RNA: RIN = 8.1
Recommend RIN > 6 (RNA Integrity Number)
21
EXAMPLE WHOLE TRANSCRIPTOME SMRTBELL LIBRARY
(CDNA)
22
DESIGNING CUSTOM IDT XGEN® LOCKDOWN® CAPTURE PANEL
-Key benefit of xGen® Lockdown® Probes is flexibility in design
-Do not need to redesign existing probe panels
-However, recommend full-gene design by including introns and
exons, plus extra upstream and downstream sequences
-Probes can be spaced up to 1000 bp apart
-Use the same probes for genomic and cDNA capture
FULL-GENE DESIGN
Gene A
Gene B
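The full-gene design described above (introns, exons, and flanking sequence, with probes spaced up to 1000 bp apart) can be sketched as a simple tiling routine. This is a minimal illustration, not IDT's design tool: the 120 nt probe length, the 1 kb flank, the coordinates, and the reading of "spaced" as start-to-start distance are all assumptions made for the example.

```python
def tile_probes(chrom, gene_start, gene_end, flank=1000, probe_len=120,
                max_spacing=1000):
    """Return BED-style (chrom, start, end) probe intervals across a gene plus flanks.

    Probe starts are placed max_spacing bp apart (an assumed interpretation of
    "spaced up to 1000 bp apart"); coordinates are 0-based, half-open.
    """
    region_start = max(0, gene_start - flank)
    region_end = gene_end + flank
    probes = []
    start = region_start
    while start + probe_len <= region_end:
        probes.append((chrom, start, start + probe_len))
        start += max_spacing
    return probes


if __name__ == "__main__":
    # Hypothetical gene coordinates, for illustration only.
    for chrom, start, end in tile_probes("chr1", 10_000, 15_000):
        print(f"{chrom}\t{start}\t{end}")
```

Because the same probes are used for genomic and cDNA capture, a single BED file of this kind would describe the panel for both experiments.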
23
[Figure: Venn diagram of isoforms in Patient 1, Patient 2, and Gencode v25; region counts: 67, 2, 3, 39, 319, 154, 312]
SNPs AND LARGER SVs DISCOVERED IN AD SAMPLES
STUDY RESULTS:
Detected broad range of genomic
variants (SNPs and SVs):
-31 unique SVs ranging from 65 bp to
several kb in size
500+ isoforms found in each patient
-Patient 1: 515 isoforms
-Patient 2: 507 isoforms
88% novel splice isoforms identified
-Only 39 isoforms shared between both
patients and those reported in Gencode v25
24
RIN3 GENE: ~50 bp INSERTION DETECTED
25
ZCWPW1 GENE: ~750 bp DELETION DETECTED IN BOTH
PATIENTS
Patient 1
Patient 2
26
BACE1 GENE: PHASED ALLELES (34 KB)
Heterozygous SNPs can be used to phase alleles across multi-kilobase regions
[Figure tracks: Gene, Probes, Target, Phased SNPs, Phase 0, Phase 1]
27
BIN1 GENE: PHASED ALLELES (63 KB)
Heterozygous SNPs can be used to phase alleles across multi-kilobase regions
[Figure tracks: Gene, Probes, Target, Phased SNPs, Phase 0, Phase 1]
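The idea of using heterozygous SNPs to bin long reads by haplotype can be shown with a toy example. This is a simplified sketch that assumes the two phases are already known; it is not the algorithm used by SAMtools, and the positions, bases, and read names are hypothetical.

```python
def bin_reads_by_haplotype(reads, phase0, phase1):
    """Assign each read to the phase whose alleles it matches at more heterozygous SNPs.

    reads:  dict of read name -> {position: base} observed at SNP positions
    phase0: dict of {position: base} for one haplotype
    phase1: dict of {position: base} for the other haplotype
    """
    bins = {"phase0": [], "phase1": [], "unassigned": []}
    for name, calls in reads.items():
        score0 = sum(1 for pos, base in calls.items() if phase0.get(pos) == base)
        score1 = sum(1 for pos, base in calls.items() if phase1.get(pos) == base)
        if score0 > score1:
            bins["phase0"].append(name)
        elif score1 > score0:
            bins["phase1"].append(name)
        else:
            # Ties and reads covering no informative SNP stay unassigned.
            bins["unassigned"].append(name)
    return bins


if __name__ == "__main__":
    # Hypothetical heterozygous SNPs and reads.
    phase0 = {100: "A", 2500: "C", 7200: "G"}
    phase1 = {100: "G", 2500: "T", 7200: "A"}
    reads = {
        "read1": {100: "A", 2500: "C"},
        "read2": {2500: "T", 7200: "A"},
        "read3": {100: "A", 2500: "T"},
    }
    print(bin_reads_by_haplotype(reads, phase0, phase1))
```

A multi-kilobase read is more likely to span two or more heterozygous SNPs, which is what allows phase blocks to extend across regions such as the 34 kb and 63 kb examples.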
28
MAPT GENE RESULTS FOR PATIENT 1
-Detected a heterozygous deletion
-One allele is transcribed into 21 isoforms and the other only into 5
-Detected a novel exon and transcript
Heterozygous genomic variants can be linked to
corresponding expressed transcripts
29
ZCWPW1 GENE: RETAINED INTRONS AND NEW EXONS
[Figure: ZCWPW1 transcripts for Patient 1 and Patient 2, with a retained intron and a novel exon annotated]
30
CONCLUSION
-AD has a large economic impact on global society (2010: $604B)
-To date, 20+ putative genetic risk variants have been mapped
-Associated SNPs are usually not the true causative variant
-Combining gDNA and cDNA data is more informative
-Custom IDT xGen® Lockdown® Panels allow flexibility to scale projects
-SMRT Sequencing provides multi-kilobase phased alleles and full-length transcripts
http://www.mvcenters.com/2015/02/11/dementia-takes-toll-claims-another-american-great-dean-smith/
“Structural variants can be more informative for disease diagnostics,
prognostics and translation than current SNP mapping and exon sequencing.”
Roses A.D. et al. (2016) Structural variants can be more informative for disease diagnostics, prognostics and translation than current SNP mapping and exon sequencing. Expert Opin Drug Metab Toxicol. 12(2),135-47.
31
ACKNOWLEDGEMENT
Kevin Eng
Ting Hon
Elizabeth Tseng
Aaron Wenger
William Rowell
Jenny Ekholm
Steve Kujawa
Kristina Giorda
Jiashi Wang
Mirna Jarosz
Visit PacBio Blog for new announcements and updates on Targeted Sequencing!
http://www.pacb.com/blog
http://www.pacb.com/applications/targeted-sequencing/
Feel free to contact Jenny Gu (jgu@pacb.com).
For Research Use Only. Not for use in diagnostics procedures. © Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo,
PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx.
FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. xGen and Lockdown are trademarks of Integrated DNA Technologies, Inc.
All other trademarks are the sole property of their respective owners.
www.pacb.com
gDNA Capture
Supplemental Information
PACBIO POLYMERASE READS
Skeletal muscle
Brain
35
SMRT LINK PROVIDES BASIC PROCESSING OF RAW DATA FOR
TARGETED CAPTURE ENRICHMENT STUDIES
SMRT Analysis produces:
-Filtered subreads
-Circular consensus sequences
-Alignment to reference (BAM files)
-Iso-Seq full-length transcripts
36
BIOINFORMATICS WORKFLOW FOR PHASING ALLELES
Github: Targeted phasing consensus (genomic capture)
[Workflow diagram, steps 1 – 12. Legend: Data; SMRTLink; Command line tools; Third party software. Stages shown: raw data (subreads); SMRTLink CCS reads; SMRTLink aligned BAM file; visualize in IGV 3.0; probe *.bed; capture2target.py; defined phase blocks; subset and phase with samtools; phased alleles/region; polish with PacBio arrow (cmdline); phased consensus sequences (*.fasta), >99.9% accuracy (dependent on coverage)]
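The "subset and phase" stage of this workflow can be sketched as two samtools calls: one to extract reads for a target region and one to phase them. The helper below only builds the command lines and does not run them. The file names, the region string, and the output prefix are hypothetical, and the capture2target.py and arrow steps are left out because their interfaces are not described in the slides.

```python
import shlex


def build_phasing_commands(aligned_bam, region, out_prefix):
    """Build samtools commands that subset an indexed BAM to a region and phase it.

    Returns a list of argument lists suitable for subprocess.run().
    """
    subset_bam = f"{out_prefix}.subset.bam"
    return [
        # Extract alignments overlapping the target region (input BAM must be indexed).
        ["samtools", "view", "-b", "-o", subset_bam, aligned_bam, region],
        # Phase heterozygous SNPs; -b sets the prefix for the phased BAM outputs.
        ["samtools", "phase", "-b", f"{out_prefix}.phased", subset_bam],
    ]


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    for cmd in build_phasing_commands("aligned.bam", "chr1:9000-16000", "target1"):
        print(shlex.join(cmd))
```

Keeping the commands as argument lists avoids shell-quoting problems if they are later passed to `subprocess.run`.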
37

More Related Content

What's hot

What's hot (20)

Genome editing techniques
Genome editing techniquesGenome editing techniques
Genome editing techniques
 
Lectut btn-202-ppt-l6. cosmids and phagemids
Lectut btn-202-ppt-l6. cosmids and phagemidsLectut btn-202-ppt-l6. cosmids and phagemids
Lectut btn-202-ppt-l6. cosmids and phagemids
 
Transfection
TransfectionTransfection
Transfection
 
Transfection method
Transfection methodTransfection method
Transfection method
 
Genome annotation 2013
Genome annotation 2013Genome annotation 2013
Genome annotation 2013
 
P bluescript
P bluescriptP bluescript
P bluescript
 
Vectors Used for Gene Cloning in Plants
Vectors Used for Gene Cloning in PlantsVectors Used for Gene Cloning in Plants
Vectors Used for Gene Cloning in Plants
 
Introduction to animal cell culture
Introduction to animal cell cultureIntroduction to animal cell culture
Introduction to animal cell culture
 
Genetic instability
Genetic instabilityGenetic instability
Genetic instability
 
Metabolic engineering
Metabolic engineeringMetabolic engineering
Metabolic engineering
 
Hematopoeitic stem cells
Hematopoeitic stem cellsHematopoeitic stem cells
Hematopoeitic stem cells
 
Gene silencing
Gene silencingGene silencing
Gene silencing
 
DNA microarray
DNA microarrayDNA microarray
DNA microarray
 
Screenable and Selectable Markers
Screenable and Selectable MarkersScreenable and Selectable Markers
Screenable and Selectable Markers
 
Marker free transgenic strategy
Marker free transgenic strategyMarker free transgenic strategy
Marker free transgenic strategy
 
cloning and expression system in yeast
cloning and expression system in yeastcloning and expression system in yeast
cloning and expression system in yeast
 
Bio 151 lec 10 cytokines
Bio 151 lec 10 cytokinesBio 151 lec 10 cytokines
Bio 151 lec 10 cytokines
 
Bioinformatic in drug designing
Bioinformatic in drug designingBioinformatic in drug designing
Bioinformatic in drug designing
 
Gene cloning
Gene cloningGene cloning
Gene cloning
 
Screening and selection of recombinants
Screening and selection of recombinants Screening and selection of recombinants
Screening and selection of recombinants
 

Similar to Characterizing Alzheimer’s Disease candidate genes and transcripts with targeted, long-read, single-molecule sequencing

Impact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGImpact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEG
Long Pei
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
Dayananda Salam
 
Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02
t7260678
 

Similar to Characterizing Alzheimer’s Disease candidate genes and transcripts with targeted, long-read, single-molecule sequencing (20)

NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
NGS Applications I (UEB-UAT Bioinformatics Course - Session 2.1.2 - VHIR, Bar...
 
ACMG Workshop 2011
ACMG Workshop 2011ACMG Workshop 2011
ACMG Workshop 2011
 
NAISTビッグデータシンポジウム - バイオ久保先生
NAISTビッグデータシンポジウム - バイオ久保先生NAISTビッグデータシンポジウム - バイオ久保先生
NAISTビッグデータシンポジウム - バイオ久保先生
 
Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030Genome in a bottle for amp GeT-RM 181030
Genome in a bottle for amp GeT-RM 181030
 
Bio305 genome analysis and annotation 2012
Bio305 genome analysis and annotation 2012Bio305 genome analysis and annotation 2012
Bio305 genome analysis and annotation 2012
 
whole-genome-sequencing-guide-small-genomes.pdf.pdf
whole-genome-sequencing-guide-small-genomes.pdf.pdfwhole-genome-sequencing-guide-small-genomes.pdf.pdf
whole-genome-sequencing-guide-small-genomes.pdf.pdf
 
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
Ernesto Picardi – Bioinformatica e genomica comparata: nuove strategie sperim...
 
Impact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEGImpact_of_gene_length_on_DEG
Impact_of_gene_length_on_DEG
 
Genome in a bottle for next gen dx v2 180821
Genome in a bottle for next gen dx v2 180821Genome in a bottle for next gen dx v2 180821
Genome in a bottle for next gen dx v2 180821
 
Festival of Genomics Jan 2018
Festival of Genomics Jan 2018Festival of Genomics Jan 2018
Festival of Genomics Jan 2018
 
Apac distributor training series 3 swift product for cancer study
Apac distributor training series 3  swift product for cancer studyApac distributor training series 3  swift product for cancer study
Apac distributor training series 3 swift product for cancer study
 
QIAseq Targeted DNA, RNA and Fusion Gene Panels
QIAseq Targeted DNA, RNA and Fusion Gene PanelsQIAseq Targeted DNA, RNA and Fusion Gene Panels
QIAseq Targeted DNA, RNA and Fusion Gene Panels
 
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
Towards Precision Medicine: Tute Genomics, a cloud-based application for anal...
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02Nextgenerationsequencing 120202015950-phpapp02
Nextgenerationsequencing 120202015950-phpapp02
 
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
GIAB Benchmarks for SVs and Repeats for stanford genetics sv 200511
 
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
Big Data at Golden Helix: Scaling to Meet the Demand of Clinical and Research...
 
600 base reads on the Ion S5™ Next-Generation Sequencing System enables accur...
600 base reads on the Ion S5™ Next-Generation Sequencing System enables accur...600 base reads on the Ion S5™ Next-Generation Sequencing System enables accur...
600 base reads on the Ion S5™ Next-Generation Sequencing System enables accur...
 
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Ses...
 
Genome sequencing. ppt.pptx
Genome sequencing. ppt.pptxGenome sequencing. ppt.pptx
Genome sequencing. ppt.pptx
 

More from Integrated DNA Technologies

The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...
Integrated DNA Technologies
 

More from Integrated DNA Technologies (20)

Overcoming the challenges of designing efficient and specific CRISPR gRNAs
Overcoming the challenges of designing efficient and specific CRISPR gRNAsOvercoming the challenges of designing efficient and specific CRISPR gRNAs
Overcoming the challenges of designing efficient and specific CRISPR gRNAs
 
Best practices for data analysis when using UMI adapters to improve variant d...
Best practices for data analysis when using UMI adapters to improve variant d...Best practices for data analysis when using UMI adapters to improve variant d...
Best practices for data analysis when using UMI adapters to improve variant d...
 
Increasing genome editing efficiency with optimized CRISPR-Cas enzymes
Increasing genome editing efficiency with optimized CRISPR-Cas enzymesIncreasing genome editing efficiency with optimized CRISPR-Cas enzymes
Increasing genome editing efficiency with optimized CRISPR-Cas enzymes
 
The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...The quest for high confidence mutations in plasma: searching for a needle in ...
The quest for high confidence mutations in plasma: searching for a needle in ...
 
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
SNP genotyping on qPCR platforms: Troubleshooting for amplification and clust...
 
Optimized methods to use Cas9 nickases in genome editing
Optimized methods to use Cas9 nickases in genome editingOptimized methods to use Cas9 nickases in genome editing
Optimized methods to use Cas9 nickases in genome editing
 
Dual index adapters with UMIs resolve index hopping and increase sensitivity ...
Dual index adapters with UMIs resolve index hopping and increase sensitivity ...Dual index adapters with UMIs resolve index hopping and increase sensitivity ...
Dual index adapters with UMIs resolve index hopping and increase sensitivity ...
 
Reducing off-target events in CRISPR genome editing applications with a novel...
Reducing off-target events in CRISPR genome editing applications with a novel...Reducing off-target events in CRISPR genome editing applications with a novel...
Reducing off-target events in CRISPR genome editing applications with a novel...
 
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotypingrhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
rhAmp™ SNP Genotyping: A novel approach for improving PCR-based SNP genotyping
 
Unique, dual-matched adapters mitigate index hopping between NGS samples
Unique, dual-matched adapters mitigate index hopping between NGS samplesUnique, dual-matched adapters mitigate index hopping between NGS samples
Unique, dual-matched adapters mitigate index hopping between NGS samples
 
Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...Analyzing the exome—focusing your NGS analysis with high performance target c...
Analyzing the exome—focusing your NGS analysis with high performance target c...
 
Getting started with CRISPR: a review of gene knockout and homology-directed ...
Getting started with CRISPR: a review of gene knockout and homology-directed ...Getting started with CRISPR: a review of gene knockout and homology-directed ...
Getting started with CRISPR: a review of gene knockout and homology-directed ...
 
Cpf1-based genome editing using ribonucleoprotein complexes
Cpf1-based genome editing using ribonucleoprotein complexesCpf1-based genome editing using ribonucleoprotein complexes
Cpf1-based genome editing using ribonucleoprotein complexes
 
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
Ribonucleoprotein delivery of CRISPR-Cas9 reagents for increased gene editing...
 
Accurate detection of low frequency genetic variants using novel, molecular t...
Accurate detection of low frequency genetic variants using novel, molecular t...Accurate detection of low frequency genetic variants using novel, molecular t...
Accurate detection of low frequency genetic variants using novel, molecular t...
 
Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...Target capture of DNA from FFPE samples— recommendations for generating robus...
Target capture of DNA from FFPE samples— recommendations for generating robus...
 
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDTHigh efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
High efficiency qPCR with PrimeTime® Gene Expression Master Mix from IDT
 
Tips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI toolsTips for effective use of BLAST and other NCBI tools
Tips for effective use of BLAST and other NCBI tools
 
Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...Gene synthesis technology and applications update—unleash your lab’s potentia...
Gene synthesis technology and applications update—unleash your lab’s potentia...
 
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
Alt-R™ CRISPR-Cas9 System: Ribonucleoprotein delivery optimization for improv...
 

Recently uploaded

Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
AlMamun560346
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
Sérgio Sacani
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
RizalinePalanog2
 
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET
 

Recently uploaded (20)

SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Clean In Place(CIP).pptx .
Clean In Place(CIP).pptx                 .Clean In Place(CIP).pptx                 .
Clean In Place(CIP).pptx .
 
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
Locating and isolating a gene, FISH, GISH, Chromosome walking and jumping, te...
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts ServiceJustdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
Justdial Call Girls In Indirapuram, Ghaziabad, 8800357707 Escorts Service
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Unit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 oUnit5-Cloud.pptx for lpu course cse121 o
Unit5-Cloud.pptx for lpu course cse121 o
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
dkNET Webinar "Texera: A Scalable Cloud Computing Platform for Sharing Data a...
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 

Characterizing Alzheimer’s Disease candidate genes and transcripts with targeted, long-read, single-molecule sequencing

  • 1. IDT and PacBio joint presentation: Characterizing Alzheimer’s Disease candidate genes and transcripts with targeted, long-read, single-molecule sequencing. Jenny Gu, PhD, Strategic Business Development Manager, PacBio.
  • 2. Characterizing Alzheimer’s Disease candidate genes and transcripts with targeted, long-read, single-molecule sequencing. September 27, 2017 / Jenny Gu, Ph.D. For Research Use Only. Not for use in diagnostics procedures. © Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved.
  • 3. AGENDA. SMRT Sequencing technology overview; recommended IDT capture workflow for SMRT Sequencing; case study: Alzheimer’s Disease panel.
  • 4. ALZHEIMER’S DISEASE (AD). Alzheimer’s disease is the most common form of neurodegenerative dementia. Clinical characterization: progressive loss of memory and deficits in thinking, problem solving, and language. Neuropathological characterization: progressive cortical atrophy due to neuronal loss, and characteristic intracellular and extracellular deposits of insoluble tau and amyloid β proteins. Figure callouts: 46.8M and 131.5M (people living with dementia worldwide in 2015 and projected for 2050, per the World Alzheimer Report 2015, https://www.alz.co.uk/research/WorldAlzheimerReport2015.pdf). Image source: http://www.reverseagingcentre.com/media/links/signs-of-alzheimers/
  • 5. ALZHEIMER’S DISEASE (AD): the complex genetic makeup of AD. Genetically divided into two groups: early-onset and late-onset. Relative risk for first-degree relatives is 3.5 – 7.5. 30 – 48% of AD patients have an affected first-degree relative. Late-onset AD: manifests after 65 years; multifactorial with strong genetic predisposition; GWAS have identified 20+ genetic risk loci with small odds ratios (1.1 – 2.0 per risk allele), including common functional variants as well as rare and structural variants. Early-onset AD: for 2 – 10% of patients, first symptoms occur in their 20s or 30s; four genes account for 5 – 10% of early-onset AD: APP, PSEN1, PSEN2, APOE.
  • 6. CANDIDATE DISEASE GENES IN ALZHEIMER’S DISEASE (AD): a several-decade-long search for risk genes in Alzheimer’s disease. Many associated genetic loci contain several genes. Which candidates are involved in disease risk remains unclear (20+ genes). Strategies for assessing GWAS candidate genes: DNA sequencing; transcriptome sequencing; proteome studies; methylome studies. Reference: Cuyvers E. et al. (2016) Genetic variations underlying Alzheimer's disease: evidence from genome-wide association studies and beyond. Lancet Neurol. 15(8), 857-68.
  • 7. SEQUEL SYSTEM: typical performance. Average read length: 10 – 18 kb. Consensus accuracy: achieves QV50. Throughput per cell: 5 – 8 Gb. SMRT Cells per run: 1 – 16. Movie lengths: 30 minutes – 10 hours.
  • 8. TYPICAL DATA: read lengths >20 kb. Data per SMRT Cell: 5 – 8 Gb. Half of data in reads >20 kb. Top 5% of reads >35 kb. Maximum read lengths >60 kb. [Histogram: read length (bp) vs. number of reads.] Read length data shown from 30 kb size-selected human library on the Sequel System (10-hour movie, 2.0 chemistry) with a total output of 7.6 Gb. Each Sequel System SMRT Cell 1M generates ~365,000 reads.
  • 9. BENEFITS OF LONG-READ SEQUENCING FOR CHARACTERIZING GENOMIC STRUCTURAL VARIATION. Structural variation (SV) is an important contributor to human diversity and disease. SV is also difficult to characterize. [Figure: example SV types and mechanisms, from Carvalho CM et al., Mechanisms underlying structural variant formation in genomic disorders. Nat Rev Genet. (2016).] Targeted SMRT Sequencing allows scientists to directly characterize, at high coverage for specific genes or regions of interest across multiple samples: complete genes (introns and exons); phased variants (allelic haplotypes); repetitive regions; regulatory regions (upstream/downstream); insertions and deletions; copy number variations.
  • 10. GENETIC VARIATION SEQUENCING WITH SMRT SEQUENCING. [Chart: variant types plotted against size of variant, on an axis running from 1 to 100 Mb. Labels shown: SNPs; small indels; STRs and VNTRs; repeat expansions; large insertions and deletions; mobile elements (L1, Alu, SVA); copy number variation; inversions/translocations; complex variants; medium to large SVs; large structural rearrangement; phasing of SVs and SNVs; phased alleles; haplotypes and haplotype reconstruction.] One PacBio read spans most variants. Assembled PacBio reads span euchromatic genome variation.
  • 11. ADDITIONALLY CHARACTERIZE TRANSCRIPTOME SPLICE VARIATION WITH LONG-READ SEQUENCING. Proteins and their functions are not only impacted by variants in exonic regions. Variants in regulatory regions (enhancers/promoters, including methylation) and intronic regions can also play an important role. High transcript isoform diversity arises from alternative splicing. Obtain full-length transcript sequences with Iso-Seq analysis. Figure source: National Human Genome Research Institute. Bioinformatics: Finding genes. (2013) http://www.genome.gov/25020001
  • 12. TRACE VARIANTS TO SPECIFIC ALLELES WITH PHASED HETEROZYGOUS SNPS
  • 13. CASE STUDY: VARIANT SCREENING IN ALZHEIMER’S DISEASE WITH LONG-READ SEQUENCING. Genomic and transcriptomic (cDNA) capture experiment. Combined data provide better insight into variant-affected gene expression. Gene panel (35 candidate genes) applied to two AD patients: average gDNA fragment size ~6 kb; full-length transcripts ranging from <1 kb to ~10 kb.
  • 14. PACBIO TARGETED PROBE-BASED CAPTURE WORKFLOW (GENOMIC DNA CAPTURE). [Workflow diagram with numbered steps 1 – 13; the step-to-label mapping was lost in extraction.] Experimental pipeline labels: genomic DNA; shear to 7 kb (6 kb for multiplex); ligate barcoded adapters; amplification; size selection (5-9 kb); probe hybridization, bead capture, wash; amplification and SMRTbell prep. plus size selection (5-9 kb); sequencing. Informatics pipeline labels: map reads of insert to reference; phasing with SAMtools; bin reads by haplotype; phased allelic consensus sequence; tertiary analysis.
  • 15. BEST PRACTICE SUMMARY: GENOMIC CAPTURE. Save on project costs by multiplexing and by spacing probes up to 1 kb apart. Multiplex up to 12 samples. Use PacBio linear barcoded adapters. High molecular weight DNA is required. Size selection is highly recommended to maximize long-read recovery. Aim for 100-fold coverage of the targeted panel size (full-length gene coverage).
  • 16. AD SAMPLES: SHEARED GDNA QC. 10 kb shear. Recommend starting with HMW gDNA (2 µg).
  • 17. SMRTBELL LIBRARY QC (SIZE-SELECTED). Final library, size selected.
  • 18. GRCH38 SUBREAD MAPPING RESULTS. Panels: skeletal muscle; brain. Values shown: 7.4 GB, 2.2 M reads; 8.4 GB, 2.5 M reads.
  • 19. PACBIO TARGETED PROBE-BASED CAPTURE WORKFLOW (TRANSCRIPTOME WITH SIZE SELECTION). [Workflow diagram with numbered steps 1 – 10; the step-to-label mapping was lost in extraction.] Experimental pipeline labels: mRNA; cDNA library + barcodes; amplification; size selection (optional, 5-9 kb); probe hybridization, bead capture, wash; amplification and SMRTbell prep.; sequencing. Informatics pipeline labels: Iso-Seq analysis; tertiary analysis.
  • 20. BEST PRACTICE SUMMARY: CDNA CAPTURE. Recover high-quality RNA transcripts. Size selection is optional, but helpful for specific fractions. Targeted-capture Iso-Seq analysis is recommended to characterize splice isoforms. Not recommended for characterizing gene expression levels. Aim for a minimum of 30-fold coverage per anticipated splice isoform in samples. Probes can be designed to exons only and/or to include introns.
  • 21. AD SAMPLES: MRNA QC. Samples: temporal lobe 1 RNA; temporal lobe 2 RNA. RIN values shown: 8.0 and 8.1. Recommend RIN > 6 (RNA Integrity Number).
  • 22. EXAMPLE WHOLE TRANSCRIPTOME SMRTBELL LIBRARY (CDNA)
  • 23. DESIGNING CUSTOM IDT XGEN® LOCKDOWN® CAPTURE PANEL. Key benefit of xGen® Lockdown® Probes is flexibility in design. Existing probe panels do not need to be redesigned. However, full-gene design is recommended: include introns and exons, plus extra upstream and downstream sequences. Probes can be spaced up to 1000 bp apart. Use the same probes for genomic and cDNA capture. [Figure: full-gene design, Gene A and Gene B.]
  • 24. SNPs AND LARGER SVs DISCOVERED IN AD SAMPLES. Study results: detected a broad range of genomic variants (SNPs and SVs), including 31 unique SVs ranging from 65 bp to several kb in size. 500+ isoforms found in each patient (Patient 1: 515 isoforms; Patient 2: 507 isoforms). 88% novel splice isoforms identified; only 39 isoforms are shared among both patients and those reported in Gencode v25. [Venn diagram region counts: 67, 2, 3, 39, 319, 154, 312.]
  • 25. RIN3 GENE: ~50 bp INSERTION DETECTED
  • 26. ZCWPW1 GENE: ~750 bp DELETION DETECTED IN BOTH PATIENTS. [Panels: Patient 1; Patient 2.]
  • 27. BACE1 GENE: PHASED ALLELES (34 KB). Heterozygous SNPs can be used to phase alleles across multi-kilobase regions. [Figure tracks: gene; probes; target; phased SNPs; phase 0; phase 1.]
  • 28. BIN1 GENE: PHASED ALLELES (63 KB). Heterozygous SNPs can be used to phase alleles across multi-kilobase regions. [Figure tracks: gene; probes; target; phased SNPs; phase 0; phase 1.]
  • 29. MAPT GENE RESULTS FOR PATIENT 1. Detected a heterozygous deletion. One allele is transcribed into 21 isoforms and the other into only 5. Detected a novel exon and transcript. Heterozygous genomic variants can be linked to corresponding expressed transcripts.
  • 30. ZCWPW1 GENE: RETAINED INTRONS AND NEW EXONS. [Panels: Patient 1; Patient 2. Annotations: retained intron; novel exon.]
  • 31. CONCLUSION. AD has a large economic impact on global society (2010: $604B). To date, more than 20 putative genetic risk variants have been mapped. Associated SNPs are usually not the true causative variant. Combining gDNA and cDNA data is more informative. Custom IDT xGen® Lockdown® Panels allow flexibility to scale projects. SMRT Sequencing provides multi-kilobase phased alleles and full-length transcripts. “Structural variants can be more informative for disease diagnostics, prognostics and translation than current SNP mapping and exon sequencing.” Roses A.D. et al. (2016) Structural variants can be more informative for disease diagnostics, prognostics and translation than current SNP mapping and exon sequencing. Expert Opin Drug Metab Toxicol. 12(2), 135-47. Image source: http://www.mvcenters.com/2015/02/11/dementia-takes-toll-claims-another-american-great-dean-smith/
  • 32. ACKNOWLEDGEMENT. Kevin Eng, Ting Hon, Elizabeth Tseng, Aaron Wenger, William Rowell, Jenny Ekholm, Steve Kujawa, Kristina Giorda, Jiashi Wang, Mirna Jarosz. Visit the PacBio Blog for new announcements and updates on Targeted Sequencing: http://www.pacb.com/blog and http://www.pacb.com/applications/targeted-sequencing/ Contact: Jenny Gu (jgu@pacb.com)
  • 33. For Research Use Only. Not for use in diagnostics procedures. © Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. xGen and Lockdown are trademarks of Integrated DNA Technologies, Inc. All other trademarks are the sole property of their respective owners. www.pacb.com
  • 36. SMRT LINK PROVIDES BASIC PROCESSING OF RAW DATA FOR TARGETED CAPTURE ENRICHMENT STUDIES. SMRT Analysis produces: filtered subreads; circular consensus sequences; alignment to reference (BAM files); Iso-Seq full-length transcripts.
  • 37. BIOINFORMATICS WORKFLOW FOR PHASING ALLELES. GitHub: targeted phasing consensus (genomic capture). [Workflow diagram with numbered steps 1 – 12; the step-to-label mapping was lost in extraction. Legend: data; SMRT Link; command line tools; third-party software.] Labels: raw data (subreads); CCS reads (SMRT Link); aligned BAM file (SMRT Link); visualize in IGV 3.0; probe *.bed; capture2target.py; defined phase blocks; samtools; phased alleles/region; cmdline: PacBio arrow; phased consensus sequences (*.fasta), >99.9% accuracy (dependent on coverage). Stage labels: subset and phase; polish.
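
Slides 14 and 37 describe binning reads by haplotype after phasing with SAMtools. As a rough illustration only, the sketch below assigns reads to phase 0 or phase 1 by majority vote over heterozygous SNPs whose phase is assumed to be known already. It is not the algorithm SAMtools uses, and the function name, data layout, positions, and read names are all hypothetical.

```python
from collections import Counter


def bin_reads_by_haplotype(reads, het_sites):
    """Toy haplotype binning (hypothetical helper, not a PacBio or SAMtools API).

    reads:     dict of read name -> {SNP position: observed base}
    het_sites: dict of SNP position -> (phase 0 base, phase 1 base)
    """
    bins = {"phase0": [], "phase1": [], "unassigned": []}
    for name, calls in reads.items():
        votes = Counter()
        for pos, base in calls.items():
            if pos not in het_sites:
                continue  # position is not a phased heterozygous SNP
            base0, base1 = het_sites[pos]
            if base == base0:
                votes["phase0"] += 1
            elif base == base1:
                votes["phase1"] += 1
            # any other base is treated as a sequencing error and ignored
        if votes["phase0"] > votes["phase1"]:
            bins["phase0"].append(name)
        elif votes["phase1"] > votes["phase0"]:
            bins["phase1"].append(name)
        else:
            bins["unassigned"].append(name)  # tie, or no informative SNPs
    return bins


# Made-up example: three phased SNPs and four reads.
het_sites = {100: ("A", "G"), 250: ("C", "T"), 900: ("G", "A")}
reads = {
    "read1": {100: "A", 250: "C"},
    "read2": {100: "G", 250: "T", 900: "A"},
    "read3": {250: "C", 900: "A"},   # one vote each way
    "read4": {100: "A", 900: "G"},
}
example_bins = bin_reads_by_haplotype(reads, het_sites)
print(example_bins)
```

Reads left in the `unassigned` bin would simply be excluded from either allele's consensus in this toy version; a real pipeline would also weigh base qualities and mapping quality.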
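
Slide 15 recommends 100-fold coverage of the targeted panel and multiplexing up to 12 samples, and slide 7 gives 5 – 8 Gb per SMRT Cell. A back-of-the-envelope sketch of how those numbers combine is shown below. The 2 Mb panel size and the on-target fractions are made-up inputs, not values from the presentation, and the calculation ignores uneven coverage and duplicate reads.

```python
import math


def required_on_target_bases(panel_size_bp, n_samples, fold_coverage=100):
    """Bases that must land on target to reach the desired fold coverage."""
    return panel_size_bp * fold_coverage * n_samples


def smrt_cells_needed(panel_size_bp, n_samples, on_target_fraction,
                      yield_per_cell_bp, fold_coverage=100):
    """Rough SMRT Cell count; on_target_fraction is an assumed capture efficiency."""
    needed = required_on_target_bases(panel_size_bp, n_samples, fold_coverage)
    usable_per_cell = yield_per_cell_bp * on_target_fraction
    return math.ceil(needed / usable_per_cell)


# Hypothetical 2 Mb panel, 12 multiplexed samples, low-end yield of 5 Gb per cell.
needed_bases = required_on_target_bases(2_000_000, 12)
cells_at_50_percent = smrt_cells_needed(2_000_000, 12, 0.5, 5_000_000_000)
cells_at_20_percent = smrt_cells_needed(2_000_000, 12, 0.2, 5_000_000_000)
print(needed_bases, cells_at_50_percent, cells_at_20_percent)
```

The result is sensitive to the assumed on-target fraction, which is why the sketch takes it as an explicit parameter instead of hard-coding a value.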
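
The seven Venn diagram counts on slide 24 can be checked against the totals stated on the same slide. The assignment of counts to regions below is an inference, not something the slide text states: it is the assignment that reproduces 515 isoforms for Patient 1, 507 for Patient 2, and 39 shared by both patients and Gencode v25. Under that assumption, the 88% figure appears to correspond to isoforms absent from Gencode as a share of all isoforms in the diagram.

```python
# Assumed mapping of slide 24's region counts to Venn regions (an inference).
regions = {
    ("P1",): 319,                  # Patient 1 only
    ("P2",): 312,                  # Patient 2 only
    ("P1", "P2"): 154,             # both patients, not in Gencode
    ("P1", "Gencode"): 3,          # Patient 1 and Gencode only
    ("P2", "Gencode"): 2,          # Patient 2 and Gencode only
    ("P1", "P2", "Gencode"): 39,   # shared by all three
    ("Gencode",): 67,              # Gencode only
}


def total_for(label):
    """Sum every region that includes the given set label."""
    return sum(count for sets, count in regions.items() if label in sets)


patient1_total = total_for("P1")
patient2_total = total_for("P2")
novel = sum(count for sets, count in regions.items() if "Gencode" not in sets)
all_isoforms = sum(regions.values())
novel_percent = round(100 * novel / all_isoforms)

print(patient1_total, patient2_total, novel, all_isoforms, novel_percent)
```

If the denominator were restricted to isoforms seen in the patients (829 under this mapping), the novel share would be higher, so the choice of denominator is worth confirming against the original figure.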
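
Slide 8's statement that half of the data is in reads longer than 20 kb is a statement about the read-length N50. A minimal sketch of that statistic follows; the read lengths in the example are invented for illustration and are not taken from the run shown on the slide.

```python
def read_n50(lengths):
    """Return the length L such that reads of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Invented read lengths in bp (100,000 bases in total).
example_lengths = [5_000, 8_000, 12_000, 20_000, 25_000, 30_000]
example_n50 = read_n50(example_lengths)
print(example_n50)
```

In this toy data set the two longest reads already carry more than half of the bases, so the N50 is the length of the second-longest read.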