Sample & Assay Technologies

Next Generation Sequencing:
An introduction to applications and technologies
Quan Peng, Ph.D.
Scientist, R&D
Quan.Peng@QIAGEN.com
Sample & Assay Technologies

Welcome to the three-part webinar series
Next Generation Sequencing and its role in cancer biology

Webinar 1: Next-generation sequencing, an introduction to technology and
applications
Date:
March April 4, 2013
Speaker:
Quan Peng, Ph.D.

Webinar 2:
Date:
Speaker:

Next-generation sequencing for cancer research
April 11, 2013
Vikram Devgan, Ph.D., MBA

Webinar 3:
Date:
Speaker:

Next-generation sequencing data analysis for genetic profiling
April 18, 2013
Ravi Vijaya Satya, Ph.D.

Title, Location, Date

2
Sample & Assay Technologies

Agenda

Next Generation Sequencing
Background
Technologies
Applications
Workflow

Targeted Enrichment
Methodology
Data analysis
New product released!

3
Sample & Assay Technologies

DNA Sequencing – The Past Decade

Output (Kb)

10E+8

10E+6

10E+4

10E+2

Adapted from ER Mardis. Nature 470, 198-203 (2011) doi:10.1038/nature09796
Title, Location, Date

4
Sample & Assay Technologies

Rapid Decrease in Cost

Title, Location, Date

5
Sample & Assay Technologies

What is Next-Generation Sequencing?

Sanger Sequencing

DNA is fragmented

NGS: Massive Parallel Sequencing

DNA is fragmented

.

Adaptors ligated to fragments
(Library construction)
Cloned to a plasmid vector

Cyclic sequencing reaction

Clonal amplification of
fragments on a solid surface
(Bridge PCR or Emulsion
PCR)

Direct step-by-step detection of
each nucleotide base
incorporated
during the sequencing reaction
.

Separation by electrophoresis
Readout with fluorescent tags

Title, Location, Date

6
Sample & Assay Technologies

Bridge PCR

DNA fragments are flanked with adaptors (Library)
A flat surface (chip) coated with two types of primers, corresponding to the adaptors
Amplification proceeds in cycles, with one end of each bridge tethered to the surface
Clusters of DNA molecules are generated on the chip. Each cluster is originated from
a single DNA fragment
Used by Illumina
Title, Location, Date

7
Sample & Assay Technologies

Illumina HiSeq/MiSeq

Run time 1- 10 days
Produces 2 - 600 Gb of sequence
Read length 2X100 bp – 2X250bp
(pair end)
Cost: $0.05 - $0.4/Mb

Title, Location, Date

8
Sample & Assay Technologies

Single-end
reading

Single-end vs. paired-end reading

2nd strand
synthesis

Pair-end
reading

Single-end reading (SE):
Sequencer reads a fragment from only one end to the other
Pair-end reading (PE):
Sequencer reads both ends of the same fragment
More sequencing information, reads can be more accurately placed (“mapped”)
May not be required for all experiments, more expensive and time-consuming

Title, Location, Date

9
Sample & Assay Technologies

Emulsion PCR

Fragments, with adaptors, are PCR amplified within a water drop in oil
One primer is attached to the surface of a bead
DNA molecules are synthesized on the beads. Each bead bears DNA originated from a
single DNA fragment
Beads with DNA are then deposit into the wells of sequencing chips, one well one bead
Used by Roche 454, IonTorrent and SOLiD

Title, Location, Date

10
Sample & Assay Technologies

Ion PGM/ Proton

Run time 3 hrs
Read length 100‐300 bp; homopolymer can be an issue
Throughput determined by chip size (pH meter array): 10Mb – 5 Gb
Cost: $1 - $20/Mb
Title, Location, Date

11
Sample & Assay Technologies

Multiplex Sequencing – Barcoding Samples

Depending on the application, we may not need to generate so many reads per sample
Multiple samples with different index can be combined and put into one sequencing run
or into one sequencing lane
Save money on sequencing costs (pay per sample)

Title, Location, Date

12
Sample & Assay Technologies

NGS Applications

Next Generation Sequencing
Genomics

Transcriptomics

Title, Location, Date

Epigenomics

Metagenomics

13
Sample & Assay Technologies

NGS Applications

Next Generation Sequencing
Genomics

Transcriptomics

Epigenomics

Metagenomics

DNA-Seq

Mutation,
SNVs,
Indels,
CNVs,
Translocation

Title, Location, Date

14
Sample & Assay Technologies

NGS Applications

Next Generation Sequencing
Genomics

Transcriptomics

DNA-Seq

RNA-Seq

Mutation,
SNVs,
Indels,
CNVs,
Translocation

Expression level,
Novel transcripts,
Fusion transcript,
Splice variants

Title, Location, Date

Epigenomics

Metagenomics

15
Sample & Assay Technologies

NGS Applications

Next Generation Sequencing
Genomics

Transcriptomics

Epigenomics

DNA-Seq

RNA-Seq

ChIP-Seq,
Methyl-Seq

Mutation,
SNVs,
Indels,
CNVs,
Translocation

Expression level,
Novel transcripts,
Fusion transcript,
Splice variants

Global mapping
of DNA-protein
interactions,
DNA methylation,
histone modification

Title, Location, Date

Metagenomics

16
Sample & Assay Technologies

NGS Applications

Next Generation Sequencing
Genomics

Transcriptomics

Epigenomics

Metagenomics

DNA-Seq

RNA-Seq

ChIP-Seq,
Methyl-Seq

MicrobialSeq

Mutation,
SNVs,
Indels,
CNVs,
Translocation

Expression level,
Novel transcripts,
Fusion transcript,
Splice variants

Global mapping
of DNA-protein
interactions,
DNA methylation,
histone modification

Microbial genome
Sequence,
Microbial ID,
Microbiome
Sequencing,

Title, Location, Date

17
Sample & Assay Technologies

Next Generation Sequencing Workflow

Sample
preparation

• Isolate samples (DNA/RNA)
• Qualify and quantify samples
• Several hours to days

Library
construction

• Prepare platform specific library
• Qualify and quantify library
• 4-8 hours

Sequencing

• Perform sequencing run reaction on NGS platform
• 8 hours to several days

Data
analysis

• Application specific data analysis pipeline
• Several hours to days

Title, Location, Date

18
Sample & Assay Technologies

QIAGEN’s Solution for NGS Workflow

Sample
preparation

Target Enrichment kit
HMW DNA prep kit
Single Cell/WGA kit

Library
construction

rRNA depletion kit
ChIP-seq Kit
Pathogen bacteria prep kit

Library construction kit
MinElute size selection kit
Library quantification kit

Sequencing

Data
analysis

Result
validation

GeneRead DNAseq data
analysis web portal

RT2 Profiler PCR Arrays
Somatic Mutation PCR Arrays
Pyrosequencing

Title, Location, Date

CNA/CNV PCR Arrays
EpiTect ChIP PCR Arrays
SNP PCR Arrays

19
Sample & Assay Technologies

GeneRead DNAseq Gene Panel: Targeted Sequencing

What is targeted sequencing?
Sequencing a sub set of region in the whole-genome

Why do we need targeted sequencing?
Not all regions in the genome are of interest or relevant to specific study
Exome Sequencing: sequencing most of the coding regions of the genome (exome).
Protein-coding regions constitute less than 2% of the entire genome
Focused panel/hot spot sequencing: focused on the genes or regions of interest

What are the advantages of focused panel sequencing?
More coverage per sample, more sensitive mutation detection
More samples per run, lower cost per sample

Title, Location, Date

20
Sample & Assay Technologies

Target Enrichment - Methodology

Hybridization capture
Large DNA input (1 ug)
Long processing time
(2-3 days)
Large throughput
(MB region to whole exome)

Sample
preparation
(DNA
isolation)

Library
construction

Title, Location, Date

Hybridization
capture
(24-72 hrs)

Sequencing

Data analysis

21
Sample & Assay Technologies

Target Enrichment - Methodology

Multiplex PCR
Small DNA input (< 100ng)
Short processing time
(several hrs)
Relatively small throughput
(KB - MB region)

Sample
preparation
(DNA
isolation)

PCR target
enrichment
(2 hours)

Title, Location, Date

Library
construction

Sequencing

Data analysis

22
Sample & Assay Technologies

GeneRead DNAseq Gene Panel

Multiplex PCR technology based targeted enrichment for DNA sequencing
Cover all human exons (coding region + UTR)
Division of gene primers sets into 4 tubes; up to 1200 plex in each tube

23
Sample & Assay Technologies

GeneRead DNAseq Gene Panel
Focus on your Disease of Interest
Comprehensive Cancer Panel (124 genes)

Disease Focused Gene Panels (20 genes)
Breast cancer
Colon Cancer
Gastric cancer
Leukemia
Liver cancer
Genes Involved in Disease

Lung Cancer
Ovarian Cancer
Prostate Cancer

Genes with High Relevance
24
Sample & Assay Technologies

GeneRead DNAseq Custom Panel

25
Sample & Assay Technologies

NGS Data Analysis

Base calling
From raw data to DNA sequences, generate sequencing reads

Mapping to a reference
Align the reads to reference sequences
Can be considered as “blast“ millions of sequences against reference database

Variants identification
Identify the differences between sample DNA and reference DNA

Variant prioritization/filtering/validation

Title, Location, Date

26
Sample & Assay Technologies

NGS Data Analysis

Reference sequence
A

alignment
Sequencing reads
C
C

Title, Location, Date

27
Sample & Assay Technologies

NGS Data Analysis: Sequencing Depth

Coverage depth (or depth of coverage): how many times each base has been sequenced
or read
Unlike Sanger sequencing, in which each sample is sequenced 1-3 times to be confident of
its nucleotide identity, NGS generally needs to cover each position many times to make a
confident base call, due to relative high error rate (0.1 - 1% vs 0.001 – 0.01%)
Increasing coverage depth is also helpful to identify low frequent mutation in heterogenous
samples such as cancer sample

Reference
sequence

NGS
reads

coverage depth = 4
Title, Location, Date

coverage depth = 3

coverage depth = 2
28
Sample & Assay Technologies

NGS Data Analysis: Specificity

Specificity: the percentage of sequences that map to the intended targets
region of interest
number of on-target reads / total number of reads

Reference
sequence

ROI 1

ROI 2

NGS
reads

Off-target reads

On-target reads
Title, Location, Date

On-target
reads
29
Sample & Assay Technologies

NGS Data Analysis: Uniformity

Coverage uniformity: measure the evenness of the coverage depth of target
position
Calculate coverage depth of each position
Calculate the median coverage depth
Set the lower boundary of the coverage depth related to median depth (eg. 0.1 X
median coverage depth)
Calculate the percentage of target region covered by equal or more than the lower
boundary
Reference
sequence

NGS
reads

coverage depth = 10

coverage depth = 3
Title, Location, Date

coverage depth = 2
30
Sample & Assay Technologies

QIAGEN’s Solution

FREE Complete & Easy to use Data Analysis with Web-based Software

31
Sample & Assay Technologies

Summary

Run Summary
Specificity
Coverage
Uniformity
Numbers of SNPs and Indels

Summary By Gene
Specificity
Coverage
Uniformity
# of SNPs and Indels

32
Sample & Assay Technologies

Features of Variant Report

SNP detection
Indel detection

33
Sample & Assay Technologies

QIAGEN’s GeneRead DNAseq Gene Panel System
FOCUS ON YOUR RELEVANT GENES
Focused:
Biologically relevant content
selection enables deep sequencing
on relevant genes and identification
of rare mutations
Flexible:
Mix and match any gene of interest
NGS platform independent:
Functionally validated for PGM,
MiSeq/HiSeq
Integrated controls:
Enabling quality control of prepared
library before sequencing
Free, complete and easy of use data
analysis tool
Sample & Assay Technologies

Upcoming webinars
Next Generation Sequencing and its role in cancer biology

Webinar 2: Next-generation sequencing for cancer research
Date:
April 11, 2013
Speaker:
Vikram Devgan, Ph.D., MBA
Register here: https://www2.gotomeeting.com/register/126404050

Webinar 3: Next-generation sequencing data analysis for genetic profiling
Date:
April 18, 2013
Speaker:
Ravi Vijaya Satya, Ph.D.
Register here: ps://www2.gotomeeting.com/register/966970098

Title, Location, Date

35

Ngs part i 2013

  • 1.
    Sample & AssayTechnologies Next Generation Sequencing: An introduction to applications and technologies Quan Peng, Ph.D. Scientist, R&D Quan.Peng@QIAGEN.com
  • 2.
    Sample & AssayTechnologies Welcome to the three-part webinar series Next Generation Sequencing and its role in cancer biology Webinar 1: Next-generation sequencing, an introduction to technology and applications Date: March April 4, 2013 Speaker: Quan Peng, Ph.D. Webinar 2: Date: Speaker: Next-generation sequencing for cancer research April 11, 2013 Vikram Devgan, Ph.D., MBA Webinar 3: Date: Speaker: Next-generation sequencing data analysis for genetic profiling April 18, 2013 Ravi Vijaya Satya, Ph.D. Title, Location, Date 2
  • 3.
    Sample & AssayTechnologies Agenda Next Generation Sequencing Background Technologies Applications Workflow Targeted Enrichment Methodology Data analysis New product released! 3
  • 4.
    Sample & AssayTechnologies DNA Sequencing – The Past Decade Output (Kb) 10E+8 10E+6 10E+4 10E+2 Adapted from ER Mardis. Nature 470, 198-203 (2011) doi:10.1038/nature09796 Title, Location, Date 4
  • 5.
    Sample & AssayTechnologies Rapid Decrease in Cost Title, Location, Date 5
  • 6.
    Sample & AssayTechnologies What is Next-Generation Sequencing? Sanger Sequencing DNA is fragmented NGS: Massive Parallel Sequencing DNA is fragmented . Adaptors ligated to fragments (Library construction) Cloned to a plasmid vector Cyclic sequencing reaction Clonal amplification of fragments on a solid surface (Bridge PCR or Emulsion PCR) Direct step-by-step detection of each nucleotide base incorporated during the sequencing reaction . Separation by electrophoresis Readout with fluorescent tags Title, Location, Date 6
  • 7.
    Sample & AssayTechnologies Bridge PCR DNA fragments are flanked with adaptors (Library) A flat surface (chip) coated with two types of primers, corresponding to the adaptors Amplification proceeds in cycles, with one end of each bridge tethered to the surface Clusters of DNA molecules are generated on the chip. Each cluster is originated from a single DNA fragment Used by Illumina Title, Location, Date 7
  • 8.
    Sample & AssayTechnologies Illumina HiSeq/MiSeq Run time 1- 10 days Produces 2 - 600 Gb of sequence Read length 2X100 bp – 2X250bp (pair end) Cost: $0.05 - $0.4/Mb Title, Location, Date 8
  • 9.
    Sample & AssayTechnologies Single-end reading Single-end vs. paired-end reading 2nd strand synthesis Pair-end reading Single-end reading (SE): Sequencer reads a fragment from only one end to the other Pair-end reading (PE): Sequencer reads both ends of the same fragment More sequencing information, reads can be more accurately placed (“mapped”) May not be required for all experiments, more expensive and time-consuming Title, Location, Date 9
  • 10.
    Sample & AssayTechnologies Emulsion PCR Fragments, with adaptors, are PCR amplified within a water drop in oil One primer is attached to the surface of a bead DNA molecules are synthesized on the beads. Each bead bears DNA originated from a single DNA fragment Beads with DNA are then deposit into the wells of sequencing chips, one well one bead Used by Roche 454, IonTorrent and SOLiD Title, Location, Date 10
  • 11.
    Sample & AssayTechnologies Ion PGM/ Proton Run time 3 hrs Read length 100‐300 bp; homopolymer can be an issue Throughput determined by chip size (pH meter array): 10Mb – 5 Gb Cost: $1 - $20/Mb Title, Location, Date 11
  • 12.
    Sample & AssayTechnologies Multiplex Sequencing – Barcoding Samples Depending on the application, we may not need to generate so many reads per sample Multiple samples with different index can be combined and put into one sequencing run or into one sequencing lane Save money on sequencing costs (pay per sample) Title, Location, Date 12
  • 13.
    Sample & AssayTechnologies NGS Applications Next Generation Sequencing Genomics Transcriptomics Title, Location, Date Epigenomics Metagenomics 13
  • 14.
    Sample & AssayTechnologies NGS Applications Next Generation Sequencing Genomics Transcriptomics Epigenomics Metagenomics DNA-Seq Mutation, SNVs, Indels, CNVs, Translocation Title, Location, Date 14
  • 15.
    Sample & AssayTechnologies NGS Applications Next Generation Sequencing Genomics Transcriptomics DNA-Seq RNA-Seq Mutation, SNVs, Indels, CNVs, Translocation Expression level, Novel transcripts, Fusion transcript, Splice variants Title, Location, Date Epigenomics Metagenomics 15
  • 16.
    Sample & AssayTechnologies NGS Applications Next Generation Sequencing Genomics Transcriptomics Epigenomics DNA-Seq RNA-Seq ChIP-Seq, Methyl-Seq Mutation, SNVs, Indels, CNVs, Translocation Expression level, Novel transcripts, Fusion transcript, Splice variants Global mapping of DNA-protein interactions, DNA methylation, histone modification Title, Location, Date Metagenomics 16
  • 17.
    Sample & AssayTechnologies NGS Applications Next Generation Sequencing Genomics Transcriptomics Epigenomics Metagenomics DNA-Seq RNA-Seq ChIP-Seq, Methyl-Seq MicrobialSeq Mutation, SNVs, Indels, CNVs, Translocation Expression level, Novel transcripts, Fusion transcript, Splice variants Global mapping of DNA-protein interactions, DNA methylation, histone modification Microbial genome Sequence, Microbial ID, Microbiome Sequencing, Title, Location, Date 17
  • 18.
    Sample & AssayTechnologies Next Generation Sequencing Workflow Sample preparation • Isolate samples (DNA/RNA) • Qualify and quantify samples • Several hours to days Library construction • Prepare platform specific library • Qualify and quantify library • 4-8 hours Sequencing • Perform sequencing run reaction on NGS platform • 8 hours to several days Data analysis • Application specific data analysis pipeline • Several hours to days Title, Location, Date 18
  • 19.
    Sample & AssayTechnologies QIAGEN’s Solution for NGS Workflow Sample preparation Target Enrichment kit HMW DNA prep kit Single Cell/WGA kit Library construction rRNA depletion kit ChIP-seq Kit Pathogen bacteria prep kit Library construction kit MinElute size selection kit Library quantification kit Sequencing Data analysis Result validation GeneRead DNAseq data analysis web portal RT2 Profiler PCR Arrays Somatic Mutation PCR Arrays Pyrosequencing Title, Location, Date CNA/CNV PCR Arrays EpiTect ChIP PCR Arrays SNP PCR Arrays 19
  • 20.
    Sample & AssayTechnologies GeneRead DNAseq Gene Panel: Targeted Sequencing What is targeted sequencing? Sequencing a sub set of region in the whole-genome Why do we need targeted sequencing? Not all regions in the genome are of interest or relevant to specific study Exome Sequencing: sequencing most of the coding regions of the genome (exome). Protein-coding regions constitute less than 2% of the entire genome Focused panel/hot spot sequencing: focused on the genes or regions of interest What are the advantages of focused panel sequencing? More coverage per sample, more sensitive mutation detection More samples per run, lower cost per sample Title, Location, Date 20
  • 21.
    Sample & AssayTechnologies Target Enrichment - Methodology Hybridization capture Large DNA input (1 ug) Long processing time (2-3 days) Large throughput (MB region to whole exome) Sample preparation (DNA isolation) Library construction Title, Location, Date Hybridization capture (24-72 hrs) Sequencing Data analysis 21
  • 22.
    Sample & AssayTechnologies Target Enrichment - Methodology Multiplex PCR Small DNA input (< 100ng) Short processing time (several hrs) Relatively small throughput (KB - MB region) Sample preparation (DNA isolation) PCR target enrichment (2 hours) Title, Location, Date Library construction Sequencing Data analysis 22
  • 23.
    Sample & AssayTechnologies GeneRead DNAseq Gene Panel Multiplex PCR technology based targeted enrichment for DNA sequencing Cover all human exons (coding region + UTR) Division of gene primers sets into 4 tubes; up to 1200 plex in each tube 23
  • 24.
    Sample & AssayTechnologies GeneRead DNAseq Gene Panel Focus on your Disease of Interest Comprehensive Cancer Panel (124 genes) Disease Focused Gene Panels (20 genes) Breast cancer Colon Cancer Gastric cancer Leukemia Liver cancer Genes Involved in Disease Lung Cancer Ovarian Cancer Prostate Cancer Genes with High Relevance 24
  • 25.
    Sample & AssayTechnologies GeneRead DNAseq Custom Panel 25
  • 26.
    Sample & AssayTechnologies NGS Data Analysis Base calling From raw data to DNA sequences, generate sequencing reads Mapping to a reference Align the reads to reference sequences Can be considered as “blast“ millions of sequences against reference database Variants identification Identify the differences between sample DNA and reference DNA Variant prioritization/filtering/validation Title, Location, Date 26
  • 27.
    Sample & AssayTechnologies NGS Data Analysis Reference sequence A alignment Sequencing reads C C Title, Location, Date 27
  • 28.
    Sample & AssayTechnologies NGS Data Analysis: Sequencing Depth Coverage depth (or depth of coverage): how many times each base has been sequenced or read Unlike Sanger sequencing, in which each sample is sequenced 1-3 times to be confident of its nucleotide identity, NGS generally needs to cover each position many times to make a confident base call, due to relative high error rate (0.1 - 1% vs 0.001 – 0.01%) Increasing coverage depth is also helpful to identify low frequent mutation in heterogenous samples such as cancer sample Reference sequence NGS reads coverage depth = 4 Title, Location, Date coverage depth = 3 coverage depth = 2 28
  • 29.
    Sample & AssayTechnologies NGS Data Analysis: Specificity Specificity: the percentage of sequences that map to the intended targets region of interest number of on-target reads / total number of reads Reference sequence ROI 1 ROI 2 NGS reads Off-target reads On-target reads Title, Location, Date On-target reads 29
  • 30.
    Sample & AssayTechnologies NGS Data Analysis: Uniformity Coverage uniformity: measure the evenness of the coverage depth of target position Calculate coverage depth of each position Calculate the median coverage depth Set the lower boundary of the coverage depth related to median depth (eg. 0.1 X median coverage depth) Calculate the percentage of target region covered by equal or more than the lower boundary Reference sequence NGS reads coverage depth = 10 coverage depth = 3 Title, Location, Date coverage depth = 2 30
  • 31.
    Sample & AssayTechnologies QIAGEN’s Solution FREE Complete & Easy to use Data Analysis with Web-based Software 31
  • 32.
    Sample & AssayTechnologies Summary Run Summary Specificity Coverage Uniformity Numbers of SNPs and Indels Summary By Gene Specificity Coverage Uniformity # of SNPs and Indels 32
  • 33.
    Sample & AssayTechnologies Features of Variant Report SNP detection Indel detection 33
  • 34.
    Sample & AssayTechnologies QIAGEN’s GeneRead DNAseq Gene Panel System FOCUS ON YOUR RELEVANT GENES Focused: Biologically relevant content selection enables deep sequencing on relevant genes and identification of rare mutations Flexible: Mix and match any gene of interest NGS platform independent: Functionally validated for PGM, MiSeq/HiSeq Integrated controls: Enabling quality control of prepared library before sequencing Free, complete and easy of use data analysis tool
  • 35.
    Sample & AssayTechnologies Upcoming webinars Next Generation Sequencing and its role in cancer biology Webinar 2: Next-generation sequencing for cancer research Date: April 11, 2013 Speaker: Vikram Devgan, Ph.D., MBA Register here: https://www2.gotomeeting.com/register/126404050 Webinar 3: Next-generation sequencing data analysis for genetic profiling Date: April 18, 2013 Speaker: Ravi Vijaya Satya, Ph.D. Register here: ps://www2.gotomeeting.com/register/966970098 Title, Location, Date 35