Bioinformatics workshop Sept 2014
1. Wet-lab Considerations for Illumina data analysis
Based on a presentation by Henriette O’Geen
Lutz Froenicke
DNA Technologies and Expression Analysis Cores
UCD Genome Center
2. Genome Resequencing
RNA-seq
Gene Expression
Metagenomics
Exome Sequencing
ChIP-SEQ
Small RNA
SNPs, Indels
Genotyping
De novo genome Sequencing
DNA Methylation
Splice Isoform Abundance
3D Organization
CNVs
Rearrangements
4. Illumina Sequencing Technology
Sequencing By Synthesis (SBS) Technology
Workflow: DNA (0.1-1.0 ug), library preparation, cluster generation (single molecule array), sequencing
[Figure: 5’ and 3’ template strands with per-cycle base calls (A, C, G, T)]
5. TruSeq Chemistry: Flow Cell
Simplified workflow
• Clusters in a contained environment (no need for clean rooms)
• Sequencing performed in the flow cell on the clusters
Surface of flow cell coated with a lawn of oligo pairs
8 channels
8. Examples of DNA input requirements
Illumina library prep kit        Starting material
TruSeq DNA                       > 100 ng
KAPA DNA                         > 10 ng
NEB Ultra Low                    > 5 ng
TruSeq ChIP/MeDIP                10-50 ng
Rubicon ThruPLEX                 50 pg – 50 ng
Nextera Kit *                    50 ng
Nextera Kit * for Single Cell    0.125 - 0.375 ng
* Unique protocol using “tagmentation”: DNA is simultaneously fragmented and tagged with sequencing adapters
9. DNA library construction
Fragmented DNA
End Repair: Blunt End Fragments (5’ phosphate, 3’ OH)
“A” Tailing: Single Overhang Fragments (3’ A)
Adapter Ligation (adapters with T overhang): DNA Fragments with Adapter Ends
10. “If you can put adapters on it,
we can sequence it!”
12. DNA fragmentation
Mechanical shearing:
• NGS BioRuptor
• Covaris
Enzymatic:
• Fragmentase, Transposase
Chemical
All methods are sensitive to
• Purity of DNA
• DNA concentration
13. Size selection and clean-up using SPRI Beads
SPRI = Solid Phase Reversible Immobilization
Ratio of SPRI beads/PEG solution to sample determines size cut off
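The ratio rule above is simple arithmetic: the bead volume to add is the chosen ratio times the sample volume. The sketch below shows only that calculation. The function name and the example ratios are illustrative assumptions, not values from the slides, and the size cut-off obtained at a given ratio depends on the bead product and should be taken from its protocol.

```python
def spri_bead_volume(sample_ul: float, ratio: float) -> float:
    """Return the bead/PEG volume (in ul) for a given bead-to-sample ratio.

    Hypothetical helper: the ratio is beads : sample (e.g. 0.8 means 0.8x).
    """
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_ul * ratio


if __name__ == "__main__":
    sample_ul = 50.0
    # Example ratios only; lower ratios retain only larger fragments.
    for ratio in (0.6, 0.8, 1.8):
        volume = spri_bead_volume(sample_ul, ratio)
        print(f"{ratio:.1f}x on {sample_ul:.0f} ul sample: add {volume:.1f} ul beads")
```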
14. Optional: PCR-free libraries
PCR-free library:
– Library can be sequenced if concentration allows
– Reduction of PCR bias against e.g. GC-rich or AT-rich regions, especially for metagenomic samples
OR
Library enrichment by PCR:
– Ideal combination: high input and low cycle number
17. “Regular” adaptors
[Figure: A-tailed fragmented DNA + T-overhang adaptors; forward read and reverse read]
Advantages:
Simple
Obtain 1-2 reads (F and R)
Problems:
No multiplexing
18. “In-line barcode” adaptors
[Figure: A-tailed fragmented DNA + in-line barcoded adaptors; forward read and reverse read]
Advantages:
Can multiplex
Simple
Obtain 1-2 reads (F and R)
Problems:
Cluster detection on the HiSeq
Lose sequence data in the barcodes
19. “TruSeq-style” indexed adaptors
[Figure: A-tailed fragmented DNA + indexed adaptors; forward read, index read, reverse read]
Advantages:
Index independent of read
-> more data
-> no more clustering problems
Problems:
Need more reagents
Index only on one side
20. “Dual indexed” adaptors
[Figure: A-tailed fragmented DNA + dual-indexed adaptors; forward read, index read 1, index read 2, reverse read]
Advantages:
Cheaper
Indexing information on both sides
Problems:
TBA…
For 96 reactions
Simple index: 96 B adaptors, 1 A adaptor
Dual index: 12 A adaptors, 8 B adaptors
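The adaptor counts for 96 reactions follow from combinatorics: 12 A adaptors combined with 8 B adaptors give 12 x 8 = 96 distinct pairs, so 20 adaptors replace the 97 needed for simple indexing. A minimal sketch of that count follows. The adaptor names (A01, B01, ...) are hypothetical placeholders, since the slides give only the numbers.

```python
from itertools import product

# Hypothetical adaptor names; the slide states only the counts (12 A, 8 B).
A_ADAPTORS = [f"A{i:02d}" for i in range(1, 13)]
B_ADAPTORS = [f"B{i:02d}" for i in range(1, 9)]


def dual_index_pairs(a_set, b_set):
    """Return every (A, B) adaptor combination, one per reaction."""
    return list(product(a_set, b_set))


pairs = dual_index_pairs(A_ADAPTORS, B_ADAPTORS)

# Simple index: one distinct B adaptor per reaction plus a shared A adaptor.
simple_index_adaptors = len(pairs) + 1
# Dual index: only the two adaptor sets themselves are needed.
dual_index_adaptors = len(A_ADAPTORS) + len(B_ADAPTORS)

if __name__ == "__main__":
    print(f"reactions covered: {len(pairs)}")
    print(f"adaptors needed, simple index: {simple_index_adaptors}")
    print(f"adaptors needed, dual index: {dual_index_adaptors}")
```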
21. Quantitation & QC methods
Intercalating dye methods (PicoGreen, Qubit, etc.):
– Specific to dsDNA, accurate at low levels of DNA
– Great for pooling of indexed libraries to be sequenced in one lane
– Requires standard curve generation, many accurate pipetting steps
Bioanalyzer:
– Quantitation is good for rough estimate
– Invaluable for library QC
– High-sensitivity DNA chip allows quantitation of low DNA levels
qPCR:
– Most accurate quantitation method
– More labor-intensive
– Must be compared to a control
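Pooling indexed libraries for one lane is usually done on a molar basis, so a mass concentration from a dye assay has to be combined with the mean fragment size from the Bioanalyzer. The sketch below shows the conversion under the common assumption of an average mass of 660 g/mol per base pair of dsDNA. The function names and the example libraries are hypothetical.

```python
def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float,
                        bp_mass: float = 660.0) -> float:
    """Convert ng/ul of dsDNA to nM, assuming ~660 g/mol per base pair."""
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul * 1e6 / (bp_mass * mean_size_bp)


def equimolar_pool_volumes(molarities_nm: dict, fmol_each: float) -> dict:
    """Volume (ul) of each library that contributes `fmol_each` femtomoles.

    1 nM equals 1 fmol/ul, so volume = fmol / nM.
    """
    return {name: fmol_each / nm for name, nm in molarities_nm.items()}


if __name__ == "__main__":
    # Hypothetical libraries: (ng/ul from a dye assay, mean size in bp).
    libraries = {"lib1": (6.6, 500), "lib2": (3.3, 400)}
    molarities = {name: library_molarity_nm(c, s) for name, (c, s) in libraries.items()}
    for name, vol in equimolar_pool_volumes(molarities, fmol_each=50.0).items():
        print(f"{name}: {molarities[name]:.2f} nM, pool {vol:.2f} ul")
```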
22. Library QC by Bioanalyzer
Predominant species of appropriate MW
Minimal primer dimer or adapter dimers
Minimal higher MW material
24. Library QC by Bioanalyzer
[Figure: example traces labeled “Beautiful”, “Beautiful”, and “100% Adapters” (~125 bp)]
25. Library QC
Examples for successful libraries
Adapter contamination at ~125 bp
26. Library quantitation by qPCR
This step is usually performed by the sequencing service center
Use amplifying primers corresponding to ends of adapters
Use standards of known concentration to generate standard curve of threshold cycle (Ct) vs. concentration
Use conversion factor to deduce concentration of unknown libraries
Take library size into consideration!
Commercial kits are available
[Figure: Primer 1 and Primer 2 annealing to the adapter ends]
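The standard-curve step can be sketched as a straight-line fit of Ct against log10(concentration), which is then inverted for the unknown library. This is a minimal illustration, not a kit protocol. The function names and the standard values are hypothetical. The size adjustment assumes dye-based detection, where signal per molecule scales with fragment length, and a standard of known length.

```python
import math


def fit_standard_curve(concentrations, cts):
    """Least-squares fit of Ct = slope * log10(concentration) + intercept."""
    xs = [math.log10(c) for c in concentrations]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_ct(ct, slope, intercept):
    """Invert the standard curve to get the concentration for a measured Ct."""
    return 10 ** ((ct - intercept) / slope)


def size_adjust(concentration, standard_bp, library_bp):
    """Correct for the library being longer or shorter than the standard."""
    return concentration * standard_bp / library_bp


if __name__ == "__main__":
    # Hypothetical standards (pM) and their measured Ct values.
    standards_pm = [20.0, 2.0, 0.2, 0.02]
    standard_cts = [10.0, 13.3, 16.6, 19.9]
    slope, intercept = fit_standard_curve(standards_pm, standard_cts)
    raw = concentration_from_ct(13.3, slope, intercept)
    adjusted = size_adjust(raw, standard_bp=452, library_bp=400)
    print(f"slope={slope:.2f}, raw={raw:.3f} pM, size-adjusted={adjusted:.3f} pM")
```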
27. Examples of RNA input requirements
Library prep kit                     Starting material
mRNA (TruSeq)                        100 ng - 4 μg total RNA
Directional mRNA (TruSeq)            1-5 μg total RNA or 50 ng mRNA
NEB ultra directional RNA            10-100 ng mRNA or ribo-depleted RNA
Small RNA (TruSeq)                   1 μg total RNA
Ribo depletion (Epicentre)           1-5 μg total RNA
SMARTer™ Ultra Low RNA (Clontech)    100 pg – 10 ng total RNA, single cell
SMART-seq2                           Single cell
28. Standard RNA-Seq library protocol
QC of total RNA to assess integrity
Removal of rRNA (most common):
– mRNA isolation
– rRNA depletion
Fragmentation of RNA
Reverse transcription and second-strand cDNA synthesis
Ligation of adapters
PCR Amplify
Purify, QC and Quantify
30. Strand-specific RNA-seq
Standard library (non-directional)
Antisense non-coding RNA
Sense transcripts
Informative for non-coding RNAs and antisense transcripts
Essential when NOT using polyA selection (mRNA)
No disadvantage to preserving strand specificity
31. Your Sequence Data
• Filtered vs. Unfiltered
Illumina chastity filter: a cluster is filtered out if its fluorescence ratio falls under the threshold twice in the first 25 bases
Passing Filter
@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:N:0:_AACGCTTA
CGTTTGATAAGCTGAAAATCGCCCTGACTCAAGCTCCAATTGTGAGAGGACCAG
+
A-ABC7-C9-<CE89,,,CC,CCCC8,CFF8,,;CCF8,CE,E9,,,,,,CD@<
NOT Passing Filter
@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:Y:0:_AACGCTTA
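The two headers above differ only in the second field after the space: N for a read that passed the filter, Y for one that was filtered out. A minimal sketch of reading that flag follows, assuming the colon-separated header layout shown in the examples (location fields, a space, then read : is-filtered : control number : index). The function name is illustrative.

```python
def parse_illumina_header(header: str) -> dict:
    """Split a FASTQ header of the form shown on the slide into its fields."""
    if not header.startswith("@"):
        raise ValueError("FASTQ header must start with '@'")
    location, _, description = header[1:].partition(" ")
    instrument, run, flowcell, lane, tile, x, y = location.split(":")
    read, is_filtered, control, index = description.split(":", 3)
    return {
        "instrument": instrument,
        "run": int(run),
        "flowcell": flowcell,
        "lane": int(lane),
        "tile": int(tile),
        "x": int(x),
        "y": int(y),
        "read": int(read),
        # "Y" means the read was filtered out, "N" means it passed.
        "passed_filter": is_filtered == "N",
        "control_number": int(control),
        "index": index,
    }


if __name__ == "__main__":
    passing = "@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:N:0:_AACGCTTA"
    failing = "@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:Y:0:_AACGCTTA"
    for header in (passing, failing):
        fields = parse_illumina_header(header)
        print(fields["passed_filter"], fields["index"])
```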
33. Your Sequence Data
• PhiX (phi X 174)
Illumina internal standard for QC
• now in all MiSeq and HiSeq lanes
Not aligned to PhiX
@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:N:0:
CGTTTGATAAGCTGAAAATCGCCCTGACTCAAGCTCCAATTGTGAGAGGACCAG
+
A-ABC7-C9-<CE89,,,CC,CCCC8,CFF8,,;CCF8,CE,E9,,,,,,CD@<
Aligned to PhiX
@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:N:18:
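In these headers the third field after the space is the control number: 0 for the sample read and 18 for the read aligned to PhiX. A rough sketch of separating the two groups follows, assuming that any nonzero control number marks a control read. The function names are illustrative, and the sequence and quality lines of the PhiX record are reused placeholders because the slide shows only its header.

```python
from itertools import islice


def fastq_records(lines):
    """Yield (header, sequence, plus, quality) tuples from FASTQ lines."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if len(record) < 4:
            return
        yield tuple(line.rstrip("\n") for line in record)


def control_number(header: str) -> int:
    """Return the third colon-separated field after the space in the header."""
    description = header.split(" ", 1)[1]
    return int(description.split(":")[2])


def split_by_control(lines):
    """Separate sample reads (control number 0) from control reads (nonzero)."""
    sample, control = [], []
    for record in fastq_records(lines):
        (control if control_number(record[0]) != 0 else sample).append(record)
    return sample, control


if __name__ == "__main__":
    seq = "CGTTTGATAAGCTGAAAATCGCCCTGACTCAAGCTCCAATTGTGAGAGGACCAG"
    qual = "A-ABC7-C9-<CE89,,,CC,CCCC8,CFF8,,;CCF8,CE,E9,,,,,,CD@<"
    lines = [
        "@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:N:0:", seq, "+", qual,
        # Placeholder sequence and quality for the PhiX-aligned header.
        "@HWI-M02034:55:000000000-A85G4:1:1101:21460:1468 1:N:18:", seq, "+", qual,
    ]
    sample, control = split_by_control(lines)
    print(f"sample reads: {len(sample)}, control reads: {len(control)}")
```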
34. Targeted sequencing
- exomes
- cancer gene panels
- RNA-seq
- any non-repeat ROI
- 2 HiSeq lanes / genome
- 20 exomes / lane
35. Amplicon sequencing
• Sequencing of amplified regions of interest
• Common application: 16S/18S small subunit ribosomal RNA (SSU rRNA) genes as phylogenetic markers
[Figure: region of interest flanked by Primer 1 and Primer 2, followed by standard library preparation OR an alternative shown only in the figure]
38. THIRD GENERATION DNA SEQUENCING
http://pacificbiosciences.com
Single Molecule Real Time (SMRT™) sequencing
Sequencing of single DNA molecule by single
polymerase
Very long reads: average reads over 8 kb, up to 30 kb
High error rate (~13%).
Complementary to short accurate reads of Illumina
41. First Sequencing of CGG-repeat Alleles in Human Fragile X Syndrome using PacBio RS Sequencer
Paul Hagerman, Biochemistry and Molecular Medicine, SOM.
• Single-molecule sequencing of pure CGG array,
- first for disease-relevant allele. Loomis et al. (2012) Genome Research.
- applicable to many other tandem repeat disorders.
• Direct genomic DNA sequencing of methyl groups,
- direct epigenetic sequencing (paper under review).
• Discovered 100% bias toward methylation of 20 CGG-repeat allele in female,
- first direct methylated DNA sequencing in human disease.
• DoD STTR award with PacBio. Basis of R01 applications.
[Figure: CGG36 and CGG95 alleles; base (A, C, G, T) vs. nucleotide position]
43. MinION: disposable DNA sequencer
GridION
www.nanoporetech.com
The best is yet to come ….. e.g.