iMate Protocol (version 2.1) by Phyloinformatics Unit, RIKEN Kobe – May 19, 2017
NGS and Phyloinfo in Kobe
http://www.clst.riken.jp/phylo/
iMate Protocol: Improved and Inexpensive Nextera™ Mate Pair Library Preparation
Authored by Kaori Tatsumi, Osamu Nishimura, Kazu Itomi, Chiharu Tanegashima & Shigehiro Kuraku
Phyloinformatics Unit
RIKEN Center for Life Science Technologies (CLST)
Notice: When you present or publish data based on the technical guidance in this protocol, please consider
citing this protocol, linked from our lab’s web site, and our benchmark paper (Tatsumi et al., 2015).
This protocol outlines our modifications to the ‘Gel-plus’ version of the standard protocol for
Nextera Mate Pair Library Preparation, together with the rationale behind them. Aiming for optimal
scaffolding performance, we have optimized the protocol under the possibly conservative policy
that only read pairs with junction adaptors (bona fide ‘mates’) should be passed on to scaffolding.
The keys to this protocol are optimizing 1) the tagmentation condition, 2) the Covaris shearing
condition, and 3) the sequence read length, in order to enhance the library yield and the
ability to detect the junction adaptor in reads.
In our understanding, 4 μg of starting genomic DNA, as formulated in the standard
protocol, is enough to prepare mate-pair libraries with mate distances of >10 kb (but
unrealistic for >20 kb). Ideally, the tagmentation condition should be optimized so that as much DNA
as possible falls into the targeted size range. For this purpose, perform the tagmentation reaction under
multiple conditions, for example in three tubes with 4, 8 and 12 μl of the tagment enzyme supplied in
the kit. The tagment buffer can be self-made [1], which saves cost, provided that the other limiting
reagents are also saved.
The size distribution of tagmented DNA molecules should be analyzed with a trustworthy method,
such as pulsed-field electrophoresis (e.g., PippinPulse) or the Agilent TapeStation; the Agilent
Bioanalyzer does not perform well for this purpose. By comparing the results of multiple
tagmentation reactions, you can determine which condition lets you retrieve the largest
amount of DNA in the targeted size range.
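The comparison between tagmentation conditions can be sketched in a few lines of Python. This is only a minimal illustration, assuming that the size distribution of each reaction has been exported as a list of (fragment size in bp, DNA mass) bins. The bin values, the 6-10 kb target range and the function names are hypothetical and are not measurements from our lab.

```python
def mass_in_range(bins, lo_bp, hi_bp):
    """Return the DNA mass in bins whose size lies within [lo_bp, hi_bp]."""
    return sum(mass for size_bp, mass in bins if lo_bp <= size_bp <= hi_bp)


def best_condition(profiles, lo_bp, hi_bp):
    """Pick the enzyme volume whose profile puts the most mass in the target range."""
    return max(profiles, key=lambda ul: mass_in_range(profiles[ul], lo_bp, hi_bp))


# Hypothetical profiles: enzyme volume (ul) -> list of (size in bp, mass in ng).
profiles = {
    4: [(3000, 300), (8000, 900), (15000, 2000)],
    8: [(3000, 900), (8000, 1700), (15000, 700)],
    12: [(3000, 2200), (8000, 800), (15000, 200)],
}

target = (6000, 10000)  # assumed target mate distance range in bp
for ul, bins in profiles.items():
    print(ul, "ul:", mass_in_range(bins, *target), "ng in range")
print("best:", best_condition(profiles, *target), "ul")
```

With real data, the bins would come from the electrophoresis software export, and the target range would match the size range you plan to select.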
As in the previous tagmentation step, the amounts of the supplied reagents used in this step are
the limiting factor for how many libraries can be prepared with one purchased kit. Thus, it
is preferable to find a way to decrease the amount of kit-supplied reagents required for
this step. This can be achieved by halving the total volume of the reaction,
with all of its components reduced proportionally. Previously (in the
iMate protocol versions 1.X), we suggested performing strand displacement with 1/4 volume of all reaction
components, after size selection with BluePippin. However, we found that this can result in
contamination by read pairs with untargeted mate distances. Therefore, we no longer
recommend reversing the order of strand displacement and size selection. We are now looking
into alternative ways to save kit-supplied reagents in the strand displacement step.
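The proportional reduction is simple arithmetic, sketched below in Python. The component names and full-scale volumes are hypothetical placeholders. Take the real values from the standard protocol.

```python
def scale_reaction(components_ul, factor):
    """Scale every component volume by the same factor, preserving proportions."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return {name: round(vol * factor, 2) for name, vol in components_ul.items()}


# Hypothetical full-scale volumes in ul (placeholders, not the kit's actual recipe).
full_scale = {"DNA": 30.0, "buffer": 10.0, "dNTPs": 5.0, "polymerase": 5.0}

half_scale = scale_reaction(full_scale, 0.5)
print(half_scale)
print("total:", sum(half_scale.values()), "ul")
```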
Do as instructed in the standard protocol.
We use a BluePippin in this step and usually set a size range 4 kb in width (for example, from 6
kb to 10 kb), although this is open to further consideration. As mentioned above for the strand
displacement step, performing size selection before strand displacement would reduce the
amount of DNA to be processed and thus save enzyme and buffer, but we no longer recommend
that order because it can introduce read pairs with untargeted mate distances. We recommend
quantifying the amount of DNA after size selection.
After strand displacement and size selection (whether you perform these steps in this order or the
other way round), it is ideal to retain at least 100 ng of DNA. Although the standard protocol
mentions ‘150-400 ng’ (on page 27), 100-200 ng is realistic and still promising, in our experience.
Do as instructed in the standard protocol.
Shearing determines the length of library inserts, which should be coordinated with the read length
in sequencing. If you regard only reads with the junction adaptor as true mate pairs, we propose a
shearing condition that ultimately results in a library size distribution of 300-700 bp with
the peak at 450-500 bp, as measured in the final library QC step far below. Note that this is markedly
different from the size distribution illustrated in the standard protocol (300-1200 bp; on page 49).
To achieve the size distribution proposed above, we recommend performing successive shearing
with multiple executions of the Covaris condition instructed in the standard protocol. In our
experience, shearing the genomes of different species with the same condition can result in
markedly different fragment size distributions. Thus, you need to optimize the condition
specifically for your species of interest. For one of the species we worked on, we performed as
many as 7 runs of Covaris shearing with the condition instructed in the standard protocol.
You may feel an urge to perform QC with a Bioanalyzer immediately after Covaris shearing, but
it will not give you a fair assessment of the shearing results, because you do not want to spend a large
quantity of sheared DNA on QC. Thus, we recommend saving as much DNA as possible at this
stage and measuring the size distribution later, in the final QC before sequencing.
Do as instructed in the standard protocol.
To get as many unique mate-pair reads as possible, we strongly recommend reducing the number of PCR
cycles and avoiding excessive amplification. We suggest performing no more than 10 cycles of PCR.
This recommendation is supported by our experience of obtaining sufficient product with 10 PCR
cycles, even for samples that are supposed to require 15 cycles according to the standard
protocol (for example, 100 ng for libraries with a mate distance range of 6-10 kb; see [2] for details of
the cycle number estimate). In fact, we normally perform 8 PCR cycles, and only when we find the
yield too low after AMPure clean-up do we perform additional PCR cycles (still no more than 10
cycles in total). If you do not get enough product within 10 cycles, you should first optimize
the tagmentation condition to increase the yield in the targeted size range.
With the Illumina system, it seems that the inserts of many of the reads actually sequenced are
shorter than the most frequent insert length of the library. Thus, be sure to perform stringent size
selection with AMPure to get rid of molecules with short inserts, as instructed in the standard
protocol (0.67× AMPure to get rid of <300 bp molecules), no matter what the size distribution of
library inserts is. Less stringent size selection can result in a high proportion of read pairs with
inserts that are too short, and these may not suffice for effective scaffolding.
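For reference, the bead volume for a given ratio is simply the ratio multiplied by the sample volume. A minimal Python sketch follows. The 50 μl sample volume is a hypothetical example, and the function name is ours.

```python
def bead_volume_ul(sample_ul, ratio=0.67):
    """Volume of AMPure beads (ul) to add for a given bead-to-sample ratio."""
    if sample_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return round(sample_ul * ratio, 1)


# Hypothetical 50 ul PCR product cleaned up at the 0.67x ratio.
print(bead_volume_ul(50.0), "ul of beads")
```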
Use Bioanalyzer or equivalent in this final QC before sequencing. Keep in mind that the size
distribution is determined mostly by shearing condition and AMPure clean-up, rather than the
choice of size range of mate distance.
We use the KAPA Library Quantification Kit (KK4835) in this step. Quantification should not be tricky
if the library has an ordinary unimodal size distribution. The standard protocol says that you need
1.5-20 nM of the synthesized library, but we think that 2 nM is enough unless the sequencing
facility you are working with requests much more than is required in an actual sequencing run.
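If you also have a mass concentration (for example from a fluorometer) and want to cross-check it against the 2 nM guideline, the usual conversion can be sketched as below. This is an illustration only. It assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the example concentrations and mean library length are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; a common approximation for dsDNA


def ng_per_ul_to_nM(conc_ng_per_ul, mean_length_bp):
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_length_bp)


def meets_minimum(conc_nM, minimum_nM=2.0):
    """Check a library concentration against the 2 nM guideline in the text."""
    return conc_nM >= minimum_nM


# Hypothetical libraries with a mean length of 475 bp.
for conc in (0.5, 1.0):
    nM = ng_per_ul_to_nM(conc, 475)
    print(f"{conc} ng/ul -> {nM:.2f} nM, enough: {meets_minimum(nM)}")
```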
In your first trial, we advise running a MiSeq for small-scale pilot sequencing to obtain 300-nt
paired-end reads from the prepared libraries; sequencing as many as 10 libraries per MiSeq run
should allow a fair validation of the libraries. The 300-nt paired-end reads obtained can also be
used to simulate which read length yields the highest proportion of reads with the junction adaptor,
by truncating them at 100 nt, 127 nt and 171 nt, for example (if sequencing with HiSeq is planned
next).
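The truncation experiment can be sketched in Python as follows. This is a simplified illustration, not a replacement for a dedicated tool. It assumes that the junction adaptor consists of the 19-nt sequence CTGTCTCTTATACACATCT followed by its reverse complement (please verify this against Illumina's adapter documentation), it counts a read only if the first 19 nt of the junction are fully contained in it, and it uses synthetic reads built by a hypothetical helper.

```python
JUNCTION_HALF = "CTGTCTCTTATACACATCT"  # assumed sequence; verify before use


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


JUNCTION = JUNCTION_HALF + revcomp(JUNCTION_HALF)


def has_junction(read, min_match=19):
    """True if the read contains the first `min_match` bases of the junction."""
    return JUNCTION[:min_match] in read


def junction_rate(reads, length):
    """Proportion of reads that still show the junction after truncation."""
    trimmed = [r[:length] for r in reads]
    return sum(has_junction(r) for r in trimmed) / len(trimmed)


def make_read(junction_start, total=300):
    """Synthetic read (hypothetical): poly-A, then the junction, then poly-C."""
    if junction_start is None:
        return "A" * total
    return ("A" * junction_start + JUNCTION + "C" * total)[:total]


reads = [make_read(s) for s in (50, 105, 150, 250, None)]
for length in (100, 127, 171, 300):
    print(length, "nt:", junction_rate(reads, length))
```

With real data, `reads` would hold the sequence lines of the pilot FASTQ file, and partial junctions at the read end would need a more tolerant matching rule than the exact match used here.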
The lengths of 127 nt and 171 nt may sound unusual, but with Rapid Run on HiSeq, one can
obtain reads of these lengths by making the best use of the extra cycles inherently assigned for
Nextera dual indexing, which we do not need in mate-pair sequencing. This trick allows you to get
127 nt and 171 nt using three and four TruSeq Rapid SBS Kits (50 cycles), respectively
(see page 6 of the official manual for the TruSeq Rapid SBS Kit). Please consult the sequencing
facility that you plan to work with about the possibility of this extra-cycle sequencing. The
intention of getting 127 nt or 171 nt is to increase the proportion of reads with the junction adaptor
inside, but if you plan to use all obtained reads, including those without the junction adaptor, it
may be wiser to prioritize cost saving and go for 100-nt reads or even shorter.
In our experience, Rapid Run mode with v1 chemistry on older HiSeq Control Software (HCS) is
vulnerable to suboptimal library pooling, such as the ‘low plex pooling’ issue (see the relevant document
by Illumina). In the course of your mate pair sequencing, you may encounter a situation in which
you have only 4 or fewer libraries to be sequenced in a Rapid Run. In this case there is a high
chance that the base composition in index reads will be too homogeneous, and you will get lower QV
in index reads, resulting in a larger proportion of reads that fail in demultiplexing. To reduce this
unfavorable effect, you could introduce multiple indices per library during library preparation
above. As long as demultiplexing between libraries works without any overlap of indices,
this strategy is expected to produce as many valid reads as possible, at the sole cost of
handling more data files in post-sequencing informatics steps. The latest versions of HCS
(version 2.2.38 or higher) seem to be robust against low diversity samples, so we
suggest contacting the sequencing facility you are working with in advance to find out whether you
need to be concerned about the low plex pooling issue.
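Whether a planned pool of indices is too homogeneous can be checked in advance with a short script. The Python sketch below assumes the two-channel grouping commonly described for HiSeq-type instruments (A and C read in one channel, G and T in the other) and flags index cycles in which one channel would be empty. The index sequences are examples for illustration only. Please follow Illumina's own pooling guidelines for real runs.

```python
RED = set("AC")    # assumed: A and C are read in the red channel
GREEN = set("GT")  # assumed: G and T are read in the green channel


def unbalanced_cycles(indices):
    """Return 1-based index cycles in which the pool lacks a red or a green base."""
    if len({len(i) for i in indices}) != 1:
        raise ValueError("all indices must have the same length")
    bad = []
    for cycle, bases in enumerate(zip(*indices), start=1):
        present = set(bases)
        if not (present & RED and present & GREEN):
            bad.append(cycle)
    return bad


# One library with a single index: every cycle is unbalanced.
print(unbalanced_cycles(["GCCAAT"]))
# The same library split across two complementary indices: no unbalanced cycle.
print(unbalanced_cycles(["GCCAAT", "CTTGTA"]))
# A poorly matched pair still leaves several unbalanced cycles.
print(unbalanced_cycles(["ATCACG", "CGATGT"]))
```

This shows why assigning more than one index to a single library can help when only a few libraries share a run.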
We recommend first running a recent version of FastQC (v0.11 or higher) on the raw FASTQ files to
monitor some standard metrics, including the frequency of junction adaptor appearance along
base positions (in the ‘Adapter Content’ view newly added in FastQC v0.11).
After the primary QC, run a read processing program such as NextClip [3], and assess the PCR
duplicate rate and the proportion of reads that have the junction adaptor. After the NextClip run, be
sure to rerun FastQC on the processed FASTQ files of Categories A, B and C separately, in order to
confirm that junction/external adaptors and low-quality bases were properly trimmed.
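The two summary figures mentioned above can be illustrated with a small Python sketch. It is not how NextClip computes them. It assumes that read pairs without any junction adaptor are counted in a separate category (called D here), and it estimates duplicates naively by comparing the first 20 nt of both reads. All counts and sequences are hypothetical.

```python
def junction_fraction(category_counts):
    """Fraction of read pairs in Categories A, B and C among all pairs."""
    total = sum(category_counts.values())
    with_junction = sum(n for cat, n in category_counts.items() if cat in ("A", "B", "C"))
    return with_junction / total


def duplicate_rate(read_pairs, prefix=20):
    """Naive duplicate rate: pairs sharing the same read prefixes count as duplicates."""
    keys = [(r1[:prefix], r2[:prefix]) for r1, r2 in read_pairs]
    return 1 - len(set(keys)) / len(keys)


# Hypothetical category counts; "D" (no junction adaptor) is an assumed label.
counts = {"A": 300, "B": 250, "C": 250, "D": 200}
print("with junction adaptor:", junction_fraction(counts))

# Hypothetical read pairs; the first and third pairs are identical.
pairs = [
    ("ACGT" * 10, "TTGA" * 10),
    ("GGCA" * 10, "CCAT" * 10),
    ("ACGT" * 10, "TTGA" * 10),
    ("TTTT" * 10, "AAAA" * 10),
]
print("duplicate rate:", duplicate_rate(pairs))
```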
1. Wang Q, Gu L, Adey A, Radlwimmer B, Wang W, Hovestadt V, Bahr M, Wolf S, Shendure J, Eils R et al:
Tagmentation-based whole-genome bisulfite sequencing. Nature Protocols 2013, 8(10):2022-2032.
2. Heavens D, Garcia Accinelli G, Clavijo B, Clark MD: A method to simultaneously construct up to
12 differently sized Illumina Nextera long mate pair libraries with reduced DNA input, time, and cost.
BioTechniques 2015, 59(1):42-45.
3. Leggett RM, Clavijo BJ, Clissold L, Clark MD, Caccamo M: NextClip: an analysis and read preparation tool
for Nextera Long Mate Pair libraries. Bioinformatics 2014, 30(4):566-568.

 
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
Gliese 12 b: A Temperate Earth-sized Planet at 12 pc Ideal for Atmospheric Tr...
 
Cell Immobilization Methods and Applications.pptx
Cell Immobilization Methods and Applications.pptxCell Immobilization Methods and Applications.pptx
Cell Immobilization Methods and Applications.pptx
 
Application of Mass Spectrometry In Biotechnology
Application of Mass Spectrometry In BiotechnologyApplication of Mass Spectrometry In Biotechnology
Application of Mass Spectrometry In Biotechnology
 
ERTHROPOIESIS: Dr. E. Muralinath & R. Gnana Lahari
ERTHROPOIESIS: Dr. E. Muralinath & R. Gnana LahariERTHROPOIESIS: Dr. E. Muralinath & R. Gnana Lahari
ERTHROPOIESIS: Dr. E. Muralinath & R. Gnana Lahari
 
Hemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. MuralinathHemoglobin metabolism: C Kalyan & E. Muralinath
Hemoglobin metabolism: C Kalyan & E. Muralinath
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
 
GBSN - Microbiology Lab 2 (Compound Microscope)
GBSN - Microbiology Lab 2 (Compound Microscope)GBSN - Microbiology Lab 2 (Compound Microscope)
GBSN - Microbiology Lab 2 (Compound Microscope)
 
GBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interactionGBSN - Microbiology (Unit 6) Human and Microbial interaction
GBSN - Microbiology (Unit 6) Human and Microbial interaction
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
 
B lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and ActivationB lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and Activation
 

iMate Protocol Guide version 2.1

iMate Protocol (version 2.1) by Phyloinformatics Unit, RIKEN Kobe – May 19, 2017
NGS and Phyloinfo in Kobe
http://www.clst.riken.jp/phylo/

iMate Protocol: Improved and Inexpensive Nextera™ Mate Pair Library Preparation
Authorized by Kaori Tatsumi, Osamu Nishimura, Kazu Itomi, Chiharu Tanegashima & Shigehiro Kuraku
Phyloinformatics Unit
RIKEN Center for Life Science Technologies (CLST)

Notice: When you present or publish data based on the technical guidance in this protocol, please consider citing this protocol, linked from our lab’s web site, and our benchmark paper (Tatsumi et al., 2015).

This protocol outlines the modifications to the ‘Gel-plus’ version of the standard protocol for Nextera Mate Pair Library Preparation and the rationale behind them. Aiming for optimal scaffolding performance, we have optimized the protocol under the possibly conservative policy that only read pairs with junction adaptors (bona fide ‘mates’) should be passed on to scaffolding. The keys to this protocol are optimizing 1) the tagmentation condition, 2) the Covaris shearing condition, and 3) the sequence read length, in order to enhance the library yield and the ability to detect the junction adaptor in reads.

In our understanding, 4 μg of starting genomic DNA, as specified in the standard protocol, is enough to prepare mate-pair libraries with a mate distance of >10 kb (but unrealistic for >20 kb). Ideally, the tagmentation condition should be optimized so that as much DNA as possible falls into the targeted size range. For this purpose, perform the tagment reaction under multiple conditions, for example in three tubes with 4, 8 and 12 μl of the tagment enzyme supplied in the kit. The tagment buffer can be self-made [1], which saves cost if other limiting reagents are also saved.
The size distribution of tagmented DNA molecules should be analyzed with a trustworthy method, such as pulsed-field electrophoresis (e.g., PippinPulse) or the Agilent TapeStation; the Agilent Bioanalyzer does not perform well for this purpose. By comparing the results from multiple tagment reactions, you can determine which tagment condition yields the largest amount of DNA in the targeted size range.

As in the previous tagmentation step, the amounts of the supplied reagents used in this step limit how many libraries can be prepared with one purchased kit. It is therefore preferable to decrease the amount of kit-supplied reagents required for this step. This can be achieved by reducing the total reaction volume by half, with all components reduced proportionally. Previously (in the iMate protocol versions 1.X), we suggested performing strand displacement with 1/4 volume of all reaction components, after size selection with BluePippin. However, we found that this can result in contamination with read pairs of untargeted mate distances. Therefore, we no longer recommend reversing the order of strand displacement and size selection. We are now looking into alternative ways to save kit-supplied reagents for the strand displacement step.

Do as instructed in the standard protocol.

We use a BluePippin in this step and usually set a size range 4 kb in width (for example, from 6 kb to 10 kb), although this is a matter for further consideration. As mentioned in the step above, performing this step before strand displacement reduces the amount of DNA to be processed for strand displacement, which saves enzyme and buffer for strand displacement. We recommend quantifying the amount of DNA after size selection.
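The comparison of tagment reactions described earlier in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the size profiles below are invented numbers (fragment size in kb, DNA mass in ng), not measurements, and the function names `mass_in_window` and `best_condition` are hypothetical. In practice the profiles would come from your TapeStation or pulsed-field electrophoresis export.

```python
from typing import Dict, List, Tuple

# One profile = list of (fragment size in kb, DNA mass in ng) bins.
Profile = List[Tuple[float, float]]


def mass_in_window(profile: Profile, low_kb: float, high_kb: float) -> float:
    """Sum the DNA mass of all bins whose size lies within the target window."""
    return sum(mass for size, mass in profile if low_kb <= size <= high_kb)


def best_condition(profiles: Dict[str, Profile], low_kb: float, high_kb: float) -> str:
    """Return the name of the condition with the most DNA in the target window."""
    return max(profiles, key=lambda name: mass_in_window(profiles[name], low_kb, high_kb))


# Hypothetical profiles for three enzyme volumes (made-up values, 4 ug total each).
profiles = {
    "4 ul enzyme": [(3, 100), (6, 300), (8, 500), (10, 600), (15, 1500), (20, 1000)],
    "8 ul enzyme": [(3, 400), (6, 900), (8, 1100), (10, 800), (15, 600), (20, 200)],
    "12 ul enzyme": [(3, 1500), (6, 1100), (8, 700), (10, 400), (15, 200), (20, 100)],
}

# Compare the conditions for a 6-10 kb target window.
for name, profile in profiles.items():
    print(name, mass_in_window(profile, 6, 10), "ng in window")
print("Selected:", best_condition(profiles, 6, 10))
```

The same comparison can be repeated for each mate-distance window of interest, since the best enzyme amount may differ between windows.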
After strand displacement and size selection (whether you perform these steps in this order or the other way round), it is ideal to retain at least 100 ng of DNA. Although the standard protocol mentions ‘150-400 ng’ (on page 27), 100-200 ng is realistic and still promising in our experience.

Do as instructed in the standard protocol.

Shearing determines the length of library inserts, which should be coordinated with the read length in sequencing. If you regard only reads with the adaptor junction as true mate pairs, we propose a shearing condition that will ultimately result in a library size distribution of 300-700 bp with a peak at 450-500 bp in a step further below. Note that this is markedly different from the size distribution illustrated in the standard protocol (300-1200 bp; on page 49). To achieve the size distribution proposed above, we recommend successive shearing with multiple executions of the Covaris condition instructed in the standard protocol. In our experience, shearing the genomes of different species with the same condition can result in markedly different fragment size distributions. Thus, you need to optimize the condition specifically for your species of interest. For one of the species we worked on, we performed as many as 7 runs of Covaris shearing with the condition instructed in the standard protocol.

You may feel an urge to perform QC with a Bioanalyzer immediately after the Covaris shearing, but it will not give you a fair assessment of the shearing results, because you do not want to use a large quantity of sheared DNA for QC. Thus, we recommend saving as much DNA as possible at this stage and measuring the size distribution later in the step ‘ ’.

Do as instructed in the standard protocol.
To get as many unique mate-pair reads as possible, we strongly recommend reducing the number of PCR cycles and avoiding excessive amplification. We suggest performing no more than 10 cycles of PCR. This advice is supported by our experience of obtaining sufficient product with 10 PCR cycles, even for samples that are supposed to require 15 cycles according to the standard protocol (for example, 100 ng for libraries with a mate distance range of 6-10 kb; see [2] for details of the cycle number estimate). In fact, we normally perform 8 PCR cycles, and only when we find the yield too low after AMPure clean-up do we perform additional PCR cycling (still no more than 10 cycles in total). If you do not get enough product within 10 cycles, you should first optimize the tagment condition to increase the yield for the targeted size range.

With the Illumina system, it seems that the insert lengths of many reads actually sequenced are shorter than the most frequent insert length of a library. Thus, be sure to perform stringent size selection with AMPure to get rid of molecules with short inserts, as instructed in the standard protocol (x0.67 AMPure to get rid of <300 bp molecules), no matter what the size distribution of library inserts is. Insufficiently stringent size selection can result in a high proportion of read pairs with inserts that are too short, and they may not suffice for effective scaffolding.

Use a Bioanalyzer or equivalent in this final QC before sequencing. Keep in mind that the size distribution is determined mostly by the shearing condition and AMPure clean-up, rather than by the choice of the size range of mate distance.
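The PCR cycle cap and the AMPure ratio mentioned above amount to simple arithmetic, sketched here in Python. The helper names and the `required_ng` threshold are hypothetical; the protocol does not define a numerical yield cut-off, so that value has to come from your own requirements.

```python
MAX_TOTAL_CYCLES = 10  # upper limit suggested in the text
INITIAL_CYCLES = 8     # number of cycles normally run first, per the text


def additional_cycles(cycles_done: int, yield_ng: float, required_ng: float) -> int:
    """Return how many more PCR cycles may be run without exceeding the cap.

    `required_ng` is a user-chosen threshold (an assumption, not a protocol value).
    """
    if yield_ng >= required_ng:
        return 0
    return max(0, MAX_TOTAL_CYCLES - cycles_done)


def ampure_bead_volume(sample_ul: float, ratio: float = 0.67) -> float:
    """Bead volume (ul) for a given sample volume at the stated AMPure ratio."""
    return round(sample_ul * ratio, 1)


# Example: yield is too low after the initial 8 cycles, so 2 more are allowed.
print(additional_cycles(INITIAL_CYCLES, yield_ng=5.0, required_ng=20.0))
# Example: bead volume for a 50 ul sample at x0.67.
print(ampure_bead_volume(50.0))
```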
We use the KAPA Library Quantification Kit (KK4835) in this step. Quantification should not be tricky if the library has an ordinary unimodal size distribution. The standard protocol says that you need 1.5-20 nM of the synthesized library, but we think that 2 nM is enough unless the sequencing facility you are working with requests much more than is required in an actual sequencing run.

In your first trial, we advise running a MiSeq for small-scale pilot sequencing to get 300-nt paired-end reads from the prepared libraries. Sequencing as many as 10 libraries per MiSeq run should allow a fair validation of the libraries. The 300-nt paired-end reads obtained could also be used to simulate which read length yields the highest proportion of reads with the junction adaptor, for example by truncating them at 100 nt, 127 nt and 171 nt (if sequencing with HiSeq is planned next). The lengths of 127 nt and 171 nt may sound unusual, but with Rapid Run on HiSeq, one can obtain reads of these lengths by making the best use of the extra cycles inherently assigned for Nextera dual indexing, which we do not need in mate-pair sequencing. This trick allows you to get 127 nt and 171 nt using three and four TruSeq Rapid SBS Kits (50 cycles), respectively (see page 6 of the official manual for the TruSeq Rapid SBS Kit). Please consult the sequencing facility that you plan to work with about the possibility of this extra-cycle sequencing. The intention behind 127 nt or 171 nt is to increase the proportion of reads with the junction adaptor inside, but if you plan to use all obtained reads, including those without the junction adaptor, it may be wiser to prioritize cost-saving and go for 100-nt reads or even shorter.
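The read-length simulation described above can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the two junction adaptor halves listed in the code (please verify them against Illumina's adapter sequence documentation), counts only exact, complete matches, and uses synthetic reads in place of real MiSeq data. The function names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Sequence

# Assumed junction adaptor halves for Nextera Mate Pair libraries;
# verify against Illumina's adapter sequence documentation before use.
JUNCTION_FWD = "CTGTCTCTTATACACATCT"
JUNCTION_REV = "AGATGTGTATAAGAGACAG"


def read_fastq_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each record in an uncompressed FASTQ file."""
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of every 4-line record
                yield line.strip().upper()


def junction_fraction(reads: Sequence[str], length: int) -> float:
    """Fraction of reads with a complete junction half within the first `length` nt."""
    if not reads:
        return 0.0
    hits = sum(
        1 for read in reads
        if JUNCTION_FWD in read[:length] or JUNCTION_REV in read[:length]
    )
    return hits / len(reads)


def simulate_read_lengths(
    reads: Iterable[str], lengths: Sequence[int] = (100, 127, 171)
) -> Dict[int, float]:
    """Truncate reads at each candidate length and report the junction fraction."""
    read_list = list(reads)
    return {n: junction_fraction(read_list, n) for n in lengths}


# Synthetic 300-nt reads for illustration only (not real data).
demo_reads = [
    "A" * 90 + JUNCTION_FWD + "C" * 191,   # junction ends at position 109
    "G" * 140 + JUNCTION_REV + "T" * 141,  # junction ends at position 159
    "A" * 300,                             # no junction
    "T" * 50 + JUNCTION_FWD + "G" * 231,   # junction ends at position 69
]
demo_result = simulate_read_lengths(demo_reads)
print(demo_result)
```

For real data, pass `read_fastq_sequences("your_reads.fastq")` (a placeholder path) instead of `demo_reads`. Dedicated tools such as NextClip may apply different matching criteria, so the fractions from this sketch are only a rough guide.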
In our experience, Rapid Run mode with v1 chemistry on older HiSeq Control Software (HCS) is vulnerable to suboptimal library pooling, such as the ‘low plex pooling’ issue (see the relevant document by Illumina). In the course of your mate-pair sequencing, you may encounter a situation in which you have only 4 or fewer libraries to be sequenced in a Rapid Run. In this case, there is a high chance that the base composition in index reads will be too homogeneous, and you will get lower QV in index reads, resulting in a larger proportion of reads that fail demultiplexing. To reduce this unfavorable effect, you could introduce multiple indices per library in the step above. As long as demultiplexing between libraries works without any overlap of indices, this strategy is expected to produce as many valid reads as possible, at the sole cost of handling more data files in post-sequencing informatics steps. The latest versions of HCS (version 2.2.38 or higher) seem to be robust against low-diversity samples, so we suggest contacting the sequencing facility you are working with in advance to find out whether you need to be concerned about the low plex pooling issue.

We recommend first running a recent version of FastQC (v0.11 or higher) on the raw fastq files to monitor some standard metrics, including the frequency of junction adaptor appearance along base positions (in the ‘Adapter Content’ view newly added in FastQC v0.11). After the primary QC, run a read processing program such as NextClip [3], and assess the PCR duplicate rate and the proportion of reads that have the junction adaptor. After the NextClip run, be sure to rerun FastQC on the processed fastq files of Category A, B and C separately, in order to confirm that junction/external adaptors and low-quality bases were properly trimmed.

1. Wang Q, Gu L, Adey A, Radlwimmer B, Wang W, Hovestadt V, Bahr M, Wolf S, Shendure J, Eils R et al: Tagmentation-based whole-genome bisulfite sequencing. Nature Protocols 2013, 8(10):2022-2032.
2. Heavens D, Accinelli GG, Clavijo B, Clark MD: A method to simultaneously construct up to 12 differently sized Illumina Nextera long mate pair libraries with reduced DNA input, time, and cost. BioTechniques 2015, 59(1):42-45.
3. Leggett RM, Clavijo BJ, Clissold L, Clark MD, Caccamo M: NextClip: an analysis and read preparation tool for Nextera Long Mate Pair libraries. Bioinformatics 2014, 30(4):566-568.
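For the low plex pooling concern discussed in the sequencing section, a quick check of index base composition per cycle might look like the sketch below. It rests on an assumption that should be confirmed with Illumina's pooling guidance or your sequencing facility: that on the instrument in question A and C are read in one color channel and G and T in the other, so that each index cycle should contain at least one base from each pair. The function name is hypothetical, and the index sequences are used only as example input.

```python
from typing import List

# Assumed channel grouping (verify for your instrument and software version).
RED_CHANNEL = set("AC")
GREEN_CHANNEL = set("GT")


def unbalanced_cycles(indices: List[str]) -> List[int]:
    """Return 1-based index cycles in which only one color channel is represented."""
    if len({len(index) for index in indices}) != 1:
        raise ValueError("all index sequences must have the same length")
    flagged = []
    for cycle, bases in enumerate(zip(*indices), start=1):
        present = set(bases)
        if not (present & RED_CHANNEL) or not (present & GREEN_CHANNEL):
            flagged.append(cycle)
    return flagged


# Example pools of 6-nt index sequences (illustrative input only).
two_plex = ["ATCACG", "CGATGT"]
three_plex = ["ATCACG", "CGATGT", "TTAGGC"]

print("2-plex, unbalanced cycles:", unbalanced_cycles(two_plex))
print("3-plex, unbalanced cycles:", unbalanced_cycles(three_plex))
```

An empty list means every index cycle has both channels represented in the pool; flagged cycles suggest adding or swapping indices, for example by assigning multiple indices per library as suggested in the text.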