iMate Protocol (version 3.0) by Phyloinformatics Unit, RIKEN Kobe – December 5, 2017
NGS and Phyloinfo in Kobe
http://www.clst.riken.jp/phylo/
iMate Protocol: Improved and Inexpensive Nextera™
Mate Pair Library Preparation
Authored by Kaori Tatsumi, Osamu Nishimura, Kazu Itomi, Chiharu Tanegashima & Shigehiro Kuraku
Phyloinformatics Unit
RIKEN Center for Life Science Technologies (CLST)
Notice: When you present or publish data based on the technical guidance in this protocol, please consider
citing this protocol, linked from our lab’s web site, and our benchmark paper (Tatsumi et al., 2015).
This protocol outlines our modifications to the ‘Gel-plus’ version of the standard protocol for
Nextera Mate Pair Library Preparation, together with the rationale behind them. Aiming at optimal
scaffolding performance, we optimized the protocol under the admittedly conservative policy
that only read pairs with junction adaptors (bona fide ‘mates’) should be passed on to
scaffolding. The keys to this protocol are optimizing 1) the tagmentation condition, 2) the Covaris
shearing condition, and 3) the sequence read length, in order to enhance the library yield and
the ability to detect the junction adaptor in reads.
In our understanding, 4 μg of starting genomic DNA, as specified in the standard
protocol, is enough to prepare mate-pair libraries with a mate distance of >10 kb (but is
unrealistic for >20 kb). The smallest amount of starting genomic DNA we have used was 1.7 μg,
for a library with a mate distance of 6-10 kb, amplified with 10 PCR cycles, based on the
‘Gel-plus’ protocol.
Ideally, the tagmentation condition should be optimized so that as much DNA as possible falls into
the targeted size range. For this purpose, perform the tagmentation reaction under multiple
conditions, for example, in three tubes with 4, 8 and 12 μl of the tagment enzyme supplied in the kit.
The tagment buffer can be self-made [1], which saves cost provided that the other limiting reagents
are also saved. The size distribution of tagmented DNA molecules should be analyzed with a reliable
method, such as pulsed-field electrophoresis (e.g., PippinPulse) or the Agilent TapeStation; the Agilent
Bioanalyzer does not perform well for this purpose. By comparing the results from multiple
tagmentation reactions, you can identify the condition that allows you to retrieve the
largest amount of DNA in the targeted size range.
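The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the binned size profiles, the DNA amounts and the function names are all hypothetical, and in practice the profile would come from an export of your sizing instrument.

```python
def mass_in_range(sizes_bp, masses_ng, lo_bp, hi_bp):
    """Sum the DNA mass of all size bins lying within [lo_bp, hi_bp]."""
    return sum(m for s, m in zip(sizes_bp, masses_ng) if lo_bp <= s <= hi_bp)


def best_condition(profiles, lo_bp, hi_bp):
    """Return (condition, in-range mass) for the condition with the most DNA in range."""
    scored = {
        name: mass_in_range(sizes, masses, lo_bp, hi_bp)
        for name, (sizes, masses) in profiles.items()
    }
    best = max(scored, key=scored.get)
    return best, scored[best]


# Hypothetical binned profiles (bin centre in bp, DNA mass in ng) for the
# three enzyme volumes mentioned in the text; the numbers are made up.
sizes = [2000, 4000, 6000, 8000, 10000, 15000]
profiles = {
    "4 ul": (sizes, [50, 100, 200, 300, 400, 900]),
    "8 ul": (sizes, [150, 300, 500, 450, 250, 150]),
    "12 ul": (sizes, [500, 600, 350, 150, 50, 20]),
}

condition, mass = best_condition(profiles, 6000, 10000)
print(condition, mass)
```

With these made-up values, the 8 μl condition retains the most DNA in the 6-10 kb window; with real profiles the answer will depend on the sample.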
As in the preceding tagmentation step, the amounts of kit-supplied reagents used in this step
limit how many libraries can be prepared with one purchased kit. Thus,
it is preferable to reduce the amount of kit-supplied reagents required for
this step. We have achieved this by halving the total reaction volume,
with all components reduced proportionally. This still allows the library
to be amplified with the same number of PCR cycles (see below) as a
library prepared in the full volume.
Previously (in the iMate protocol versions 1.X), we suggested performing strand displacement
with 1/4 volume of all reaction components, after size selection with BluePippin. However, we found
that this can result in contamination by read pairs with untargeted mate distances. Therefore, we
no longer recommend reversing the order of strand displacement and size selection.
Do as instructed in the standard protocol.
We use a BluePippin in this step and usually set a size range 4 kb in width, although this
remains open to further consideration. So far, we have succeeded in preparing libraries with
mate distance ranges of 1-6 kb, 2-5 kb, 2-6 kb, 4-8 kb, 6-10 kb and 7-10 kb using the BluePippin’s
External Standards S1 for 1-10 kb (BLF7503); 10-15 kb, 12-15 kb, 12-16 kb and 12-18 kb using the
External Standards U1 for 10-18 kb (BUF7503); and 18-27 kb and 20-27 kb using the External
Standards T1 for 18-27 kb (BMF7503).
We recommend quantifying the DNA after size selection; ideally, at least 100 ng of DNA
should be retained. Although the standard protocol mentions ‘150-400 ng’ (on
page 22), 100-200 ng, or even less, is realistic and still promising in our experience.
Do as instructed in the standard protocol.
Shearing determines the length of library inserts, which is best coordinated with the read
length used in sequencing. If you regard only reads with the junction adaptor as true mate pairs, we
propose a shearing condition that ultimately results in a library size distribution of 300-700
bp with the peak at 450-500 bp in a step further below. Note that this is
markedly different from the size distribution illustrated in the standard protocol (300-1200 bp; on
page 49). To achieve the size distribution proposed above, we recommend
successive shearing, with multiple executions of the Covaris condition specified in the standard
protocol. In our experience, shearing the genomes of different species under the same condition
can result in markedly different fragment size distributions. Thus, you need to optimize the
condition specifically for your species of interest. For one of the species we worked on, we
performed as many as 7 runs of Covaris shearing with the condition specified in the standard
protocol.
If you regard reads both with and without the junction adaptor as true
mate pairs, you do not need to shear the DNA that intensively. In this case, you can aim at the
abovementioned size distribution illustrated in the standard protocol (300-1200 bp; on page 49).
Recently, we have been preparing mate pair libraries under this policy, with only one round of shearing.
You may feel an urge to perform QC with a Bioanalyzer immediately after the Covaris shearing,
but this will not give you a fair assessment of the shearing results, because you will not want to
spend a large quantity of sheared DNA on QC. Thus, we recommend saving as much DNA as possible
at this stage and measuring the size distribution later in the step ‘ ’.
You can perform these steps as instructed in the standard protocol. However, if you want to achieve
a higher efficiency, we recommend switching to the KAPA LTP Library Preparation Kit for Illumina
Platforms (KK8232) or an equivalent, instead of the components of the Nextera Mate Pair kit. In our
experience, this reduces the required number of PCR cycles by two.
To obtain as many unique mate-pair reads as possible, we strongly recommend reducing the number
of PCR cycles and avoiding excessive amplification. We suggest performing no more than 10 cycles of
PCR if the targeted mate distance range is below 10 kb. This recommendation is supported by our
experience of obtaining sufficient product with 10 PCR cycles, even for samples that
are supposed to require 15 cycles according to the standard protocol (for example, 100 ng for
libraries with a mate distance range of 6-10 kb). If you do not obtain enough product within 10 cycles,
you should first optimize the tagmentation condition to increase the yield for the targeted size
range. Importantly, however, if the targeted mate distance range is longer than 10 kb, you may need
more than 10 PCR cycles.
To determine the optimal number of PCR cycles, we perform a preliminary PCR using an aliquot
of the DNA from the previous step (for example, 1.5 μl of the total 10 μl of eluate) with the KAPA
Real-time PCR Library Amplification Kit with fluorescent standards (KK2702). Adopting a cycle
number that falls between standards 1 and 2 of the kit, we then perform the secondary PCR on the
rest of the DNA with the KAPA Library Amplification Kit (KK2602).
With the Illumina system, the inserts of many of the reads actually sequenced appear to be
shorter than the most frequent insert length of the library (Figure 2 of [2]). Thus, be sure to perform
stringent size selection with AMPure to remove molecules with short inserts, as instructed in the
standard protocol (0.67× AMPure to remove <300 bp molecules), whatever the size
distribution of library inserts is. Lenient size selection can result in a high proportion of read pairs
from overly short inserts, which may not suffice for effective scaffolding.
Use a Bioanalyzer or equivalent for this final QC before sequencing. Keep in mind that the size
distribution is determined mostly by the shearing condition and AMPure clean-up, rather than by the
choice of mate distance.
We use the KAPA Library Quantification Kit (KK4835) in this step. Quantification should be
straightforward if the library has an ordinary unimodal size distribution. The standard protocol says
that you need 1.5-20 nM of the synthesized library, but we think that 2 nM is enough unless the
sequencing facility you are working with requests much more than is required in an actual
sequencing run.
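If you only have a mass concentration at hand (for example from a fluorometric measurement) and want to see roughly how it relates to the nM figures above, the usual conversion for double-stranded DNA can be sketched as follows. It assumes an average mass of 660 g/mol per base pair, and the mean library length is a value you read off your own size profile. The function names are ours, for illustration only, and the result does not replace the qPCR-based quantification.

```python
def ng_per_ul_to_nM(conc_ng_per_ul, mean_length_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA mass concentration (ng/ul) to molarity (nM).

    Assumes an average mass of 660 g/mol per base pair.
    """
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_length_bp)


def dilution_factor(current_nM, target_nM=2.0):
    """Fold dilution needed to bring a library down to target_nM."""
    if current_nM < target_nM:
        raise ValueError("library is already below the target concentration")
    return current_nM / target_nM


# Example with made-up numbers: 1 ng/ul library with a mean length of 500 bp
molarity = ng_per_ul_to_nM(1.0, 500)
print(round(molarity, 2), "nM; dilute", round(dilution_factor(molarity), 2), "fold for 2 nM")
```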
In your first trial, we advise running a MiSeq for small-scale pilot sequencing to obtain 300 nt-long
paired-end reads from the prepared libraries; sequencing as many as 10 libraries per MiSeq run
should allow fair validation of the libraries. The 300 nt-long paired-end reads obtained can also be
used to simulate which read length yields the highest proportion of reads with the junction
adaptor, by truncating them at 100 nt, 127 nt and 171 nt, for example (if sequencing with HiSeq is
planned next).
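The truncation simulation described above could be sketched as below. This is a deliberately simplified illustration, not the procedure we actually ran: it counts only exact, full-length matches of the junction adaptor, whereas dedicated tools also detect partial or mismatched adaptors. The adaptor sequence is our assumption of the Nextera Mate Pair junction adaptor and should be checked against the kit documentation. The reads are synthetic, and the function name is hypothetical.

```python
# Assumed junction adaptor sequence; verify against the kit documentation.
JUNCTION = "CTGTCTCTTATACACATCTAGATGTGTATAAGAGACAG"


def fraction_with_junction(reads, read_len, adaptor=JUNCTION):
    """Fraction of reads whose first read_len bases contain the full adaptor."""
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if adaptor in read[:read_len])
    return hits / len(reads)


def synthetic_read(offset, total_len=300):
    """Build a fake 300 nt read with the adaptor starting at the given offset."""
    tail = total_len - offset - len(JUNCTION)
    return "A" * offset + JUNCTION + "C" * tail


# Four synthetic reads: adaptor starting at 50, 85 or 130 nt, or absent
reads = [synthetic_read(50), synthetic_read(85), synthetic_read(130), "A" * 300]

for length in (100, 127, 171):
    print(length, fraction_with_junction(reads, length))
```

On real data, the reads would be parsed from the MiSeq fastq files instead of being generated, and the same comparison across 100, 127 and 171 nt indicates how much the longer reads actually gain.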
The lengths of 127 nt and 171 nt may sound unusual, but with Rapid Run on HiSeq, one can
obtain reads of these lengths by making the best use of the extra cycles inherently assigned for
Nextera dual indexing, which are not needed in mate-pair sequencing. This trick allows you to
obtain 127 nt and 171 nt reads using three and four TruSeq Rapid SBS Kits for 50 cycles,
respectively (see page 6 of the official manual for the TruSeq Rapid SBS Kit). Please consult
the sequencing facility that you plan to work with about the possibility of this extra-cycle
sequencing. The intention behind 127 nt or 171 nt reads is to increase the proportion of reads
containing the junction adaptor, but if you plan to use all obtained reads, including those without the
junction adaptor, it may be wiser to prioritize cost-saving and opt for 100 nt reads or even shorter.
In our experience, Rapid Run mode with v1 chemistry on older HiSeq Control Software (HCS) is
vulnerable to suboptimal library pooling, such as the ‘low plex pooling’ issue (see the relevant
document by Illumina). In the course of your mate pair sequencing, you may encounter a situation in
which you have only 4 or fewer libraries to be sequenced in a Rapid Run. In this case, there is a high
chance that the base composition in index reads will be too homogeneous, and you will get lower
QV in index reads, resulting in a larger proportion of reads that fail demultiplexing. To
reduce this unfavorable effect, you could introduce multiple indices per library in the step above
. As long as demultiplexing between libraries works out without any
overlap of indices, this strategy is expected to produce as many valid reads as possible, at the
sole cost of handling more data files in post-sequencing informatics steps. The latest
versions of HCS (version 2.2.38 or higher) seem to be robust against low-diversity samples, so
we suggest contacting the sequencing facility you are working with in advance to check
whether you need to be concerned about the low plex pooling issue.
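Before pooling, the homogeneity of index reads discussed above can be checked with a quick tally of the base composition at each index cycle. The sketch below is an illustration only: the index sequences are arbitrary strings rather than real kit indices, the function names are hypothetical, and the threshold of two distinct bases per cycle is our assumption, not an official guideline.

```python
from collections import Counter


def per_cycle_composition(indices):
    """Return one Counter of bases per index cycle (position)."""
    if len({len(index) for index in indices}) != 1:
        raise ValueError("all index sequences must have the same length")
    return [Counter(column) for column in zip(*indices)]


def low_diversity_cycles(indices, min_distinct=2):
    """List 1-based cycles at which fewer than min_distinct bases occur."""
    return [
        cycle
        for cycle, counts in enumerate(per_cycle_composition(indices), start=1)
        if len(counts) < min_distinct
    ]


# Arbitrary example indices (not real kit indices)
two_plex = ["ACGTAC", "ACTTGC"]
four_plex = two_plex + ["GTACCA", "TGCAGT"]

print(low_diversity_cycles(two_plex))   # cycles at which every library shares one base
print(low_diversity_cycles(four_plex))
```

In this toy example, the two-index pool has four cycles with a single base, and adding two further indices removes them, which mirrors the idea of introducing multiple indices per library.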
We recommend first running a recent version of FastQC (v0.11 or higher) on the raw fastq files to
monitor some standard metrics, including the frequency of junction adaptor occurrence along
base positions (in the ‘Adapter Content’ view newly added in FastQC v0.11).
After the primary QC, run a read processing program, such as NextClip [3], and assess the PCR
duplicate rate and the proportion of reads that have the junction adaptor. After the NextClip run, be
sure to rerun FastQC on the processed fastq files of Categories A, B and C separately, in order to
confirm that junction/external adaptors and low-quality bases were properly trimmed.
1. Wang Q, Gu L, Adey A, Radlwimmer B, Wang W, Hovestadt V, Bahr M, Wolf S, Shendure J, Eils R et al:
Tagmentation-based whole-genome bisulfite sequencing. Nature protocols 2013, 8(10):2022-2032.
2. Hara Y, Tatsumi K, Yoshida M, Kajikawa E, Kiyonari H, Kuraku S: Optimizing and benchmarking de novo
transcriptome sequencing: from library preparation to assembly evaluation. BMC Genomics 2015, 16:977.
3. Leggett RM, Clavijo BJ, Clissold L, Clark MD, Caccamo M: NextClip: an analysis and read preparation tool
for Nextera Long Mate Pair libraries. Bioinformatics 2014, 30(4):566-568.

 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Texas Alliance of Groundwater Districts
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
muralinath2
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
Leonel Morgado
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
Carl Bergstrom
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
European Sustainable Phosphorus Platform
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
MAGOTI ERNEST
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
Sharon Liu
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
Advanced-Concepts-Team
 

Recently uploaded (20)

Bob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdfBob Reedy - Nitrate in Texas Groundwater.pdf
Bob Reedy - Nitrate in Texas Groundwater.pdf
 
aziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobelaziz sancar nobel prize winner: from mardin to nobel
aziz sancar nobel prize winner: from mardin to nobel
 
The binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defectsThe binding of cosmological structures by massless topological defects
The binding of cosmological structures by massless topological defects
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
SAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdfSAR of Medicinal Chemistry 1st by dk.pdf
SAR of Medicinal Chemistry 1st by dk.pdf
 
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdfTopic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
Topic: SICKLE CELL DISEASE IN CHILDREN-3.pdf
 
The debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically youngThe debris of the ‘last major merger’ is dynamically young
The debris of the ‘last major merger’ is dynamically young
 
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...
 
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
Unlocking the mysteries of reproduction: Exploring fecundity and gonadosomati...
 
Basics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different formsBasics of crystallography, crystal systems, classes and different forms
Basics of crystallography, crystal systems, classes and different forms
 
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero WaterSharlene Leurig - Enabling Onsite Water Use with Net Zero Water
Sharlene Leurig - Enabling Onsite Water Use with Net Zero Water
 
Oedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptxOedema_types_causes_pathophysiology.pptx
Oedema_types_causes_pathophysiology.pptx
 
Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...Authoring a personal GPT for your research and practice: How we created the Q...
Authoring a personal GPT for your research and practice: How we created the Q...
 
The cost of acquiring information by natural selection
The cost of acquiring information by natural selectionThe cost of acquiring information by natural selection
The cost of acquiring information by natural selection
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
Thornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdfThornton ESPP slides UK WW Network 4_6_24.pdf
Thornton ESPP slides UK WW Network 4_6_24.pdf
 
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptxThe use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
The use of Nauplii and metanauplii artemia in aquaculture (brine shrimp).pptx
 
20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx20240520 Planning a Circuit Simulator in JavaScript.pptx
20240520 Planning a Circuit Simulator in JavaScript.pptx
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 

iMate Protocol Guide version 3.0

  • 1. iMate Protocol (version 3.0) by Phyloinformatics Unit, RIKEN Kobe – December 5, 2017. NGS and Phyloinfo in Kobe, http://www.clst.riken.jp/phylo/
iMate Protocol: Improved and Inexpensive Nextera™ Mate Pair Library Preparation
Authorized by Kaori Tatsumi, Osamu Nishimura, Kazu Itomi, Chiharu Tanegashima & Shigehiro Kuraku, Phyloinformatics Unit, RIKEN Center for Life Science Technologies (CLST)
Notice: When you present or publish data based on the technical guidance in this protocol, please consider citing this protocol, linked from our lab's web site, and our benchmark paper (Tatsumi et al., 2015).
This protocol outlines our modifications to the ‘Gel-plus’ version of the standard protocol for Nextera Mate Pair Library Preparation, together with the rationale behind them. Aiming for optimal scaffolding performance, we have optimized the protocol under the possibly conservative policy that only read pairs with junction adaptors (bona fide ‘mates’) should be passed on to scaffolding. The keys to this protocol are optimizing 1) the tagmentation condition, 2) the Covaris shearing condition, and 3) the sequence read length, in order to enhance library yield and the ability to detect the junction adaptor in reads.
In our understanding, 4 μg of starting genomic DNA, as specified in the standard protocol, is enough to prepare mate-pair libraries with mate distances of >10 kb (but unrealistic for >20 kb). The smallest amount of starting genomic DNA we have used was 1.7 μg, to prepare a library with a mate distance of 6-10 kb, amplified with 10 PCR cycles, based on the ‘Gel-plus’ protocol.
Ideally, the tagmentation condition should be optimized so that as much DNA as possible falls into the targeted size range. For this purpose, perform the tagment reaction under multiple conditions, for example, in three tubes with 4, 8 and 12 μl of the tagment enzyme supplied in the kit. The tagment buffer can be self-made [1], which saves cost, provided that other limiting reagents are also saved. The size distribution of tagmented DNA molecules should be analyzed with a reliable method, such as pulsed-field electrophoresis (e.g., PippinPulse) or the Agilent TapeStation; the Agilent Bioanalyzer does not perform well for this purpose. By comparing the results of the multiple tagment reactions, you can determine which tagment condition lets you retrieve the largest amount of DNA in the targeted size range.
In the strand displacement step, as in the preceding tagmentation step, the amounts of the kit-supplied reagents limit how many libraries can be prepared with one purchased kit. Thus, it is preferable to decrease the amount of kit-supplied reagents required for this step. We have achieved this by halving the total reaction volume, with all components reduced proportionally. The library can still be amplified with the same number of PCR cycles (see below) as a library prepared in the full volume. Previously (in iMate protocol versions 1.X), we suggested performing strand displacement with 1/4 volume of all reaction components, after size selection with BluePippin. However, we found that this can result in contamination by read pairs with untargeted mate distances. Therefore, we no longer recommend reversing the order of strand displacement and size selection.
Do as instructed in the standard protocol.
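The comparison of tagment conditions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`mass_in_range`, `best_condition`) and the binned size profiles are hypothetical, and in practice the profiles would come from your own PippinPulse or TapeStation measurements.

```python
# Hypothetical helpers: pick the tagment enzyme volume that puts the most DNA
# mass into the targeted size window. All numbers below are made up.

def mass_in_range(profile, low_bp, high_bp):
    """Sum the DNA mass (ng) of size bins whose midpoint lies in [low_bp, high_bp]."""
    return sum(ng for (start, end, ng) in profile
               if low_bp <= (start + end) / 2 <= high_bp)

def best_condition(profiles, low_bp, high_bp):
    """Return (enzyme_ul, ng_in_range) for the condition with the highest yield."""
    yields = {ul: mass_in_range(p, low_bp, high_bp) for ul, p in profiles.items()}
    best = max(yields, key=yields.get)
    return best, yields[best]

# Illustrative size profiles keyed by enzyme volume (ul):
# each bin is (bin_start_bp, bin_end_bp, ng).
profiles = {
    4:  [(2000, 6000, 300), (6000, 10000, 900), (10000, 20000, 1800)],
    8:  [(2000, 6000, 800), (6000, 10000, 1400), (10000, 20000, 900)],
    12: [(2000, 6000, 1600), (6000, 10000, 1000), (10000, 20000, 300)],
}

best_ul, best_ng = best_condition(profiles, 6000, 10000)
print(best_ul, best_ng)
```

With these made-up profiles and a 6-10 kb target, the 8 μl condition is selected; real data may of course favor a different enzyme volume.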
  • 2. For size selection, we use a BluePippin and usually set a size range 4 kb in width, although this is open to further consideration. So far, we have succeeded in preparing libraries with mate distance ranges of 1-6 kb, 2-5 kb, 2-6 kb, 4-8 kb, 6-10 kb, and 7-10 kb using the BluePippin External Standards S1 for 1-10 kb (BLF7503); 10-15 kb, 12-15 kb, 12-16 kb, and 12-18 kb using the External Standards U1 for 10-18 kb (BUF7503); and 18-27 kb and 20-27 kb using the External Standards T1 for 18-27 kb (BMF7503). We recommend quantifying the DNA after size selection; ideally, at least 100 ng of DNA should be retained. Although the standard protocol mentions ‘150-400 ng’ (on page 22), 100-200 ng, or even less, is realistic and still promising in our experience.
Do as instructed in the standard protocol.
Shearing determines the length of library inserts, which should be coordinated with the read length in sequencing. If you regard only reads with the junction adaptor as true mate pairs, we propose a shearing condition that ultimately results in a library size distribution of 300-700 bp with a peak at 450-500 bp, as measured at a later step. Note that this is markedly different from the size distribution illustrated in the standard protocol (300-1200 bp; on page 49). To achieve the proposed size distribution, we recommend successive shearing, that is, multiple executions of the Covaris condition given in the standard protocol. In our experience, shearing the genomes of different species under the same condition can result in markedly different fragment size distributions. Thus, you need to optimize the condition specifically for your species of interest. For one of the species we worked on, we performed as many as 7 runs of Covaris shearing with the condition given in the standard protocol.
If you regard reads both with and without the junction adaptor as true mate pairs, you do not need to shear the DNA that intensively. In this case, you can aim at the abovementioned size distribution illustrated in the standard protocol (300-1200 bp; on page 49). Recently, we have been preparing mate pair libraries under this policy, with only a single round of shearing. You may feel an urge to perform QC with the Bioanalyzer immediately after Covaris shearing, but it will not give you a fair assessment of the shearing results, because you do not want to spend a large quantity of sheared DNA on QC. Thus, we recommend saving as much DNA as possible at this stage and measuring the size distribution at a later step.
You can perform these steps as instructed in the standard protocol. However, if you want to achieve a higher efficiency, we recommend switching to the KAPA LTP Library Preparation Kit for Illumina Platforms (KK8232) or an equivalent, instead of the components of the Nextera Mate Pair kit. In our experience, this reduces the number of PCR cycles required by two.
To get as many unique mate-pair reads as possible, we strongly recommend reducing the number of PCR cycles and avoiding excessive amplification. We suggest performing no more than 10 cycles of PCR if the targeted mate distance range is below 10 kb. This recommendation is supported by our experience of obtaining sufficient product with 10 PCR cycles, even for samples that are supposed to require 15 cycles according to the standard protocol (for example, 100 ng for libraries with a mate distance range of 6-10 kb). If you do not get enough product within 10 cycles, you should first optimize the tagment condition to increase the yield for the targeted size range.
  • 3. Importantly, however, if the targeted mate distance range is longer than 10 kb, you may need more than 10 PCR cycles. To determine the optimal number of PCR cycles, we perform a preliminary PCR using an aliquot of the DNA from the previous step (for example, 1.5 μl of the total 10 μl of eluate) with the KAPA Real-time PCR Library Amplification Kit with fluorescent standards (KK2702). Adopting a cycle number that falls between standards 1 and 2 of the kit, we then perform the second PCR on the rest of the DNA with the KAPA Library Amplification Kit (KK2602).
With the Illumina system, the insert lengths of many of the reads actually sequenced appear to be shorter than the most frequent insert length of the library (Figure 2 of [2]). Thus, be sure to perform stringent size selection with AMPure to get rid of molecules with short inserts, as instructed in the standard protocol (0.67× AMPure to remove <300 bp molecules), no matter what the size distribution of the library inserts is. Lenient size selection can result in a high proportion of read pairs with inserts that are too short, which may not suffice for effective scaffolding.
Use the Bioanalyzer or an equivalent for this final QC before sequencing. Keep in mind that the size distribution is determined mostly by the shearing condition and the AMPure clean-up, rather than by the choice of mate distance.
For library quantification, we use the KAPA Library Quantification Kit (KK4835). Quantification should not be tricky if the library has an ordinary unimodal size distribution. The standard protocol says that you need 1.5-20 nM of the synthesized library, but we think that 2 nM is enough, unless the sequencing facility you are working with requests much more than is required in an actual sequencing run.
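When judging whether a library reaches the 2 nM mentioned above, it can help to cross-check the qPCR result against a molarity estimated from mass concentration and mean fragment length. The sketch below is not part of the protocol: it uses the common approximation of 660 g/mol per base pair of double-stranded DNA, and the function names and example values are hypothetical.

```python
# Rough molarity estimate for a double-stranded DNA library.
# Assumption: average mass of 660 g/mol per base pair (a common approximation).

def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration (ng/ul) to molarity (nM)."""
    return conc_ng_per_ul / (660.0 * mean_fragment_bp) * 1e6

def dilution_volumes(stock_nM, target_nM, final_ul):
    """Return (stock_ul, diluent_ul) needed to reach target_nM in final_ul."""
    if target_nM > stock_nM:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = final_ul * target_nM / stock_nM
    return stock_ul, final_ul - stock_ul

# Illustrative values only: 1 ng/ul library with a 500 bp mean fragment length.
molarity = library_molarity_nM(1.0, 500)
print(round(molarity, 2))                   # estimated molarity in nM
print(dilution_volumes(molarity, 2.0, 20))  # volumes (ul) to make 20 ul at 2 nM
```

The estimate depends strongly on the mean fragment length, so it is best taken from the final size distribution QC rather than assumed.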
In your first trial, we advise running a MiSeq for small-scale pilot sequencing to obtain 300-nt paired-end reads from the prepared libraries; sequencing as many as 10 libraries per MiSeq run should allow a fair validation of the libraries. The 300-nt paired-end reads can also be used to simulate which read length yields the highest proportion of reads with the junction adaptor, by truncating them at, for example, 100 nt, 127 nt and 171 nt (if sequencing with HiSeq is planned next). The lengths of 127 nt and 171 nt may sound unusual, but with Rapid Run on HiSeq, one can obtain reads of these lengths by making the best use of the extra cycles inherently assigned to Nextera dual indexing, which are not needed in mate-pair sequencing. This trick allows you to get 127 nt and 171 nt using three and four TruSeq Rapid SBS Kits (50 cycles), respectively (see page 6 of the official manual for the TruSeq Rapid SBS Kit). Please consult the sequencing facility that you plan to work with about the possibility of this extra-cycle sequencing. The intention of sequencing 127 nt or 171 nt is to increase the proportion of reads containing the junction adaptor, but if you plan to use all obtained reads, including those without the junction adaptor, it may be wiser to save cost and go for 100-nt reads or even shorter.
In our experience, Rapid Run mode with v1 chemistry on older HiSeq Control Software (HCS) is vulnerable to suboptimal library pooling, such as the ‘low plex pooling’ issue (see Illumina's documentation on this issue). In the course of your mate pair sequencing, you may encounter a situation in which you have only 4 or fewer libraries to be sequenced in a Rapid Run. In this case, there is a high chance that the base composition in the index reads will be too homogeneous, and you will get lower QVs in the index reads, resulting in a larger proportion of reads that fail demultiplexing. To reduce this unfavorable effect, you could introduce multiple indices per library in the corresponding step above. As long as demultiplexing between libraries works without any overlap of indices, this strategy is expected to produce as many valid reads as possible, at the sole cost of handling more data files in the post-sequencing informatics steps.
  • 4. The latest versions of HCS (version 2.2.38 or higher) seem to be robust against low-diversity samples, so we suggest contacting the sequencing facility you are working with in advance to check whether you need to be concerned about the low plex pooling issue.
We recommend first running a recent version of FastQC (v0.11 or higher) on the raw fastq files to monitor some standard metrics, including the frequency of junction adaptor appearance along base positions (in the ‘Adapter Content’ view newly added in FastQC v0.11). After this primary QC, run a read processing program such as NextClip [3], and assess the PCR duplicate rate and the proportion of reads that contain the junction adaptor. After the NextClip run, be sure to rerun FastQC on the processed fastq files of Categories A, B and C separately, in order to confirm that junction/external adaptors and low-quality bases were properly trimmed.
1. Wang Q, Gu L, Adey A, Radlwimmer B, Wang W, Hovestadt V, Bahr M, Wolf S, Shendure J, Eils R, et al. Tagmentation-based whole-genome bisulfite sequencing. Nature Protocols 2013, 8(10):2022-2032.
2. Hara Y, Tatsumi K, Yoshida M, Kajikawa E, Kiyonari H, Kuraku S. Optimizing and benchmarking de novo transcriptome sequencing: from library preparation to assembly evaluation. BMC Genomics 2015, 16:977.
3. Leggett RM, Clavijo BJ, Clissold L, Clark MD, Caccamo M. NextClip: an analysis and read preparation tool for Nextera Long Mate Pair libraries. Bioinformatics 2014, 30(4):566-568.
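The read-length simulation described for the pilot MiSeq reads (truncating 300-nt reads to 100, 127 and 171 nt and checking for the junction adaptor) can be sketched as follows. This is a simplified illustration, not a replacement for a dedicated tool such as NextClip: it counts only exact, full-length matches, the function names are hypothetical, and the adaptor string used here is an assumption that should be verified against the kit documentation.

```python
import io

# Assumption: 19-nt core of the Nextera junction adaptor; verify this sequence
# against the kit documentation or your read-processing tool before use.
JUNCTION_CORE = "CTGTCTCTTATACACATCT"

def read_fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()

def fraction_with_junction(seqs, length, adaptor=JUNCTION_CORE):
    """Fraction of reads whose first `length` bases contain the whole adaptor."""
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if adaptor in s[:length])
    return hits / len(seqs)

# Toy 300-nt reads (made up): the adaptor starts at position 50, 105 or 140,
# or is absent altogether.
toy_reads = [
    "A" * 50 + JUNCTION_CORE + "C" * 231,
    "A" * 105 + JUNCTION_CORE + "C" * 176,
    "A" * 140 + JUNCTION_CORE + "C" * 141,
    "A" * 300,
]
fastq = "".join(f"@read{i}\n{s}\n+\n{'I' * len(s)}\n"
                for i, s in enumerate(toy_reads))
seqs = list(read_fastq_sequences(io.StringIO(fastq)))

for n in (100, 127, 171):
    print(n, fraction_with_junction(seqs, n))
```

Because adaptors that are only partially contained at the end of a truncated read are not counted, this sketch gives a lower bound compared with tools that tolerate partial or mismatched adaptor occurrences.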