A presentation of the major next-generation sequencing technologies. Price indications were valid as of the first quarter of 2009. Mac Keynote file, half French, half English. Hope this can help.
In her recent publication “Fast isogenic mapping-by-sequencing of EMS-induced mutant bulks” in Plant Physiology, Dr. Franziska Turck and her team introduced deep candidate resequencing (dCARE) using the Ion PGM™ Sequencer to their Arabidopsis mutant identification pipeline.
These slides are from her December 5th live webinar presentation about applying the isogenic mapping approach to plant gene identification, with fast and cost-effective barcoding on the Ion PGM™ system. She shared with the webinar attendees her experience of how the Ion PGM™ system improves her deep sequencing workflow.
Learn more about the Ion Proton™ and Ion PGM™ here http://owl.li/g19ix
This document discusses next generation sequencing (NGS) data preprocessing and quality control. It provides a brief history of DNA sequencing technologies and compares current NGS platforms. The importance of quality control and preprocessing NGS data is explained. Key metrics for assessing NGS data quality are described, including per base quality scores, GC content distributions, and k-mer content. Tools for preprocessing (Fastx-toolkit) and quality control (FastQC) are introduced.
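Two of the metrics named above, per-base quality scores and GC content, can be sketched in a few lines. This is only an illustrative sketch, not how FastQC or the Fastx-toolkit compute them: it assumes four-line FASTQ records with Phred+33 quality encoding, and the function names (`parse_fastq`, `gc_fraction`, `mean_quality_per_position`) are hypothetical.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (name, sequence, quality_string) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or a blank line) stops parsing
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().strip()
        yield header[1:], seq, qual


def gc_fraction(seq):
    """Fraction of G and C bases in one read (0.0 for an empty read)."""
    if not seq:
        return 0.0
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s)


def mean_quality_per_position(quals, offset=33):
    """Mean Phred score at each read position; offset=33 is an assumption (Phred+33)."""
    totals, counts = [], []
    for q in quals:
        for i, ch in enumerate(q):
            if i == len(totals):  # first read that reaches this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


# Tiny made-up example: two reads of length 4.
example = StringIO("@r1\nACGT\n+\nIIII\n@r2\nGGCC\n+\n!!!!\n")
records = list(parse_fastq(example))
gc_values = [gc_fraction(seq) for _, seq, _ in records]
per_base_quality = mean_quality_per_position(qual for _, _, qual in records)
```

A drop in the per-position means toward the 3' end of the reads is the usual signal that trimming may be needed before mapping or assembly.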
NGS has enabled high-throughput genome sequencing and analysis, changing genomic research. Technologies like Roche 454, Solexa/Illumina, and SOLiD allow massively parallel sequencing of genomes. NGS has applications in de novo genome sequencing, resequencing, RNA-seq, ChIP-seq, methylation analysis, and more. It provides advantages over microarrays like detecting novel transcripts, splicing variants, and sequence variations. NGS data requires processing including quality control, mapping, and variant identification to realize its full potential to revolutionize genomic research and medicine.
Updated: New High Throughput Sequencing technologies at the Norwegian Sequenc... - Lex Nederbragt
An update of the previous talk with the same title. A talk I gave at the Computational Life Science initiative (University of Oslo) about new High Throughput Sequencing instruments at the Norwegian Sequencing Centre. I also mentioned future upgrades and the upcoming nanopore sequencing platform from Oxford Nanopore.
The document describes developing a pipeline for analyzing next generation sequencing (NGS) data. It discusses various NGS platforms, available tools for quality control, normalization, reference mapping, de novo assembly, and annotation. It assesses the performance of different tools and evaluates how read length affects the resolution of repeats for de novo assembly of prokaryotic genomes. The analysis finds that relatively modest read lengths can produce well-connected assemblies for most prokaryotes, and extending reads has diminishing returns.
The document provides an overview of plant genome sequence assembly, including:
1) A brief history of sequencing technologies and their improvements over time, from Sanger sequencing to newer technologies producing longer reads.
2) Key steps in a sequencing project including read processing, filtering, and corrections before assembly into contigs and scaffolds using appropriate software.
3) Factors to consider for experimental design and assembly optimization such as sequencing depth, library types, and software choices depending on the genome and data characteristics.
Overview of methods for variant calling from next-generation sequence data - Thomas Keane
This document provides an overview of methods for variant calling from next-generation sequencing data. It discusses data formats and workflows, including SNP calling, short indels, and structural variation. The document describes alignment, BAM improvement through realignment and base quality recalibration, library merging, and duplicate removal. It also reviews software tools for these processes and introduces the variant call format (VCF) standard.
The field of next-generation sequencing (NGS) has been experiencing explosive growth over the past several years and shows little sign of slowing down. The increasing capabilities and dramatically lowered costs have expanded NGS's reach beyond that of the human genome into nearly every corner of biological research. An overview of the platforms on the market today, including an assessment of their relative strengths and weaknesses, will be presented. The presentation will conclude with a peek into where the technology is going and what will be available in the future.
1000G/UK10K: Bioinformatics, storage, and compute challenges of large scale r... - Thomas Keane
This document discusses the bioinformatics challenges of large-scale human genome resequencing projects like 1000 Genomes and UK10K. It notes that over 23 terabases of sequence data have been generated, requiring innovative solutions for storage, computing and data sharing due to the massive scale. A tiered storage model and use of transposed BAM files are proposed to help process and analyze the data more efficiently.
This tutorial provides an overview of working with next-generation sequencing data, including quality control, alignment, and variation analysis. It covers topics such as next-gen sequencing technologies and applications, quality control measures, short read alignment algorithms and tools, sequence assembly methods, and calling variants from sequencing data. The tutorial is presented by Thomas Keane and Jan Aerts at the 9th European Conference on Computational Biology.
The SOLiD 3 System provides high throughput DNA sequencing with several advantages over other technologies:
- It can sequence entire transcriptomes without any gaps, determine strand-specific expression patterns, and detect SNPs with low false positives.
- Applications include assessing DNA-protein interactions across multiple samples, discovering novel transcripts and splice variants without microarray bias, and characterizing structural rearrangements.
- The system uses emulsion PCR to clonally amplify template beads, followed by deposition of modified beads on a flow cell and sequencing by ligation using fluorescently labeled di-base probes.
Next-generation sequencing data format and visualization with ngs.plot 2015 - Li Shen
An introduction to the commonly used formats for next-generation sequencing data. ngs.plot is a popular tool for visualizing and mining NGS data.
Ion Torrent (Proton/PGM) and SOLiD sequencing are two types of next-generation sequencing technologies. Ion Torrent uses semiconductor sequencing to detect hydrogen ions released during DNA synthesis, while SOLiD uses ligation of octamer probes and fluorescent dyes to determine sequences in color space. Both have advantages such as fast run times and high throughput but also limitations including errors in homopolymers for Ion Torrent and issues with palindromic sequences for SOLiD.
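The color-space idea mentioned above can be illustrated with a short sketch. It assumes the commonly described SOLiD di-base scheme, in which each color is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3); the function names are hypothetical.

```python
# 2-bit codes for the four bases; a color is the XOR of two adjacent codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = {code: base for base, code in BASE_CODE.items()}


def to_color_space(seq):
    """Encode a base sequence as its first base plus one color (0-3) per adjacent pair."""
    colors = "".join(
        str(BASE_CODE[a] ^ BASE_CODE[b]) for a, b in zip(seq, seq[1:])
    )
    return seq[0], colors


def from_color_space(first_base, colors):
    """Decode colors back to bases, starting from the known first base."""
    bases = [first_base]
    for c in colors:
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ int(c)])
    return "".join(bases)


first, colors = to_color_space("ATGGC")           # colors == "3103"
decoded = from_color_space(first, colors)         # "ATGGC"
one_color_error = from_color_space("A", "3003")   # "ATTTA": every later base changes
```

Because each base is decoded from the previous one, a single miscalled color changes every downstream base. This is why color-space reads are usually aligned in color space, where a true SNP appears as two adjacent color changes and an isolated change can be treated as a likely sequencing error.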
The document summarizes Ion Torrent sequencing technology. It detects hydrogen ions released during DNA polymerization rather than using optics. The sequencing occurs on semiconductor chips patterned through photolithography into wells, each sequencing a different template. As nucleotides are incorporated, hydrogen ions change the pH detected by ion sensors below each well. This allows massively parallel sequencing that is faster, cheaper and simpler than previous technologies.
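The flow-based readout described above can be modeled as a toy "flowgram": nucleotides are flowed over the chip one type at a time, and the ideal signal in each flow is proportional to how many bases were incorporated. The sketch below assumes a repeating `TACG` flow order and noise-free signals; both are simplifying assumptions, and the function names are hypothetical.

```python
def sequence_to_flowgram(seq, flow_order="TACG"):
    """Return the ideal number of incorporations per flow for a synthesized sequence."""
    unknown = set(seq) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not in flow order: {sorted(unknown)}")
    flows = []
    pos = 0
    flow_index = 0
    while pos < len(seq):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        # A whole homopolymer run is incorporated within a single flow.
        while pos < len(seq) and seq[pos] == base:
            run += 1
            pos += 1
        flows.append(run)
        flow_index += 1
    return flows


def flowgram_to_sequence(flows, flow_order="TACG"):
    """Base-call an ideal (integer) flowgram back into a sequence."""
    return "".join(
        flow_order[i % len(flow_order)] * count for i, count in enumerate(flows)
    )


flows = sequence_to_flowgram("TTACGG")  # [2, 1, 1, 2]
called = flowgram_to_sequence(flows)    # "TTACGG"
```

In a real instrument the per-flow signal is an analog measurement, so telling a run of 6 from a run of 7 identical bases is harder than telling 1 from 2. That is one way to see why homopolymer errors are the characteristic limitation of this chemistry.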
Neuroscience core lecture given at the Icahn School of Medicine at Mount Sinai. This is version 2 of a lecture on the same topic. I have made some modifications to give a gentler introduction and added a new example for ngs.plot.
Illumina Infinium sequencing is a next-generation sequencing technique that uses sequencing by synthesis. It involves randomly fragmenting DNA, ligating adapters, and amplifying fragments on a flow cell in clusters through bridge amplification. Sequencing occurs by adding fluorescently labeled, reversible terminator nucleotides one at a time while the fluorescence is detected to determine the sequence of each cluster. This allows for massively parallel sequencing of many DNA fragments simultaneously.
Neurotech seminar ish wish 2014 maduna - Tando Maduna
This document discusses in vitro transcription and fluorescent in situ hybridization (FISH) techniques for visualizing gene expression in tissue samples. It describes the process of designing gene-specific primers, amplifying the gene of interest via PCR, synthesizing fluorescently-labeled RNA probes from the PCR product, hybridizing the probes to tissue samples, and using fluorescence microscopy to visualize where in the tissue the gene is expressed at a cellular level. The document provides technical details on each step of the process and discusses ways to optimize and troubleshoot the technique.
Improving and validating the Atlantic Cod genome assembly using PacBio - Lex Nederbragt
This document summarizes work using PacBio long reads to improve the Atlantic cod genome assembly. Error-corrected and raw PacBio reads were used with different assembly programs. Both helped increase contig and scaffold lengths over the previous assembly, with raw reads performing best. Bridgemapper validation found misassemblies corrected by PacBio. The improved assembly met goals of <5% gaps and scaffold N50 over 1 Mbp. Lessons included developing programs to handle cod's heterozygosity and structural variation better. The new assembly version aims to have 23 pseudochromosomes and improved annotation.
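The scaffold N50 goal mentioned above refers to a standard assembly statistic: the length L such that sequences of length L or longer contain at least half of the total assembled bases. A minimal sketch, with a hypothetical function name and made-up lengths:

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths (0 for an empty list)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # First length at which the cumulative sum reaches half of the total.
        if running * 2 >= total:
            return length
    return 0


example_n50 = n50([2, 3, 4, 5, 6])  # total 20; 6 + 5 = 11 >= 10, so N50 is 5
```

Since N50 depends only on the length distribution, it says nothing about correctness. That is why the summary pairs it with a separate validation step for misassemblies.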
1. Next-generation sequencing methods such as Roche 454, Illumina GAII, and ABI SOLiD allow for high-throughput DNA sequencing through massively parallel sequencing.
2. These methods involve clonal amplification of DNA fragments on solid surfaces or in emulsion PCR followed by sequencing using pyrosequencing, sequencing by synthesis with reversible terminators, or sequencing by ligation approaches.
3. The resulting sequencing data requires high throughput management and analysis pipelines to process the large volumes of sequence data produced.
This document summarizes research characterizing DNA methylation in the Pacific oyster Crassostrea gigas. High-throughput bisulfite sequencing was used to analyze DNA methylation patterns at high resolution. Several genes were found to have different levels and patterns of methylation across tissues and developmental stages. The results provide evidence that DNA methylation plays an important regulatory role and may be involved in environmental responses in C. gigas. Future work will investigate how epigenetic mechanisms are affected by environmental stressors.
Ion Torrent semiconductor sequencing technology - CD Genomics
Ion Torrent is a recent-generation sequencing technology. Its core idea is to use semiconductor technology to establish a direct link between chemical and digital information.
BioRuby is a bioinformatics library for the Ruby programming language. It provides object-oriented tools for tasks like sequence analysis, format conversion, running bioinformatics tools, and working with biological data. The latest version added features like improved support for phylogenetic XML (PhyloXML), next-generation sequencing FASTQ format reading/writing, and a REST API wrapper for the NCBI database. BioRuby development follows agile principles and its large developer community contributes new code frequently on GitHub. The project aims to improve integration with R and data visualization while maintaining a stable core.
The document discusses DNA, RNA, gene expression, and various techniques related to studying nucleic acids including microarray, PCR, and real-time PCR. It defines the key components and structures of DNA and RNA, describes the central dogma of molecular biology regarding gene expression through replication, transcription and translation, and provides details on experimental procedures like microarray, PCR cycling, and real-time quantitative PCR using different fluorescent probes.
Next-generation sequencing format and visualization with ngs.plot - Li Shen
Lecture given at the Department of Neuroscience, Icahn School of Medicine at Mount Sinai. ngs.plot has been published in BMC Genomics. Link: http://www.biomedcentral.com/1471-2164/15/284
This document provides an introduction to next generation sequencing (NGS) technologies. It begins with an outline of topics to be covered, including the evolution of NGS technologies, their descriptions and comparisons, bioinformatics challenges of NGS data analysis, and some aspects of NGS data analysis workflows and tools. The document then delves into explanations of specific NGS platforms, their performance characteristics, and the sequencing processes. It discusses the large computational infrastructure and data management needs of NGS, as well as quality control, preprocessing of NGS data, and popular analysis tools and workflows.
A Summary Of Recent Advances In Dna Sequencing 2 24 10 Sc - SawahC
Recent advances in DNA sequencing technology include single molecule real-time sequencing from Pacific Biosciences and nanopore sequencing. Pacific Biosciences' SMRT sequencing detects fluorescent signals from a single DNA polymerase as it incorporates labeled nucleotides in real time, allowing sequencing of over 10,000 base pairs at a rate of 1-3 bases per second. Nanopore sequencing techniques thread DNA molecules through nanopores and detect individual bases electronically without the need for amplification or labeling. These new methods promise faster, cheaper, and longer read sequencing compared to traditional dye-terminator sequencing.
The document describes the steps of Illumina sequencing. Genomic DNA is first fragmented and adapters are ligated to create single-stranded DNA fragments. These fragments are attached to a flow cell and undergo bridge amplification to create clusters of identical DNA fragments. Sequencing occurs through cycles of reversible terminator-based sequencing using fluorescently labeled dNTPs, imaging of the fluorescence, and cleavage of the label and terminator to allow the next cycle. After multiple cycles, the sequenced reads are aligned to the reference genome to determine the original sequence.
New Generation Sequencing Technologies: an overview - Paolo Dametto
The document provides a history of DNA sequencing technologies. It begins with the discovery of DNA's structure in 1953 and the development of recombinant DNA technology in the 1970s. First-generation Sanger sequencing produced short reads at low throughput, which made sequencing the human genome a multi-year effort. Next generation sequencing (NGS) platforms since 2005 have dramatically reduced costs while increasing throughput. NGS methods like Roche/454 pyrosequencing, Illumina/Solexa sequencing by synthesis, SOLiD ligation sequencing, and single-molecule real-time sequencing by Pacific Biosciences now enable large-scale genome and transcriptome analysis.
Next Gen Sequencing (NGS) Technology Overview - Dominic Suciu
Next generation sequencing (NGS) provides several new technologies for DNA sequencing that have significantly increased throughput and reduced costs compared to previous methods. NGS technologies include Roche/454, Illumina, ABI SOLiD, Ion Torrent, and PacBio. These technologies have various applications including whole genome sequencing, detection of genetic mutations associated with diseases, RNA sequencing to study gene expression, and ChIP sequencing to identify DNA-binding sites. NGS is revolutionizing genomic research by allowing comprehensive study of genomes, transcriptomes, and gene regulation.
This document provides an overview of next generation sequencing (NGS) technologies. It discusses the history and evolution of DNA sequencing, from early manual methods developed by Sanger to modern high-throughput NGS approaches. Key NGS methods described include Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, 454 pyrosequencing, and SOLiD ligation sequencing. Compared to Sanger, NGS allows massively parallel sequencing of many samples at lower cost and higher throughput. While NGS has advanced biological research, each method still has advantages and limitations related to read length, accuracy, and cost.
Illumina TruSeq DNA PCR-Free_Biomek FXP Automated Workstation - Zachary Smith
The document describes automating the Illumina TruSeq DNA PCR-Free Sample Preparation Kit using the Beckman Coulter Biomek FXP automated liquid handler. The automation allows generating up to 96 individually barcoded DNA libraries in approximately 5 hours without PCR amplification. The method was tested by constructing libraries from human genomic DNA, which were then sequenced and shown to have the expected size distribution and quality.
This document provides an overview of DNA sequencing technologies. It begins with a brief history of DNA sequencing, including the discovery of DNA's structure and Sanger sequencing. The document then focuses on next generation sequencing technologies, describing several platforms such as 454 sequencing, Illumina sequencing, Ion Torrent sequencing, and Pacific Biosciences sequencing. It also discusses third generation sequencing and compares the sequencing approaches, workflows, and applications of various sequencing technologies. In conclusion, the document notes the progress and future directions of sequencing, including increased clinical applications and reduced costs.
The document describes the key steps in the NGS workflow including library construction, preparation of the substrate, sequencing, and data analysis. It provides examples of fragmenting genomic DNA, constructing libraries for Illumina and Ion Torrent sequencing, and quality control steps like size selection and quantification of libraries. Different applications of NGS are also summarized such as targeted sequencing using probe hybridization or PCR and epigenomics approaches involving ChIP-seq and bisulfite sequencing.
Global run-on sequencing (GRO-Seq) is a method to map the binding sites of transcriptionally active RNA polymerase II. It involves allowing RNA polymerase II to actively transcribe in the presence of labeled nucleotides, followed by purification and sequencing of the newly synthesized RNA. This provides sequences of RNAs that are currently being transcribed, without prior knowledge of transcription sites. While it directly determines relative transcriptional activity, GRO-Seq is limited to cell cultures and may introduce artifacts during nuclear preparation or transcription run-on.
GTC group 8 - Next Generation Sequencing (Yanqi Chan)
DNA sequencing is the process of determining the precise order of nucleotides within a DNA molecule. Discuss the application of next generation sequencing in cancer treatment.
The document discusses techniques for DNA sequencing, including early methods developed in the 1970s by Maxam and Gilbert as well as Sanger. It provides details on how both methods work, such as using specific chemical or enzymatic reactions to generate labeled DNA fragments of different lengths corresponding to nucleotide positions in the sequence. The document also describes how these methods were later automated, using fluorescent tags on dideoxynucleotides and capillary electrophoresis to simultaneously sequence multiple samples in a single gel. This allowed rapid determination of thousands of nucleotides and enabled large genome sequencing projects such as the Human Genome Project.
Journal club slides for "Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches" and a description of the software pipeline digit
DNA sequencing: rapid improvements and their implications (Jeffrey Funk)
These slides analyze the rapid improvements in DNA sequencers and the implications of these improvements for drug discovery, new crops, materials creation, and new bio-fuels. Many of the rapid improvements are from "reductions in scale." As with integrated circuits, reducing the size of features on DNA sequencers has enabled many orders of magnitude improvements in them. Unlike integrated circuits, the improvements are also due to changes in technology. For example, changes from pyrosequencing to semiconductor and nanopore sequencing have also been needed to achieve the reductions in scale. Second, pyrosequencing also benefited from improvements in lasers and camera chips.
Evolution of DNA Sequencing - talk by Jonathan Eisen for the Bodega Workshop ... (Jonathan Eisen)
This document contains slides for a talk on the evolution of DNA sequencing technologies. It reviews early manual sequencing methods developed by Sanger and others. It then summarizes the development of next-generation sequencing platforms including Roche 454 pyrosequencing, Illumina sequencing by synthesis, and others. The slides describe the key steps in library preparation, cluster generation, sequencing chemistry, and data analysis for various platforms. It provides a historical timeline of major advances that have enabled massive parallel sequencing of DNA.
Jonathan Eisen talk at #UCDavis 10/19/15 on "Microbiomes in Food and Agricult... (Jonathan Eisen)
Slides for a talk on "Microbiomes in Food and Agriculture" by Jonathan Eisen. Note: not all slides were used in the talk. These were there to stimulate discussion ...
2011 course on Molecular Diagnostic Automation - Part 3 - Detection (Patrick Merel)
2011 course on Molecular Diagnostic Automation - Part 3 - Detection.
This is from early 2011. Prices and Specifications of instruments may have changed a lot.
Part 3 of 3
The document compares older Sanger sequencing and newer next-generation sequencing technologies. It discusses that Sanger sequencing involved 4 parallel reactions with dye-terminator chemistry to read one base at a time. Next-generation sequencers like Roche 454, Illumina, and Applied Biosystems SOLiD can generate much more data in less time through massively parallel sequencing reactions, but provide shorter read lengths. These new technologies have greatly reduced the cost of sequencing and enabled whole genome sequencing.
Next-generation sequencing techniques such as Illumina and 454 pyrosequencing were discussed for applications including microbial genome sequencing and metagenomic profiling of microbial communities from targeted gene markers or shotgun sequencing. Key steps include library preparation, sequencing, and downstream bioinformatics analysis of sequencing data for tasks like genome assembly, gene annotation, and taxonomic classification of microbial taxa.
This document provides an overview and comparison of popular next-generation sequencing platforms. It discusses the common sequencing pipeline including library preparation, massively parallel sequencing, and bioinformatics analysis. Popular platforms like Roche 454, Illumina, and SOLiD are described in detail focusing on their specific sequencing chemistries and performance characteristics. Newer third-generation platforms such as Ion Torrent, PacBio, and Oxford Nanopore are also introduced. A wide range of NGS applications from whole genome sequencing to RNA-seq are outlined.
The document discusses RNA-seq analysis. It begins with an introduction to Mikael Huss, a bioinformatics scientist, and provides an overview of how genomics, RNA profiles, protein profiles, and interactomics relate within systems biology. The document then discusses how gene expression analysis can provide insights into basic research questions regarding tissue and cell identity, as well as insights into diseases by identifying genes that are over- or under-expressed in patients. Finally, it provides a brief overview of the typical workflow for RNA-seq analysis, which involves mapping RNA sequencing reads to a reference genome or transcriptome.
This document discusses high-throughput DNA sequencing technologies and their application to genome assembly projects. It provides a brief history of DNA sequencing, from early chemical and chain termination methods to current massively parallel sequencing technologies. It also describes several long-read sequencing technologies, including Pacific Biosciences SMRT sequencing and Oxford Nanopore sequencing. Examples are given of genome projects utilizing these technologies along with short-read sequencing data.
This document provides an overview and discussion of next-generation sequencing technologies by C. Titus Brown. It begins by outlining some basics of shotgun sequencing and how increasing density leads to squared increases in the number of sequences obtained. It then discusses current costs for Illumina sequencing and the amount of data needed for different applications. Some challenges and problems with sequencing data are also reviewed, such as systematic bias, genome assembly and scaffolding difficulties, reference gene models, and mRNA isoform construction. Emerging long-read sequencing technologies are also briefly discussed.
The document summarizes a study that used Illumina Hi-seq sequencing to analyze taxon diversity in bulk insect samples. The researchers tested two approaches: 1) PCR-based amplification of the COI barcode region followed by Illumina sequencing, and 2) direct shotgun sequencing of total mitochondrial DNA without PCR. Both approaches showed potential for high-throughput environmental barcoding, though methodological improvements are still needed to address issues like taxonomic and biomass biases. The study demonstrates that Illumina sequencing can perform comparably to other platforms for analyzing mixed insect samples and may help solve amplification biases through a PCR-free method.
New High Throughput Sequencing technologies at the Norwegian Sequencing Centr... (Lex Nederbragt)
A talk I gave at the Microbiology Research Group (University of Oslo) about new High Throughput Sequencing instruments at the Norwegian Sequencing Centre. I also mentioned future upgrades and the upcoming nanopore sequencing platform from Oxford Nanopore.
Next generation sequencing techniques were discussed including an overview of various sequencing platforms, their output, and common analysis workflows. Mapping short reads to reference genomes using alignment programs is a key first step for most applications. Formats like FASTQ, SAM, and BAM are commonly used to store sequencing reads and mapping results.
Miten Generating high-quality reference human genomes using Promethion nanopo... (GenomeInABottle)
This document discusses using nanopore sequencing to generate high-quality reference human genomes. It outlines how nanopore sequencing can generate very long reads, including reads exceeding 1 megabase in length, that can help assemble complex genomic regions that have been difficult to assemble with short read sequencing. The document describes ongoing work to sequence the human genomes of two individuals using the PromethION nanopore sequencing device, generating over 60-130 gigabases of sequence data per flow cell with read lengths over 20-30 kilobases on average. Long read nanopore sequencing will play an important role in fully resolving the human genome sequence and understanding DNA modifications.
This document summarizes SV calling from 10X Linked-Read data. It discusses how 10X generates Linked-Reads from high molecular weight DNA, and how these Linked-Reads make SV detection easier by allowing for phasing and longer range information. It then provides examples of detecting deletions and other SVs from a trio using 10X data, including improved resolution of breakpoints and ability to call SVs in repetitive or hard to map regions not possible with short read data. Future areas for development are also discussed.
This document discusses the evolution of metagenomics from culturing microorganisms to direct high-throughput sequencing using next-generation sequencing (NGS) technologies. It describes how early metagenomics relied on cloning environmental DNA into libraries for Sanger sequencing, but NGS allows direct sequencing without cloning. NGS produces large volumes of sequence data at low cost, enabling assembly of large DNA fragments and reliable annotation of genes and pathways. The future of metagenomics involves comprehensively cataloging human and environmental microbiomes using NGS and exploiting microbial diversity for biotechnology applications like enzymes, antibiotics, and probiotics.
How to cluster and sequence an ngs library (james hadfield160416) (James Hadfield)
A presentation for people interested in understanding how Illumina adapter ligation, clustering and SBS sequencing work. Follow core-genomics http://core-genomics.blogspot.co.uk/
As storage capacities increase dramatically over the next 5 years, the document predicts several consequences: 1) Disks will replace tapes as the preferred archive media due to lower costs per terabyte of storage. 2) RAID10 configurations, which use mirroring, will replace RAID5, which uses parity, because higher performance will be needed to access very large disks. 3) Disks themselves will be packaged in "disc packs" with multiple read/write arms to provide higher bandwidth and access rates for extremely large single disks.
Database Research on Modern Computing Architecture (Kyong-Ha Lee)
This document provides an overview of a talk on database research related to modern computing architecture given on September 10, 2010. The talk discusses the immense changes in computer hardware, including a variety of computing resources and increasing intra-node parallelism. It also covers how database technology can facilitate modern hardware features like parallelism. Specific topics covered include memory hierarchy changes, the memory wall problem, and latency issues compared to increasing bandwidth.
Next-generation genomics: an integrative approach (Hong ChangBum)
This document summarizes a presentation on next-generation genomics and integrative analysis. It discusses the types of genomic data available from techniques like genome sequencing, RNA sequencing, ChIP-seq, and epigenomics. It explains that integrative analysis can help annotate functional features, infer variant function, and understand gene regulation. Approaches to integration include data reduction, unsupervised clustering, and supervised Bayesian networks. Large-scale datasets can be accessed through browsers, add-ons, and standalone tools to generate novel hypotheses. Future work includes more integrated community resources with search capabilities.
Here are the steps to visualize a potential indel region after realignment:
1. Run GATK IndelRealigner on the target list:
java -jar $EBROOTGATK/GenomeAnalysisTK.jar -T IndelRealigner -R ../human_g1k_v37.fasta -I sample.dedup.bam -targetIntervals sample.intervals -o sample.realigned.bam
2. Index the realigned BAM:
samtools index sample.realigned.bam
3. Load the realigned BAM into IGV and navigate to a region of interest from the target list (sample.intervals).
4. In I
Long read sequencing - WEHI bioinformatics seminar - tue 16 june 2015 (Torsten Seemann)
Long read sequencing - the good, the bad, and the really cool. Covers Illumina SLR, Pacbio RSII and Oxford Nanopore as of June 2015. Discusses bioinformatics differences of long reads over short reads.
Making powerful science: an introduction to NGS and beyond (AdamCribbs1)
This slide deck is from the Botnar Research Centre introduction to NGS sequencing workshop 2021. It gives an overview of the theoretical concepts behind sequencing.
3. Basics
1 kilobase (1 kb) = 1,000 bases
1 megabase (1 Mb) = 1,000,000 bases (1 million)
1 gigabase (1 Gb) = 1,000 Mb (1 billion bases)
Viruses: 3,500 to 8 x 10^5 bases
Bacteria: more than 1 Mb (Escherichia coli = 4.7 Mb)
Eukaryotes: 10 to 3 x 10^5 Mb
yeast ≈ 12 Mb
Drosophila = 165 Mb
Homo sapiens: 3,400 Mb (about 3 Gb), 20,000-25,000 genes
Transcriptome = 2% of the genome
4. Before: enzymatic sequencing = Sanger sequencing
Single-stranded DNA + DNA polymerase
A small amount of a dideoxynucleotide (ddNTP) is added
4 reactions run in parallel for the 4 bases, each with a different dideoxynucleotide
Synthesis stops at each incorporation of a dideoxynucleotide
Statistically, there are as many terminated fragments as there are occurrences of the base
10. An idea of the cost of "CEquencing"
Sequencing cost: 3 + 1 + (0.4 + 4.5 + 0.4) x 2 = €14.6 per double-stranded (ds) sequence of 700 b
CEQ 8 capillaries: 33,000 b ds per 24 h (48 x 2 x 700 b)
Cost of sequencing 33,000 b ds: €688
Cost of sequencing 1 Mb ds: €20,848
Bioinformatics, confirmation: 5 min/1,000 b, 7 hrs/33,000 b
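The cost arithmetic on this slide can be reproduced with a short sketch. The figures are taken from the slide; the variable and function names are illustrative assumptions, and the meaning of the individual cost components (3, 1, 0.4, 4.5, 0.4) is not stated in the source.

```python
# Figures from the slide; names are illustrative, not from the source.
COST_PER_READ_EUR = 3 + 1 + (0.4 + 4.5 + 0.4) * 2  # one 700 b double-stranded sequence
READ_LENGTH_B = 700                                 # bases per double-stranded sequence
DAILY_OUTPUT_B = 33_000                             # ds bases per 24 h on 8 capillaries


def sanger_cost(bases: float) -> float:
    """Cost in euros of sequencing `bases` double-stranded bases."""
    return bases / READ_LENGTH_B * COST_PER_READ_EUR


cost_per_day = sanger_cost(DAILY_OUTPUT_B)  # cost of one day's output
cost_per_mb = sanger_cost(1_000_000)        # cost of 1 Mb, unrounded

print(f"per read: {COST_PER_READ_EUR:.1f} EUR")
print(f"33,000 b: {cost_per_day:.0f} EUR")
print(f"1 Mb:     {cost_per_mb:.0f} EUR")
```

Computed without intermediate rounding, 1 Mb comes to roughly €20,857. The slide's €20,848 appears to come from scaling the rounded €688 by 1,000,000/33,000.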
11. Next Generation Sequencers
NextGen Sequencers - NextGen Sequencing (NGS)
Whole Genome Sequencer - Whole Genome Sequencing (WGS)
Roche GS-FLXti: 0.4 Gb/run; 1m reads @ 400b; €5990/run; €14.97/Mb; €500k/inst.
Illumina GA2: 5-10 Gb/run; 60m reads @ 50b; $8250 (€6180)/run (5Gb); $0.33 (€0.25)/Mb; $460k (€344k)/inst.
AB Solid 3.0: 10-20 Gb/run; 100m reads @ 50b; €5300/run (5+5Gb); €0.53/Mb; €462k/inst.
The competition: Helicos Biosciences, Pacific Biosciences, George Church Lab., nanopore sequencing, ZS-Genetics, sequencing by TEM...
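As a rough illustration of how the per-megabase prices relate to run cost and yield, here is a minimal sketch applied to the Roche and SOLiD figures quoted above. The function names are assumptions made for this example.

```python
def run_yield_mb(reads: int, read_length_b: int) -> float:
    """Raw yield of one run in megabases (reads x read length)."""
    return reads * read_length_b / 1_000_000


def cost_per_mb(run_cost_eur: float, yield_mb: float) -> float:
    """Reagent cost per megabase for one run."""
    return run_cost_eur / yield_mb


# Roche GS-FLXti: 1m reads @ 400 b, EUR 5990 per run
roche_yield = run_yield_mb(1_000_000, 400)       # 400 Mb = 0.4 Gb
roche_price = cost_per_mb(5990, roche_yield)

# AB SOLiD 3.0: EUR 5300 per run for 5+5 Gb
solid_price = cost_per_mb(5300, 10_000)

print(f"Roche: {roche_yield:.0f} Mb/run, {roche_price:.3f} EUR/Mb")
print(f"SOLiD: {solid_price:.2f} EUR/Mb")
```

Set against the Sanger estimate of about €20,848 per Mb given earlier in the deck, both platforms are cheaper per base by three to four orders of magnitude.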
12. Other Players
George Church Lab. + Danaher Motion: Polonator G.007. The Polonator G.007 is the first "open source" gene sequencing instrument to hit the lab market in which the instrument's software (Web ware) and specifications are freely available to the public. At $150,000, the Polonator is the cheapest instrument on the market.
Helicos BioSciences Corp.: HeliScope SMS. The HeliScope™ Single Molecule Sequencer is the first genetic analyzer to harness the power of direct DNA measurement, enabled by Helicos True Single Molecule Sequencing (tSMS)™ technology.
ZS-Genetics: Electron Microscopy Sequencing. By the first half of 2009, the system is expected to read a complete haploid human genome in approximately 8 days, with 4X coverage, at a cost in the tens of thousands of dollars.
Pacific BioSciences: published technology for Single Molecule Real-Time (SMRT) sequencing. Instrument by 2010.
Moebius Biosystems: Nexus. Over 6 gigabases in 24 hrs.
Nanopore sequencing: Oxford Nanopore, Sequenom, etc.
14. GS-FLXti Data. Process Steps: 1. DNA library preparation
DNA Library Preparation and Titration: 4.5 h and 10.5 h; emPCR: 8 h; Sequencing: 10 h
• Genome fragmented by nebulization
• No cloning; no colony picking
• sstDNA library created with adapters
• A/B fragments selected using avidin-biotin purification
gDNA to sstDNA library
15. GS-FLXti Data. Process Steps: 2. emulsion PCR
DNA Library Preparation and Titration: 4.5 h and 10.5 h; emPCR: 8 h; Sequencing: 10 h
• Anneal sstDNA to an excess of DNA capture beads
• Emulsify beads and PCR reagents in water-in-oil microreactors
• Clonal amplification occurs inside microreactors
• Break microreactors, enrich for DNA-positive beads
sstDNA library to clonally-amplified sstDNA attached to bead
16. Process Steps: 3. Sequencing with the PicoTiterPlate device
• Multiple optical fibers are fused to form an optical array.
• Proprietary etching method produces wells that serve as picoliter reaction vessels.
• Each well is only able to accept a single DNA bead.
• Reactions in the wells are measured by the CCD camera.
• Titanium plate: 3.4m wells
Load genome into PicoTiterPlate device
Load PicoTiterPlate device on instrument
Load sequencing reagents
Close and press GO! to sequence the genome
22. GS-FLXti Data. Process Steps: 3. Sequencing
DNA Library Preparation and Titration: 4.5 h and 10.5 h; emPCR: 8 h; Sequencing: 10 h
• 3.4 m wells
• 3.4 m reads obtained in parallel
• A single clonally amplified sstDNA bead is deposited per well.
• DNA capture bead containing millions of copies of a single clonal fragment
• 4 bases (TACG) cycled 200 times
• Chemiluminescent signal generation
• Signal processing to determine base sequence and quality score
Amplified sstDNA library beads to quality filtered bases
23. Process Steps: 4. Signal-processing
• Raw data is processed from a series of individual images.
• Each well’s data is extracted, quantified, and normalized.
• Read data is converted into flowgrams.
Metric and image viewing software
Figure: signal output from a single well (flowgram), with nucleotide labels T, C, G, A.
24. Process Steps: 4. Signal-processing
Key sequence = TCAG for identifying wells and calibration
Flow of individual bases (TCAG) is 42 times.
Figure: flowgram for the read TTCTGCGAA (x-axis: base flow; y-axis: signal strength).
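To make the flowgram idea concrete, here is a minimal sketch of turning per-flow signal strengths into bases. It assumes the TACG flow order quoted on the sequencing slide, uses made-up signal values, and simply rounds each signal to the nearest integer to get the homopolymer length. The instrument's real signal processing (normalization, key-sequence calibration, quality scoring) is not described in the source and is not modeled here.

```python
from itertools import cycle


def decode_flowgram(signals, flow_order="TACG"):
    """Convert per-flow signal strengths into a base sequence.

    Each flow presents one nucleotide; a signal near n means that
    n copies of that nucleotide were incorporated (a homopolymer).
    """
    bases = []
    for nucleotide, signal in zip(cycle(flow_order), signals):
        count = max(0, round(signal))  # nearest whole number of incorporations
        bases.append(nucleotide * count)
    return "".join(bases)


# Hypothetical noisy signals, one value per flow: T, A, C, G, T, A, C, G, ...
signals = [2.1, 0.1, 0.9, 0.0, 1.1, 0.1, 0.0, 0.9, 0.1, 0.0, 1.0, 1.1, 0.0, 1.9]
read = decode_flowgram(signals)
print(read)
```

Because homopolymer length is inferred from signal height, runs of identical bases are the main source of uncertainty in this kind of base calling.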
25. GS-FLXti Data. Process Steps: 5. Data output
• Quality filtered bases
400-500 bp average read length
> 0.4 Gb or 1m reads with a 70 x 75 mm FLXti PicoTiterPlate device
10 hours run time
• Phred-like quality score for use in available assemblers or viewers
• Consensus base-called contig files: FASTA file of assembled reads
mapping against known scaffold (resequencing)
de novo assembly of individual bases in consensus contigs
• Viewer-ready genome file: assembly file in .ace format
• Assembly metric files
• Run-time metrics files: summarize important information pertaining to sequencing quality for each run
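For readers unfamiliar with the "Phred-like" scores mentioned above, the standard Phred definition relates a quality score Q to an error probability P by Q = -10 log10(P). The sketch below shows that conversion only. How closely the GS-FLX scores follow this definition is not stated in the source, and the function names are illustrative.

```python
import math


def phred_to_error(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_to_phred(p: float) -> float:
    """Phred score corresponding to error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


for q in (10, 20, 30, 40):
    print(f"Q{q}: 1 error in {round(1 / phred_to_error(q))} bases")
```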
27. GS-FLXti Data. Technology Comparison: Sanger vs. 454 technology for a 2-million-base genome
Sanger Technology (total: weeks)
Preparation* (7 days):
- DNA Library Preparation
- Cloning
- Template Preparation
Total Sequencing Time (weeks):
- 180 runs (1 per 4 hours)
- 2-million-base (Mb) genome
- 6x coverage
454 Technology (total: 4 days)
Preparation (2.5 days):
- DNA Library Preparation
- Titration of Library Beads
- emPCR
Total Sequencing Time (1 day):
- 1 run (10 hours)
- 400-600 million bases (Mb)
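The comparison can be checked with a few lines of arithmetic. This sketch uses only the slide's figures (2 Mb genome, 6x coverage, 180 Sanger runs at one per 4 hours, one 454 run of at least 400 Mb). The variable names are illustrative, and the per-run Sanger yield is derived here rather than quoted from the source.

```python
GENOME_SIZE_B = 2_000_000      # 2-million-base genome
TARGET_COVERAGE = 6            # 6x coverage for the Sanger project

# Total bases that must be read to reach the target coverage
bases_needed = GENOME_SIZE_B * TARGET_COVERAGE

# Sanger: 180 runs, one run every 4 hours
sanger_runs = 180
sanger_days = sanger_runs * 4 / 24
bases_per_sanger_run = bases_needed / sanger_runs

# 454: a single 10-hour run yielding at least 400 Mb
coverage_454 = 400_000_000 / GENOME_SIZE_B

print(f"bases needed: {bases_needed:,}")
print(f"Sanger: {sanger_days:.0f} days of sequencing, "
      f"~{bases_per_sanger_run:,.0f} bases per run")
print(f"454: {coverage_454:.0f}x coverage from one run")
```

The derived Sanger yield of roughly 67,000 bases per run is consistent with a 96-capillary instrument producing reads of about 700 bases, though the slide does not name the instrument.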
28. NextGen Sequencers
Roche GS-FLX: Workflow
Workflow 3-4 days (setup) + 1 day (run)
1. Generation of a single-stranded template DNA library (~8-16 hours)
2. Emulsion-based clonal amplification of the library (~8 hours)
3. Data generation via sequencing-by-synthesis (9 hours)
4. Image and Base calling analysis (~8 hours)
5. Data analysis using different bioinformatics tools
IT steps:
GS-FLX Software
▪ GS Reference Mapper
▪ GS De Novo Assembler
▪ GS Amplicon Variant Analyzer
Third Party Software
• Long Single Reads / Standard Shotgun (required input = 3–5 μg, 5 μg recommended)
~1,000,000 single reads with an average read length of 400 bases
• Paired End Reads (required input = 5 μg @ 25 ng/μl or above, in TE; >10 kb)
◦ 3K Long-Tag Paired End Reads. Sequence 100 bases from each end of a 3,000 base span
on a single sequence read (Figure). Co-assemble GS FLX Titanium shotgun reads with 3K
Long-Tag Paired Ends reads from Standard series runs.
• Sequence Capture (required input = 3–5 μg)
◦ Roche NimbleGen Sequence Capture using a single microarray hybridization-based
enrichment process.
• Amplicon Sequencing (1-5 ng or 10-50 ng)
◦ The DNA-sample preparation for Amplicon Sequencing with the GS FLX System consists of a
simple PCR amplification reaction with special Fusion Primers. The Fusion Primer consists of a
20-25 bp target-specific sequence (3' end) and a 19 bp fixed sequence (Primer A or Primer B
on the 5' end).
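The Fusion Primer layout described above (19 bp fixed sequence at the 5' end, 20-25 bp target-specific sequence at the 3' end) can be sketched as a simple concatenation with length checks. The sequences below are made-up placeholders, not the actual Primer A or Primer B sequences, and the function name is hypothetical.

```python
FIXED_LENGTH = 19            # fixed Primer A / Primer B part, per the slide
TARGET_LENGTH_RANGE = (20, 25)  # target-specific part, per the slide


def build_fusion_primer(fixed_sequence, target_sequence):
    """Return the fusion primer written 5' -> 3': fixed part, then target part."""
    if len(fixed_sequence) != FIXED_LENGTH:
        raise ValueError("fixed sequence must be 19 bp")
    low, high = TARGET_LENGTH_RANGE
    if not low <= len(target_sequence) <= high:
        raise ValueError("target-specific sequence must be 20-25 bp")
    return fixed_sequence.upper() + target_sequence.upper()


# Placeholder sequences for illustration only (assumptions, not real primers).
PLACEHOLDER_PRIMER_A = "ACGTACGTACGTACGTACG"   # 19 bp
PLACEHOLDER_TARGET = "GATTACAGATTACAGATTACAG"  # 22 bp

primer = build_fusion_primer(PLACEHOLDER_PRIMER_A, PLACEHOLDER_TARGET)
print(primer, len(primer))
```

With these lengths a fusion primer is 39 to 44 bases long, and only its 3' portion anneals to the target in the first PCR cycles.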
30. NextGen Sequencers
Roche GS-FLX: add-ons
not included
- Nebulizers + nitrogen tank
Nebulization is required to shear fragments for DNA >70-800bp
- emPCR Breaking Kit
This device is required for the preparation of consistently sized reactors
for emulsion PCR.
- Magnetic Concentrator IVGN +€5000
- MT plate centrifuge BCI +€15.000
- Multisizer™ 3 COULTER counter +€15.000
The most versatile and accurate particle sizing and counting analyzer
available today. Using The Coulter Principle, also known as ESZ (Electrical
Sensing Zone Method), the Multisizer 3 COULTER COUNTER provides
number, volume, mass and surface area size distributions in one
measurement, with an overall sizing range of 0.4 µm to 1,200 µm.
- Agilent BioAnalyzer +€20.000
- Titanium cluster station +€29.000
31. Next Generation Sequencers
The Roche System
Roche FLXti:
0.5 Gb/run
1m reads @ 400b
€5990/run
€14.97/Mb
€585k/inst. tot
Roche FLXti:
Setup time: 3-4 d
0.5 Gb/run
Run time: 10 hrs
images: 27 GB
Primary Analysis: 15 GB
PA CPU time: 80-220 hrs (6-7 hrs with cluster st)
Final file size: 4 GB
notes:
400-500b frag. length sequencing
future dev. up to 1000b
x coverage with long frag. vs x+n coverage with short reads vs cost/Mb
10 systems in France
Multiplexing capacity
≈200 publications
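The per-megabase price on this slide appears to follow from the read count, read length and run cost quoted with it. The sketch below reproduces that arithmetic under the assumption that the yield is taken as 1 million reads of 400 bases (400 Mb); the function names are illustrative.

```python
def run_yield_mb(reads, read_length_bases):
    """Run yield in megabases from read count and average read length."""
    return reads * read_length_bases / 1_000_000


def cost_per_mb(run_cost, yield_mb):
    """Reagent cost per megabase for one run."""
    return run_cost / yield_mb


roche_yield_mb = run_yield_mb(reads=1_000_000, read_length_bases=400)
roche_cost_per_mb = cost_per_mb(run_cost=5990, yield_mb=roche_yield_mb)

print(f"{roche_yield_mb:.0f} Mb per run, {roche_cost_per_mb:.3f} EUR per Mb")
```

The same two functions apply to any platform once the yield per run and the run cost are known, which makes the per-Mb figures comparable only when the yield assumption is stated.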
33. Illumina's Solexa Sequencing Technology
Step 1: Sample Preparation
The DNA sample of interest is sheared to appropriate size (average
~800bp) using a compressed air device known as a nebulizer. The
ends of the DNA are polished, and two unique adapters are ligated
to the fragments. Ligated fragments in the size range of 150-200bp
are isolated via gel extraction and amplified using limited cycles of
PCR. 1.5 days.
Steps 2-6: Cluster Generation by Bridge Amplification
In contrast to the 454 and ABI methods, which use a bead-based
emulsion PCR to generate "polonies", Illumina utilizes a unique
"bridged" amplification reaction that occurs on the surface of the
flow cell.
The flow cell surface is coated with single-stranded oligonucleotides
that correspond to the sequences of the adapters ligated during the
sample preparation stage. Single-stranded, adapter-ligated
fragments are bound to the surface of the flow cell and exposed to
reagents for polymerase-based extension. Priming occurs as the
free/distal end of a ligated fragment "bridges" to a complementary
oligo on the surface.
Repeated denaturation and extension results in localized
amplification of single molecules in millions of unique locations
across the flow cell surface. This process occurs in what is referred
to as Illumina's "cluster station", an automated flow cell processor.
8 hrs.
39. Illumina's Solexa Sequencing Technology
Steps 7-12: Sequencing by Synthesis
A flow cell containing millions of unique clusters is now loaded into
the 1G sequencer for automated cycles of extension and imaging.
The first cycle of sequencing consists of the incorporation of a
single fluorescent nucleotide, followed by high resolution imaging of
the entire flow cell. These images represent the data collected for
the first base. Any signal above background identifies the physical
location of a cluster (or polony), and the fluorescent emission
identifies which of the four bases was incorporated at that position.
This cycle is repeated, one base at a time, generating a series of
images each representing a single base extension at a specific
cluster. Base calls are derived with an algorithm that identifies the
emission color over time. At this time, reports of useful Illumina
reads range from 26-50 bases.
The use of physical location to identify unique reads is a critical
concept for all next generation sequencing systems. The density of
the reads and the ability to image them without interfering noise is
vital to the throughput of a given instrument. Each platform has its
own unique issues that determine this number: 454 is limited by the
number of wells in its PicoTiterPlate, Illumina is limited by the
fragment length that can effectively "bridge", and all providers are
limited by flow cell real estate. 2-6 days (18-36 cycles).
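The idea of deriving base calls from the emission color at one cluster over successive cycles can be sketched in a heavily simplified form: for each cycle, take the channel with the strongest intensity. This is an illustration only; the channel order and intensity values are assumptions, and a real pipeline also corrects for effects such as channel cross-talk, which are ignored here.

```python
CHANNEL_BASES = "ACGT"  # assumed order of the four intensity channels


def call_bases(intensities_per_cycle):
    """Call one base per cycle for a single cluster.

    intensities_per_cycle: sequence of 4-tuples, one tuple per cycle,
    holding the intensity measured in each of the four channels.
    """
    calls = []
    for cycle_intensities in intensities_per_cycle:
        strongest = max(range(4), key=lambda i: cycle_intensities[i])
        calls.append(CHANNEL_BASES[strongest])
    return "".join(calls)


# Hypothetical intensities for one cluster over three cycles.
cluster = [(900, 50, 30, 20),
           (10, 40, 800, 60),
           (5, 700, 20, 10)]

print(call_bases(cluster))  # prints AGC
```

Because every cluster is identified by its fixed position in the images, the same function would simply be applied once per cluster location.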
45. Pipeline software highlights
Automated image calibration: maximizes the number of clusters used to generate sequence data
Accurate cluster intensity scoring algorithms: allow efficient filtering for high-quality reads
Quality-calibrated base calls: minimize the propagation of downstream sequencing errors
Highly optimized genomic alignment tools: minimize the need for elaborate computer
infrastructures
Open source code: enables researchers to customize the software to meet their needs
46. Sanger: Weeks
Illumina: <7 days
Technology Comparison
Sanger vs. Solexa technology
for a 2-Gigabase genome
48. NextGen Sequencers
Illumina GA2: Workflow
Workflow 2-3 days (setup) + 2-3 days (run)
1. Non amplified DNA/RNA Sample
2. QC and possibly purify
3. Process with appropriate Sample Prep Kit
4. QC sample prep
5. Assemble 7 samples with the same number of cycles, library types, and sample types
6. Process grouped samples with appropriate Cluster Generation Kit
7. Run cluster generation
8. Transfer flow cell onto Genome Analyzer
9. Run sequencing 1st cycle
10. QC 1st cycle
11. Run remaining cycles
12. Export data
13. Run analysis
Tracking
▪ Samples ready for sample prep
▪ Samples ready for cluster prep
▪ Flow cells ready for sequencing
DAS2 server
▪ Serve analysis files to DAS2 enabled genome browsers for direct visualization of results
without file download
▪ Private server up and going using Authentication
Mapping application (to handle 5-100 million 15-50bp sequences)
▪ Filter sequences by quality score
▪ Count and remove identical sequences
▪ Map sequences to reference genome
Filter application
▪ Take binary map files and filter based on type of alignment and # of counts
▪ Export filtered universal binary for downstream applications
Distributed Annotation System (DAS) defines a communication protocol used to exchange biological annotations
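The first two steps of the mapping application (filter sequences by quality score, then count and remove identical sequences) could look roughly like the sketch below. The read representation, the mean-quality criterion and the threshold of 20 are assumptions made for illustration; mapping to the reference genome is left to a dedicated aligner and is not shown.

```python
from collections import Counter


def filter_by_quality(reads, min_mean_quality=20):
    """Keep sequences whose mean per-base quality reaches the threshold.

    reads: iterable of (sequence, list_of_quality_scores) pairs.
    """
    kept = []
    for sequence, qualities in reads:
        if qualities and sum(qualities) / len(qualities) >= min_mean_quality:
            kept.append(sequence)
    return kept


def collapse_identical(sequences):
    """Count identical sequences so each distinct one is mapped only once."""
    return Counter(sequences)


# Hypothetical short reads with per-base quality scores.
reads = [("ACGTACGT", [30, 32, 31, 29, 30, 28, 27, 30]),
         ("ACGTACGT", [25, 26, 24, 25, 27, 26, 25, 24]),
         ("TTTTGGGG", [5, 8, 6, 7, 9, 5, 6, 7]),
         ("GGCCAATT", [35, 34, 33, 35, 36, 34, 33, 32])]

unique_counts = collapse_identical(filter_by_quality(reads))
for sequence, count in unique_counts.items():
    print(sequence, count)
```

Collapsing duplicates before alignment keeps the counts for later filtering while sparing the aligner redundant work, which matters at 5-100 million reads per run.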
49. NextGen Sequencers
Illumina GA2: add-ons
not included
- Cluster Station +$50.000
The Cluster Station is a standalone, software-
controlled system for the automated generation
of clonal clusters from single molecule fragments
on Illumina Genome Analyzer flow cells.
- Paired-End Module +$45.000
The Paired-End Module provides fully automated
template preparation for the second round of
sequencing in a paired-end sequencing run.
- IPAR +$60.000
IPAR is a bundled hardware and software solution
that provides real-time quality control and
integrated online processing of primary data
during sequencing runs
- Agilent BioAnalyzer +€20.000
Total: €126.000
50. Next Generation Sequencers
The Illumina System
Illumina GA2:
5-10 Gb/run (50b)
$8250 (€6180)/run (5Gb)
$0,33/Mb
€480/inst. tot
Illumina GA2:
Setup time: 2-3 d
6-11 Gb/run
Run time: 3-6 d
images: 900 GB
Primary Analysis: 350 GB
PA CPU time: 100 hrs
Final File Size: 75 GB
notes:
7/15 Gb by end of 2009
72 frag. length
9 systems in France
325 publications
Multiplexing capacity
52. SOLiD v2 instrument components
The SOLiD™ Instrument consists of
the following components:
• Reagent delivery system
• Electronics
• Camera (4 megapixel)
• Monitor stand
• Independently controlled dual flow
cells
• Liquid waste container
SOLiD v2 computer system
instrument controller
• Hardware: Intel® Xeon® processors
• Operating system: Microsoft®
Windows® XP Pro
• Installed RAM: 4 GB
• Hard disk storage: dual 80 GB
SATA hard drives (RAID-1)
head node
• Hardware: Intel® Xeon® Dual Core
processors (2)
• Operating system: 64-bit LINUX
• Installed RAM: 8 GB
• Hard disk storage: dual 750 GB
SATA hard drives (RAID-1)
compute nodes (each)
• Hardware: Intel® Xeon® Dual Core
processors (2)
• Operating System: 64-bit LINUX
• Installed RAM: 8 GB
• Hard disk storage: 80 GB SATA hard
drives
storage
• Hard disk storage:
15x 750 GB SATA hard drives
• Operating system: 64-bit LINUX
• RAID-5 w/ hot spare
SOLID in details
53. Figure 1. Library generation schematic.
Sequencing on the SOLiD machine starts with library preparation. In the simplest
fragment library, two different adapters are ligated to sheared genomic DNA (left
panel of Fig. 1). If more rigorous structural analysis is desired, a “mate-pair”
library can be generated in a similar fashion, by incorporating a circularization/
cleavage step prior to adapter ligation (right panel of Fig.1).
ABI's SOLID Sequencing
Technology
54. Figure 2. Clonal bead library generation via emulsion PCR.
Once the adapters are ligated to the library, emulsion PCR is conducted using the
common primers to generate “bead clones” which each contain a single nucleic
acid species.
ABI's SOLID Sequencing
Technology
55. Figure 3. Depositing beads into flow cell via end modifications.
Each bead is then attached to the surface of a flow cell via 3’ modifications to the
DNA strands.
At this point, we have a flow cell (basically a microscope slide that can be serially
exposed to any liquids desired) whose surface is coated with thousands of beads
each containing a single genomic DNA species, with unique adapters on either
end.
Each microbead can be considered a separate sequencing reaction which is
monitored simultaneously via sequential digital imaging. Up to this point all
next-gen sequencing technologies are very similar; this is where ABI/SOLiD
diverges dramatically (see next).
ABI's SOLID Sequencing
Technology
56. Figure 4. Schematic of ABI SOLiD sequencing chemistry.
The actual base detection is no longer done by the polymerase-driven incorporation of
labeled dideoxy terminators. Instead, SOLiD uses a mixture of labeled oligonucleotides
and queries the input strand with ligase. Understanding the labeled oligo mixture is
key to understanding SOLiD technology.
Each oligo has degenerate positions at 3’ bases 1-3 (N’s), and one of 16 specific
dinucleotides at positions 4-5. Positions 6 through the 5’ are also degenerate, and
hold one of four fluorescent dyes. The sequencing involves:
1. Hybridization and ligation of a specific oligo whose 4th & 5th bases match that of the
template
2. Detection of the specific fluor
3. Cleavage of all bases to the 5’ of base 5
4. Repeat, this time querying the 9th & 10th bases
5. After 5-7 cycles of this, perform a “reset”, in which the initial primer and all ligated
portions are melted from the template and discarded.
6. Next a new initial primer is used that is N-1 in length. Repeating the initial cycling
(steps 1-4) now generates an overlapping data set (bases 3/4, 8/9, etc, see Fig 5).
ABI's SOLID Sequencing
Technology
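To see how the primer resets produce an overlapping data set, the positions interrogated in each round can be enumerated. The sketch assumes positions are numbered from the first template base after the full-length primer, 7 ligation cycles per round, and 5 primers each one base shorter than the previous one; these parameter choices and the function names are illustrative.

```python
from collections import Counter


def interrogated_positions(primer_offset, ligation_cycles):
    """Template positions read by the dinucleotide probes in one primer round.

    primer_offset 0 is the full-length primer (positions 4-5, 9-10, ...);
    offset 1 is the N-1 primer (positions 3-4, 8-9, ...), and so on.
    """
    positions = []
    for cycle in range(ligation_cycles):
        first = 4 - primer_offset + 5 * cycle
        positions.extend([first, first + 1])
    return positions


def query_counts(primer_rounds=5, ligation_cycles=7):
    """Count how many times each template position is interrogated overall."""
    counts = Counter()
    for offset in range(primer_rounds):
        counts.update(interrogated_positions(offset, ligation_cycles))
    return counts


counts = query_counts()
print(interrogated_positions(0, 2))  # prints [4, 5, 9, 10]
print(interrogated_positions(1, 2))  # prints [3, 4, 8, 9]
print(all(counts[p] == 2 for p in range(1, 35)))  # prints True
```

Under these assumptions positions 1 through 34 are each interrogated exactly twice, by two different primer rounds. Position 0 in this numbering is the last base of the adapter, which is known in advance.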
58. Figure 5. Sequencing coverage during SOLiD sequencing cycles
Thus, 5-7 ligation reactions followed by 4-5 primer reset cycles are repeated,
generating sequence data for ~35 contiguous bases, in which each base has
been queried by two different oligonucleotides.
If you’re doing the math you’ve realized there are 16 possible dinucleotides
(4^2) and only 4 dyes. So data from a single color does not tell you what base is
at a given position. This is where the brilliance (and potential confusion) comes
about with regard to SOLiD. There are 4 oligos for every dye, meaning there are
four dinucleotides that are encoded by each dye.
For example (see Fig. 4), the dinucleotides CA, AC, TG, and GT are all encoded by the
green dye.
Because each base is queried twice it is possible, using the two colors, to determine
which bases were at which positions.
This two color query system (known as “color space” in ABI-speak) has some
interesting consequences with regard to the identification of errors.
ABI's SOLID Sequencing
Technology
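One way to build a code with these properties (16 dinucleotides, 4 colors, 4 dinucleotides per color) is to number the bases A=0, C=1, G=2, T=3 and take the XOR of adjacent bases as the color. The sketch below uses that scheme; the numeric color labels are an assumption for illustration, but the grouping it produces puts CA, AC, TG and GT in the same color, as in the green-dye example above.

```python
BASES = "ACGT"  # base -> 2-bit code via its index in this string


def encode_color_space(sequence):
    """Encode a base sequence as one color (0-3) per adjacent base pair."""
    codes = [BASES.index(base) for base in sequence]
    return [a ^ b for a, b in zip(codes, codes[1:])]


def decode_color_space(first_base, colors):
    """Rebuild the base sequence from a known first base and the colors."""
    current = BASES.index(first_base)
    decoded = [first_base]
    for color in colors:
        current ^= color
        decoded.append(BASES[current])
    return "".join(decoded)


colors = encode_color_space("ATGGC")
print(colors)                           # prints [3, 1, 0, 3]
print(decode_color_space("A", colors))  # prints ATGGC

# A single wrong color changes every base after it when decoding.
corrupted = [3, 2, 0, 3]
print(decode_color_space("A", corrupted))  # prints ATCCG
```

Decoding needs one known base (in practice the last base of the adapter) to anchor the chain. The final lines illustrate the error behaviour: a single miscalled color shifts all downstream bases, whereas a true single-base variant changes two adjacent colors, which is what allows the two cases to be told apart.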
60. NextGen
AB Solid 3.0:
Sequencers
Workflow
Workflow: 3-4 days (setup) + 4-10 days (run)
64. NextGen Sequencers
AB Solid 3.0: add-ons
included
- Covaris S2 System
The Covaris™ S2 System is a required sample preparation instrument for use
in the SOLiD™ System workflow. The instrument is an essential part of the
emulsion PCR process used to prepare the beads for emulsion PCR. The
Covaris System is also used to shear DNA into 60 bp fragments for fragment
library preparation.
- ULTRA-TURRAX Tube Drive from IKA
This device is required for the preparation of consistently sized
reactors for emulsion PCR.
- Hydroshear from Genomic Solutions
The Hydroshear® from Genomic Solutions® is a reproducible and
controllable method for generating random DNA fragments of specific
sizes. Use this to prepare mate-paired libraries for the SOLiD™ System.
not included
- Agilent BioAnalyzer +€20.000
65. Next Generation Sequencers
The SOLID System
AB Solid 3.0:
10-20 Gb/run
100m reads @ 50b
€5300/run 5+5Gb
€0,53/Mb
€482k/inst. tot
AB Solid 3.0:
Setup time: 3-5 d
5-12.5 Gb/run/slide
Run time: 3.5-10 d
images: 2.5 TB
Primary Analysis: 750 GB
PA CPU time: in run time
Final file size: 140 GB
notes:
The Scientist Top Innovation of 2008
125-400m reads in 2009
30/40Gb
potential for 12x human genome @ $10.000
3 systems in France
Multiplexing capacity
66. Roche GS-FLXti:
0.5 Gb/run
1m reads @ 400b
€5990/run
€14.97/Mb
€585k/inst. tot
Setup time: 3-4 d
0.4 Gb/run
Run time: 10 hrs
images: 27 GB
Primary Analysis: 15 GB
PA CPU time: 220 hrs
Final file size: 4 GB
Illumina GA2:
5-10 Gb/run (50b)
€6180/run (5 Gb)
€0,25/Mb
€480k/inst. tot
Setup time: 2-3 d
6-11 Gb/run
Run time: 3-6 d
images: 900 GB
Primary Analysis: 350 GB
PA CPU time: 100 hrs
Final file size: 75 GB
AB Solid 3.0:
10-20 Gb/run
100m reads @ 50b
€5300/run (5+5 Gb)
€0,53/Mb
€482k/inst. tot
Setup time: 3-5 d
5-12.5 Gb/run/slide
Run time: 3.5-10 d
images: 2.5 TB
Primary Analysis: 750 GB
PA CPU time: in run time
Final file size: 140 GB
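The per-megabase prices for Roche and SOLiD appear to be the run cost divided by the run yield. The sketch below reproduces that arithmetic under stated assumptions: 1 Gb is taken as 1000 Mb, the Roche yield is the 0.4 Gb/run figure, and the SOLiD yield is the 5+5 Gb quoted next to its run price. The function name is hypothetical. The Illumina figure is left out because it does not follow from the listed run cost and yield by the same division.

```python
# Sketch: reagent cost per megabase from run cost and yield (2009 slide figures).
def cost_per_mb(run_cost_eur, yield_mb):
    """Run cost in euros divided by yield in megabases."""
    return run_cost_eur / yield_mb


# Yields in Mb, assuming 1 Gb = 1000 Mb.
platforms = {
    "Roche GS-FLXti": (5990, 400),    # 0.4 Gb/run
    "AB Solid 3.0": (5300, 10_000),   # 5+5 Gb/run
}

if __name__ == "__main__":
    for name, (run_cost, yield_mb) in platforms.items():
        print(f"{name}: {cost_per_mb(run_cost, yield_mb):.3f} EUR/Mb")
```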
67. Infrastructure Requirements
- Lab space, dedicated rooms
- Hands on IT infrastructure
- Data Storage capacity
- Sample and workflow traceability solutions
- BioIT group support for 3rd party analysis
Roche GS-FLXti: General Laboratory 1, Controlled Room (emPCR), Amplicon Room, General Laboratory 2, BioIT room
Illumina GA2: General Laboratory 1, Cluster Station room, General Laboratory 2, BioIT room
AB Solid 3.0: General Laboratory 1, Controlled Room (emPCR), Amplicon Room, General Laboratory 2, BioIT room
68. NextGen Sequencing Service Providers: Europe
Many locations Cogenics http://www.cogenics.com/sequencing/s...ingService.cfm
Many locations GATC Biotech http://www.gatc-biotech.com/en/index.php
Germany DKFZ http://www.dkfz.de/gpcf/ngs_sequencing.html
Switzerland Functional Genomics Center Zurich http://www.fgcz.ethz.ch/applications/gt/ngsequencing
Germany Eurofins MWG Operon http://www.eurofinsdna.com/products-...equencing.html
Hungary BAYGEN http://baygen.hu/
The Netherlands ServiceXS http://www.servicexs.com/servicexs+i...+ii+sequencing
Spain Sistemas Genómicos http://www.sistemasgenomicos.com/
Sweden Uppsala Genome Center http://www.genpat.uu.se/node453
Switzerland Fasteris http://www.fasteris.com/
UK AGOWA - LGC http://www.lgc.co.uk/pdf/Next%20gen%...lyer%20web.p
UK The Gene Pool https://www.wiki.ed.ac.uk/display/GenePool/Home
UK Geneservice http://www.geneservice.co.uk/services/sequencing/
UK University of Liverpool http://www.liv.ac.uk/agf/index.html
Belgium DNAVision (soon available) http://www.dnavision.be/
GATC
Illumina platform based: 3500 € excl. tax per 1/8 flow cell vs 772 €
Roche platform based: 10.150 € excl. tax per 1/2 picoplate vs 2995 €
Cogenics (10/2008)
Roche platform based: 15.000 € excl. tax per full picoplate vs 5990 €
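The "vs" amounts match the per-run prices on the platform comparison slide scaled to the same fraction of a run (for example €6180 / 8 ≈ €772), so they read as the in-house reagent cost. That interpretation is an assumption, as are the names in this sketch, which compares each quote with the scaled run cost.

```python
# Sketch: provider quote vs. in-house reagent cost for the same fraction of a run.
# Assumption: the "vs" figures are the per-run reagent prices scaled by fraction.
from fractions import Fraction


def in_house_cost(run_cost_eur, fraction_of_run):
    """Reagent cost of a fraction of one run."""
    return run_cost_eur * fraction_of_run


# (label, quoted price in EUR, full-run reagent cost in EUR, fraction of run)
quotes = [
    ("GATC, Illumina, 1/8 flow cell", 3500, 6180, Fraction(1, 8)),
    ("GATC, Roche, 1/2 picoplate", 10150, 5990, Fraction(1, 2)),
    ("Cogenics, Roche, full picoplate", 15000, 5990, Fraction(1, 1)),
]

if __name__ == "__main__":
    for label, quote, run_cost, fraction in quotes:
        own = in_house_cost(run_cost, fraction)
        ratio = float(quote / own)
        print(f"{label}: quoted {quote} EUR, in-house {float(own):.1f} EUR, ratio {ratio:.1f}x")
```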
69. Applications
Whole genome sequencing
- de novo sequencing
- comparative seq.
Amplicon seq.
- Mutations / SNP
Transcriptome seq.
- cDNA
- Small RNA
Methylation seq.
Metagenomics
ChIP sequencing
70. Multiple Sample Sequencing
AB: 1, 4, 8 regions slides
- 16-128 samples/slide with barcoding
Illumina: 2, 4, 8 regions flow cells
- Flow Cell: 1.4mm wide channel design
- 40% more usable area
Roche: 2, 4, 8, 16 regions plates
71. Increase Sample Throughput via Multiplex Identifiers
Roche (192)
AB (256)
Illumina (96)
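Multiplex identifiers (barcodes) let several samples share one run; the reads are sorted back to their samples afterwards. A minimal sketch of that sorting step follows. It assumes the barcode is the first few bases of each read and must match exactly; barcode placement differs between platforms and real pipelines usually tolerate sequencing errors. All names and sequences are hypothetical.

```python
# Sketch: assign reads to samples by an exact-match barcode at the read start.
from collections import defaultdict

UNASSIGNED = "unassigned"


def demultiplex(reads, barcodes):
    """Group reads by sample.

    reads: iterable of sequence strings
    barcodes: dict mapping barcode -> sample name (all barcodes the same length)
    Returns a dict of sample name -> list of reads with the barcode trimmed off.
    Reads without a known barcode are kept untrimmed under UNASSIGNED.
    """
    lengths = {len(barcode) for barcode in barcodes}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    n = lengths.pop()
    bins = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:n])
        if sample is None:
            bins[UNASSIGNED].append(read)
        else:
            bins[sample].append(read[n:])
    return dict(bins)


if __name__ == "__main__":
    example_barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
    example_reads = ["ACGTTTTGGG", "TGCAAACCC", "GGGGAAAA"]
    print(demultiplex(example_reads, example_barcodes))
```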