whole genome analysis
history
needs
steps involved
human genome data
NGS
pyrosequencing
illumina
SOLiD
Ion torrent
PacBio
applications
problems
benefits
Original Next Gen Seq Methods set of slides prepared for Technorazz Vibes 2016. There is also a shorter version.
This starts with an introduction to qPCR, followed by an introduction to library complexity. Microarrays are discussed as well, along with a very short introduction to FISH. Next-generation sequencing methods are then covered, including the generations of sequencers and a short discussion of the Illumina protocol. It closes with a comparison of Illumina against third-generation sequencers, a description of the standard pipeline, and the omics technologies that have arisen from this sequencing data.
Deciphering DNA sequences is essential for virtually all branches of biological research. With the
advent of capillary electrophoresis (CE)-based Sanger sequencing, scientists gained the ability to
elucidate genetic information from any given biological system. This technology has become widely
adopted in laboratories around the world, yet has always been hampered by inherent limitations in
throughput, scalability, speed, and resolution that often preclude scientists from obtaining the essential
information they need for their course of study. To overcome these barriers, an entirely new technology
was required—Next-Generation Sequencing (NGS), a fundamentally different approach to sequencing
that triggered numerous ground-breaking discoveries and ignited a revolution in genomic science.
Next Generation Sequencing (NGS) is a modern and cost-effective sequencing technology that enables scientists to sequence nucleic acids at a much faster rate. In this presentation, you will learn what NGS is, the idea behind NGS, its methodology and protocol, widely adopted NGS protocols, applications, and references for further study.
Sequencing is one of the major technological advancements that has taken shape in the last two or three decades. From the Sanger and Maxam-Gilbert sequencing methods to the latest high-throughput methods, sequencing technologies have changed the landscape of biological sciences.
This slide deck takes a look at the major sequencing methods over time.
Note: Several images included here have been sourced from GOOGLE IMAGES. The content has been extracted from several SCIENTIFIC PAPERS and WEBSITES.
PLEASE DO CONTACT THE AUTHOR DIRECTLY IF ANY COPYRIGHT ISSUE ARISES.
This presentation explains genome sequencing, covering both the traditional and the modern methods. It focuses on Next Generation Sequencing and its types.
Sanger sequencing is a method of DNA sequencing based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication.
NEED OF GENETIC SEQUENCING
- Understanding the particular DNA sequence can shed light on a genetic condition and offer hope for the eventual development of treatment.
- An alteration in a DNA sequence can lead to an altered or non functional protein and hence to a harmful effect in a plant or animal.
- Simple point mutations can cause altered protein shape and function.
Detailed explanation about gene sequencing methods
Sequencing the gene is an important step toward understanding the gene.
A genome sequence contains some clues about where genes are.
Genome sequencing helps us understand how the genome as a whole works: how genes work together to direct the growth, development and maintenance of an entire organism.
It helps scientists study the parts of the genome outside the genes, such as regulatory regions.
2. OUTLINES
• Introduction to DNA Sequencing
• Principles of DNA Sequencing
• Maxam–Gilbert Sequencing
• Sanger Sequencing
• Shotgun Sequencing
• Primer Walking
• Introduction to Genome Sequencing
• Hierarchical shotgun sequencing
• Whole-genome shotgun sequencing
• Next-Generation Sequencing (NGS)
• NGS Sequencing Technology
• NGS Principles
• Library Preparation
• Amplification Methods
• Sequencing Methods
• NGS Platform Comparison
• Introduction to DNA Sequence Alignment
• Needleman-Wunsch Algorithm
3. PRINCIPLE OF DNA SEQUENCING
• DNA sequencing is the process of determining the order of nucleotides in a DNA molecule
• In 1977 two DNA sequencing methods were developed and published, now known as First Generation Sequencing:
  • Maxam–Gilbert sequencing by the chemical degradation method
  • Sanger sequencing by the chain termination method
• Although Maxam–Gilbert was initially more popular, the Sanger method eventually became preferred for several technical and safety reasons
• It is important to note that PCR did not exist at that time; it was invented in 1983
(Pareek, Smoczynski, & Tretyn, 2011)
4. MAXAM–GILBERT SEQUENCING
(Pareek, Smoczynski, & Tretyn, 2011) (Snipcademy, 2017)
1. Double-stranded DNA is denatured into single strands
2. The 5' end of the DNA fragment to be sequenced is radioactively labeled in a kinase reaction using gamma-32P ATP
3. DNA strands are then cleaved at specific bases by chemical reactions in 4 separate reaction tubes. For example:
  • Dimethyl sulphate selectively attacks purines (A and G)
  • Hydrazine selectively attacks pyrimidines (C and T)
  • A+G means that the reaction cleaves at A but may also cleave at G
4. Cleaved fragments from each tube are then separated by size using gel electrophoresis
5. Exposing the gel to X-ray film reveals bands of radiolabeled DNA molecules
6. Fragments are then ordered by size to derive the original sequence of the DNA molecule
5. SANGER SEQUENCING
1. Clone the DNA fragment into vectors and amplify
2. Anneal primers to the fragments
3. To each of four reaction tubes add:
  • All 4 standard dNTPs
  • Only one type of ddNTP
4. Add DNA polymerase to initiate synthesis
  • DNA polymerase adds dNTPs and stops whenever it incorporates a ddNTP
5. Separate the single-stranded DNA fragments by gel electrophoresis
6. Read the sequence from the bands
(Shendure & Ji, 2008) (Snipcademy, 2017)
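The chain-termination logic in steps 3 to 6 can be sketched in Python. This is an idealised illustration, not a model of real reaction kinetics: the function names are hypothetical, the template is assumed to be written 3' to 5' so that the synthesised strand reads left to right, and every possible termination position is assumed to yield exactly one fragment.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sanger_fragments(template):
    """Return {ddNTP: [fragment lengths]} for four idealised reactions.

    In the tube containing ddX, synthesis can stop at every position
    where base X is incorporated into the growing strand.
    """
    synthesized = "".join(COMPLEMENT[base] for base in template)
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(synthesized, start=1):
        lanes[base].append(length)  # a fragment of this length ends in dd<base>
    return lanes

def read_gel(lanes):
    """Read bands from the shortest to the longest fragment."""
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)

lanes = sanger_fragments("TACGGTCA")
print(lanes)
print(read_gel(lanes))
```

Sorting the fragments by length plays the role of electrophoresis: the lane in which each successive band appears gives the next base of the synthesised strand.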
6. AUTOMATED SANGER SEQUENCING
• The process is the same, but the ddNTPs are fluorescently labeled with 4 different colors
• Instead of gel electrophoresis, high-resolution capillary electrophoresis is coupled to four-color detection of emission spectra, so the sequence is read automatically as fragments exit the capillary
(Shendure & Ji, 2008) (Snipcademy, 2017)
7. SHOTGUN SEQUENCING
• Long DNA molecules cannot be sequenced directly by the Sanger method
• In such cases the shotgun sequencing method must be used
• The method involves breaking a long DNA molecule mechanically or enzymatically into smaller random fragments approximately 1000 bases long
• Individual fragments are then cloned into a universal vector and amplified
• The random fragments (aka DNA inserts) are then individually sequenced using the universal primer of the cloning vector
• Reads with overlapping regions are merged into contigs (contiguous sequences), which are used to assemble the original sequence
(Shendure et al., 2017) (Amy Rogers, 2008)
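The idea of assembling overlapping reads into a contig can be illustrated with a small greedy sketch in Python. The helper names, the toy reads and the minimum overlap of 3 bases are assumptions chosen for illustration; real assemblers use far more robust methods and must cope with sequencing errors and repeats.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the largest overlap."""
    reads = list(reads)
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # no overlaps left: the remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads

# Three toy reads drawn from the sequence ATGGCGTACCTTAGAC
print(greedy_assemble(["ATGGCGTA", "CGTACCTT", "CCTTAGAC"]))
```

If the reads do not overlap sufficiently, the function returns more than one contig, which corresponds to the gaps that primer walking is later used to close.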
9. PRIMER WALKING
• Primer walking is required when:
  • The fragment is too long to be sequenced by the Sanger method in a single run
  • The final assembly from shotgun sequencing has gaps that were not covered by contigs
• Put simply, primer walking is the addition of a new primer upstream (right-to-left) of the first sequence
• New primers are synthesized once the first sequence is complete and annealed to the original fragment, and sequencing is restarted
• If many gaps are present in the final assembly the method becomes time consuming, but it efficiently reveals the missed fragments
(Snipcademy, 2017)
10. INTRODUCTION TO GENOME SEQUENCING
• Very large DNA molecules, such as chromosomal DNA or even whole genomes, require slightly modified shotgun sequencing approaches, also called genome sequencing
• There are three classical genome sequencing methods:
  • Hierarchical shotgun sequencing
  • Whole-genome shotgun sequencing
  • Double-barrel shotgun sequencing
• Initially the Human Genome Project used hierarchical shotgun sequencing; the whole-genome shotgun sequencing method was developed later. Both contributed to the sequencing of the first human genome.
(McClean, 2005) (Chial, 2008)
11. HIERARCHICAL SHOTGUN SEQUENCING
• First, genomic DNA is fragmented into relatively large pieces (~150 kb)
• Pieces are then inserted into bacterial artificial chromosomes (BACs), grown in E. coli and recovered
• BAC libraries are then organized and mapped to find the physical location of each piece and form a physical map. This step is also referred to as the Golden Tiling Path.
• BAC clones are then individually sequenced using standard shotgun sequencing
• BAC sequences are then assembled and aligned to the physical map to reconstruct the original genomic DNA
• BAC libraries are useful because each piece can be sequenced in different labs by different researchers
(McClean, 2005) (Chial, 2008)
12. WHOLE-GENOME SHOTGUN SEQUENCING
• Whole-genome shotgun sequencing is simply shotgun sequencing without preparation of a physical map
• In this method whole genomic DNA is broken into relatively smaller fragments (~100 kb)
• Fragments are then cloned into plasmids, grown in bacteria and recovered
• Each cloned fragment is fragmented again and then sequenced to form contigs
• Long contiguous sequences are then assembled to form the final DNA sequence. This is also known as de novo sequencing
• This method and double-barrel shotgun sequencing are very similar; the main difference is that in the latter, sequencing is performed from both ends of the DNA inserts. This is also known as paired-end sequencing
• Sequencing of large genomic DNAs using this method only became possible with technological advances in both DNA sequencing efficiency and the computational tools used for assembly of contigs
(McClean, 2005) (Chial, 2008)
15. NGS PRINCIPLES
• Conceptually, second-generation NGS platforms share the same workflow:
1. LIBRARY PREPARATION
2. AMPLIFICATION/ENRICHMENT
3. SEQUENCING
4. DATA PROCESSING
• Methods involved in each step may differ by platform
• Library preparation refers to the DNA pretreatment process before sequencing and is typically combined with the amplification step
• Actual sequencing is performed in the sequencing step, and the technology involved in this step is highly dependent on the chosen platform
• Data generated during the sequencing step is then handled by bioinformatic tools and further analyzed as needed
(Shendure & Ji, 2008) (Yohe & Thyagarajan, 2017)
16. LIBRARY PREPARATION
• None of the second-generation sequencing methods can sequence a large, whole DNA molecule at once
• Prior to amplification, DNA must be randomly fragmented and the resulting fragments treated according to the requirements of the platform
• Most NGS platforms have similar library preparation steps
• Libraries can be single-end/paired-end (SE/PE) or mate-pair (MP)
• Typically the final DNA fragments, the so-called DNA library, must satisfy most of the following requirements, depending on the platform:
  • Short DNA fragments of similar size
  • Blunt ended with no 3' or 5' overhangs
  • 5' phosphate and 3' hydroxyl groups at blunt ends
  • dA tailing (dAMP incorporation onto 3' ends)
  • Ligated adapters (synthetic oligonucleotides)
(Shendure & Ji, 2008) (van Dijk, Jaszczyszyn, & Thermes, 2014)
18. LIBRARY PREPARATION (SE/PE)
STEP 1 - FRAGMENTATION
• The initial DNA template must be randomly fragmented via one of the following approaches:
  • Mechanically (shearing, sonication, etc.)
  • Enzymatically (restriction enzymes or transposases)
  • Chemically
STEP 2 - SIZE SELECTION
• DNA fragments (insert size) of the desired length are selected using either Solid-Phase Reversible Immobilization (SPRI) beads or agarose gel electrophoresis
STEP 3 - END REPAIR (BLUNTING)
• Many fragments may end up with overhanging sticky 5' or 3' ends
• These sticky ends must be converted into blunt ends using T4 DNA polymerase and DNA Polymerase I, Large (Klenow) Fragment, by:
  • Filling in 5' overhangs by 5' to 3' polymerase activity
  • Degrading 3' overhangs by 3' to 5' exonuclease activity
• After blunting is complete, 5' ends are phosphorylated for ligation and 3' dA tailing is performed for Illumina or 454 platforms
STEP 4 - ADAPTER LIGATION
• Adapters carrying the following elements are ligated to both ends of the DNA fragments via DNA ligase:
  • Universal priming sites for PCR amplification
  • Hybridization sequences for surface immobilization
  • Molecular barcodes for multiplex pooled sequencing
  • Recognition sites to initiate sequencing
(Shendure & Ji, 2008) (van Dijk, Jaszczyszyn, & Thermes, 2014) (Head et al., 2014)
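The role of molecular barcodes in multiplex pooled sequencing can be sketched in Python. Everything specific here is a hypothetical placeholder: the adapter sequences, the 4-base barcodes and the function names are invented for illustration and do not correspond to any real kit or platform.

```python
LEFT_ADAPTER = "AATG"    # hypothetical adapter sequence
RIGHT_ADAPTER = "CTGA"   # hypothetical adapter sequence

def ligate_adapters(insert, barcode):
    """Attach adapters to an insert; the barcode follows the left adapter."""
    return LEFT_ADAPTER + barcode + insert + RIGHT_ADAPTER

def demultiplex(pool, barcodes):
    """Group pooled library molecules by barcode and strip the adapters."""
    by_sample = {sample: [] for sample in barcodes}
    start = len(LEFT_ADAPTER)
    for read in pool:
        for sample, barcode in barcodes.items():
            if read.startswith(barcode, start):
                end = len(read) - len(RIGHT_ADAPTER)
                by_sample[sample].append(read[start + len(barcode):end])
                break
    return by_sample

barcodes = {"sampleA": "ACGT", "sampleB": "TGCA"}  # hypothetical barcodes
pool = [
    ligate_adapters("GGGCCC", barcodes["sampleA"]),
    ligate_adapters("TTTAAA", barcodes["sampleB"]),
    ligate_adapters("CCCGGG", barcodes["sampleA"]),
]
print(demultiplex(pool, barcodes))
```

Because each sample receives a distinct barcode before pooling, the inserts can be assigned back to their sample of origin after a single shared sequencing run.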
19. AMPLIFICATION/ENRICHMENT
• During the amplification/enrichment step DNA libraries are clonally amplified
• Amplification is necessary because most sequencing detection systems cannot detect a single molecule. The exception was the Heliscope platform by Helicos BioSciences, which filed for bankruptcy in 2012
• Amplification methods depend on the platform used, but the most common are:
  • Emulsion PCR (emPCR) – 454, SOLiD and Ion Torrent
  • Solid-phase amplification – Illumina
(Kchouk, Gibrat, & Elloumi, 2017) (Metzker, 2010)
22. SEQUENCING PRINCIPLES
• There are three main principles for next-generation DNA sequencing:
  • Sequencing-by-Synthesis (SBS)
  • Sequencing-by-Ligation (SBL)
  • Sequencing-by-Hybridization (SBH)
• Each has advantages and challenges, but the most common sequencing method is SBS, followed by SBL
• Each sequencing platform has its own unique differences, which are based on one or more of the principles above
• Although SBH is cost effective, it is mostly used for genome-wide association studies and variant detection rather than de novo sequencing
(Kchouk, Gibrat, & Elloumi, 2017) (Shendure & Ji, 2008)
23. NGS TECHNICAL DIFFERENCES
Technology | Sequencing Method | Detection
Roche/454 | Pyrosequencing (SNA) | Luminescence (Pyrophosphate)
Illumina/Solexa | Illumina Dye Sequencing (CRT) | Fluorescence (4 color Fl-dNTP)
Ion Torrent | Post-Light Sequencing (SNA) | ∆pH/Proton Detection
Heliscope | Single Molecule Fluorescent Sequencing (CRT) | Fluorescence (1 color Fl-dNTP)
ABI/SOLiD | Support Oligonucleotide Ligation Detection | Fluorescence (4 color 1,2 dNTP probes)
(Kchouk, Gibrat, & Elloumi, 2017) (Metzker, 2010) (Fuller et al., 2009)
CRT (Cyclic Reversible Termination), SNA (Single-nucleotide addition)
25. PYROSEQUENCING
(Metzker, 2010)
• Components:
• single-strand DNA
• one of dNTPs for each cycle
• DNA polymerase
• ATP sulfurylase, luciferase and apyrase
• adenosine 5´ phosphosulfate (APS)
• luciferin
• Generated light is proportional to the amount of ATP and can be detected by a high-resolution charge-coupled device (CCD)
• Each cycle is complete once the added dNTP has been incorporated and apyrase has degraded whatever remains
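Because one type of dNTP is offered per cycle and the light signal is proportional to the amount incorporated, the readout can be sketched as an idealised flowgram in Python. The function names and the flow order "TACG" are assumptions for illustration; the sketch ignores noise and assumes the signal scales exactly with homopolymer length.

```python
def pyrosequence(strand, flow_order="TACG", cycles=4):
    """Return (nucleotide, signal) pairs for the strand being synthesised.

    Each flow offers one dNTP; the signal equals the number of
    consecutive bases incorporated during that flow.
    """
    signals = []
    position = 0
    for _ in range(cycles):
        for nucleotide in flow_order:
            count = 0
            while position < len(strand) and strand[position] == nucleotide:
                count += 1
                position += 1
            signals.append((nucleotide, count))
    return signals

def call_bases(signals):
    """Convert the flowgram back into a base sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)

flowgram = pyrosequence("TTAGGC", cycles=2)
print(flowgram)
print(call_bases(flowgram))
```

A signal of 2 for a single flow corresponds to a homopolymer such as TT, which is why signal quantification errors on this kind of platform tend to appear as insertions or deletions.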
26. ILLUMINA DYE SEQUENCING
(Metzker, 2010)
1. DNA polymerase incorporates one of Illumina's four differently colored reversible terminator dNTPs, each with a blocked 3'-hydroxyl group
2. Wash away unincorporated dNTPs and perform four-color imaging
3. A cleavage step removes the fluorescent dye, and the reducing agent tris(2-carboxyethyl)phosphine (TCEP) is used to regenerate the 3'-OH group to allow incorporation of the next dNTP
27. SINGLE MOLECULE FLUORESCENT SEQUENCING
1. DNA polymerase incorporates one single-color Helicos Virtual Terminator, a Cy5-labeled 2'-deoxyribonucleoside triphosphate (dNTP)
2. Wash away unincorporated dNTPs and perform single-color imaging
3. A cleavage step removes the fluorescent dye, and the reducing agent tris(2-carboxyethyl)phosphine (TCEP) is used to remove the inhibitory group to allow incorporation of the next dNTP
(Metzker, 2010)
28. POST-LIGHT SEQUENCING
• Sequencing is performed on a semiconductor Ion Chip with a pH sensor
• DNA polymerase incorporates unmodified dNTPs, supplied one type at a time
• Each incorporation releases one pyrophosphate and one hydrogen ion
• The released H+ ion changes the pH in the individual well
• The change in pH is instantly detected by the sensor plate (pH detector)
• After detection the chip is washed and a new dNTP is added
(Mardis, 2013)
30. SUPPORT OLIGONUCLEOTIDE LIGATION DETECTION
• For a 50 bp read length, sequencing is performed in 5 primer rounds of 10 ligation rounds each
• Octamer probes consist of 2 interrogating nucleotides (16 possible combinations) and 6 degenerate nucleotides, and carry one of 4 fluorescent labels
• During each ligation round one of the four possible octamer probes hybridizes to the complementary template region and is ligated, followed by fluorescence detection
• Next, the last 3 nucleotides, which carry the fluorescent label, are cleaved off so that the following probe can be ligated
• 9 more ligation rounds are performed until the template is completely covered
• The extended primer product is denatured and removed to complete the first primer round
• The second primer round is performed with a new primer offset by one base (n-1), again with 10 ligation rounds
• This is repeated until the (n-4) primer round is complete
• Finally the generated data is deconvoluted and the final sequence is obtained for each DNA fragment bead
(Metzker, 2010) (Voelkerding, Dames, & Durtschi, 2009)
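The two-base encoding behind the 16 dinucleotide combinations and 4 fluorescent labels can be sketched in Python. The sketch assumes the commonly described colour scheme in which each colour is the XOR of the two bases numbered A=0, C=1, G=2, T=3; that numbering and the function names are assumptions not stated in the slides.

```python
BASES = "ACGT"  # index of each base serves as its 2-bit code (assumed numbering)

def encode_colors(sequence):
    """Encode each adjacent pair of bases as one of four colours (0-3)."""
    return [
        BASES.index(first) ^ BASES.index(second)
        for first, second in zip(sequence, sequence[1:])
    ]

def decode_colors(first_base, colors):
    """Rebuild the sequence from a known first base and the colour calls."""
    sequence = [first_base]
    for color in colors:
        sequence.append(BASES[BASES.index(sequence[-1]) ^ color])
    return "".join(sequence)

colors = encode_colors("ATGGC")
print(colors)
print(decode_colors("A", colors))
```

Since every base contributes to two consecutive colour calls, decoding needs one known starting base, and a single wrong colour call changes every base decoded after it. This is the reason the raw data must be deconvoluted before the final sequence is obtained.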
31. NGS PLATFORM COMPARISON
Metric | Roche/454 | Illumina/Solexa | Ion Torrent | Heliscope | ABI/SOLiD
Read Length (bp) | 200-1000 | 75-300 | 200-400 | 30-35 | 35/50/75
Reads per Run | 100-1M | 25M-6B | 0.4M-80M | 600M-1B | 1B-6B
Appx. Data Generated | 0.45 Gb | 1.8 Tb | 10 Gb | 37 Gb | 160 Gb
Error Type (Avg. % Rate) | Indel (1%) | Mismatch (0.1%) | Indel (1%) | Indel (1%) | Mismatch (0.06%)
Run Time | ~10 h | ~10 days | ~7 days | ~8 days | ~12 days
Appx. Device Cost (~2012 Prices) | $100,000 - $500,000 | $100,000 - $600,000 | $80,000 - $150,000 | $1 million | $500,000
Avg. Cost per Gb (~2012 Prices) | $10,000 | $40 - $500 | $1,000 | $1,000 | $130
(Kchouk, Gibrat, & Elloumi, 2017; Liu et al., 2012; Metzker, 2010; Quail et al., 2012; Reinert, Langmead, Weese, & Evers, 2015; van Dijk, Auger,
Jaszczyszyn, & Thermes, 2014)(nextgenseek, 2012)
32. INTRODUCTION TO DNA SEQUENCE ALIGNMENT
• Sequence alignment is the process of lining up two or more sequences of nucleotides or amino acids in order to find any global or regional similarity that may indicate a structural, functional or evolutionary relationship
• There are two types of sequence alignment:
  • Pairwise alignment – only two sequences
  • Multiple sequence alignment – more than two sequences
• There are two computational approaches to alignment:
  • Global alignment – complete end-to-end alignment
  • Local alignment – regional alignment
• Alignment of NGS reads is called short read alignment and is a subject of significant research, because massive numbers of short sequences must be aligned to a reference genome
(Diniz & Canduri, 2017)(Reinert, Langmead, Weese, & Evers, 2015)
33. ALIGNMENT METHODS
There are four main sequence alignment methods:
• Brute Force
  • Simplest and most inefficient
• Dot-Matrix
  • Can provide a quick visualization of two sequences
• Dynamic Programming
  • Guarantees an optimal alignment and provides scores, but is not efficient for aligning long sequences
• Word Method (aka k-tuple method)
  • Uses heuristic algorithms that do not guarantee optimal alignments but are significantly more efficient than the other methods. Uses a seed-and-extend algorithm and is implemented in well-known tools such as BLAST and FASTA
(Diniz & Canduri, 2017)
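The dot-matrix method is simple enough to sketch in a few lines of Python. The function name and the choice of '*' and '.' as plotting characters are arbitrary; the two sequences printed at the end are the ones used in the Needleman-Wunsch example later in these slides.

```python
def dot_matrix(top, side):
    """Return one string per row: '*' where bases match, '.' otherwise."""
    return [
        "".join("*" if t == s else "." for t in top)
        for s in side
    ]

top, side = "GTCACATGCC", "GCCGACAGT"
print("  " + top)
for base, row in zip(side, dot_matrix(top, side)):
    print(base + " " + row)
```

Runs of '*' along a diagonal mark stretches where the two sequences agree, which is the quick visual cue this method provides.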
35. DYNAMIC PROGRAMMING
• There are two main dynamic programming algorithms for sequence alignment:
  • Needleman-Wunsch algorithm for global alignment
  • Smith-Waterman algorithm for local alignment
• Both use a dot-matrix approach with distance measurements such as Manhattan distance to generate a score matrix that is used to find the optimal alignment
• Scores are calculated using various scoring matrices, such as:
  • Simple identity matrix for nucleotide scoring
  • PAM and BLOSUM matrices for amino acid scoring
(Marketa & Jeremy, 2008)
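As a point of comparison with the global algorithm, the local (Smith-Waterman) score can be sketched in Python. The match, mismatch and gap values used as defaults are assumptions chosen to mirror a simple identity scoring scheme; the key difference from global alignment is that cell scores are never allowed to fall below zero, so an alignment can start and end anywhere.

```python
def smith_waterman_score(top, side, match=5, mismatch=-3, gap=-5):
    """Return the best local alignment score between two sequences."""
    best = 0
    previous = [0] * (len(top) + 1)  # first row is all zeros for local alignment
    for i in range(1, len(side) + 1):
        current = [0]  # first column is also zero
        for j in range(1, len(top) + 1):
            delta = match if side[i - 1] == top[j - 1] else mismatch
            current.append(max(
                0,                        # start a new local alignment here
                previous[j - 1] + delta,  # diagonal: match or mismatch
                current[j - 1] + gap,     # left: gap in the side sequence
                previous[j] + gap,        # up: gap in the top sequence
            ))
        best = max(best, max(current))
        previous = current
    return best

print(smith_waterman_score("AAAGTCAAA", "TTTGTCTTT"))
```

In this toy case only the shared GTC region contributes, so the flanking bases that would be penalised in a global alignment are simply ignored.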
36. NEEDLEMAN-WUNSCH ALGORITHM EXAMPLE
• Say that you want to globally align the following sequences:
  • Top Sequence (S1): GTCACATGCC – 10 bases
  • Side Sequence (S2): GCCGACAGT – 9 bases
• Alignment can be performed in the following steps:
1. Matrix Initialization – Select a scoring scheme and initialize the matrix
2. Matrix Fill – Apply the recurrence relation to every cell
3. Matrix Traceback – Trace a path back to the top-left (0,0) position starting from the bottom-right position
• First let's set the scoring scheme with the following alignment parameters:
  • Match: 5 points ($\delta_{i,j}$)
  • Mismatch: -3 points ($\delta_{i,j}$)
  • Gap Penalty: -5 points ($w$)
(Shu & Ouw, 2004)
37. MATRIX
G T C A C A T G C C
G
C
C
G
A
C
A
G
T
The top sequence S1 has length 10 and the side sequence S2 has length 9.
The resulting matrix has (9+1) rows and (10+1) columns, i.e. 10x11
38. MATRIX INITIALIZATION
        0   1   2   3   4   5   6   7   8   9  10
            G   T   C   A   C   A   T   G   C   C
 0      0  -5 -10 -15 -20 -25 -30 -35 -40 -45 -50
 1 G   -5
 2 C  -10
 3 C  -15
 4 G  -20
 5 A  -25
 6 C  -30
 7 A  -35
 8 G  -40
 9 T  -45
Initialize the matrix by cumulatively adding the gap penalty (-5) along
the first row and the first column, starting from position (0,0).
The numbers 0-10 above the top sequence and 0-9 beside the side sequence are indices.
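The initialization step can be sketched in a few lines of Python. The helper name `init_matrix` is hypothetical; cells that have not been filled yet are marked with `None`.

```python
def init_matrix(top, side, gap=-5):
    """Create a (len(side)+1) x (len(top)+1) matrix with gap penalties on the borders."""
    rows, cols = len(side) + 1, len(top) + 1
    s = [[None] * cols for _ in range(rows)]
    for j in range(cols):
        s[0][j] = j * gap   # first row: 0, -5, -10, ...
    for i in range(rows):
        s[i][0] = i * gap   # first column: 0, -5, -10, ...
    return s


s = init_matrix("GTCACATGCC", "GCCGACAGT")
print(len(s), len(s[0]))  # 10 11
print(s[0])               # [0, -5, -10, -15, -20, -25, -30, -35, -40, -45, -50]
```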
39. MATRIX FILL
Recurrence relations are:
$$ s_{i,j} = \max \begin{cases} s_{i-1,j-1} + \delta_{i,j} \\ s_{i,j-1} + w \\ s_{i-1,j} + w \end{cases} $$
Where
• $i$ stands for rows of the side sequence
• $j$ stands for columns of the top sequence
• $s_{i,j}$ is the score at position $(i, j)$
• $\delta_{i,j}$ is a +5 reward if the bases at $i$ and $j$ match and a -3 penalty if they mismatch
• $w$ is the -5 point gap penalty
The function above is applied to every cell and the maximum of the three values is chosen (shown in bold):
$$ s_{1,1} = \max \begin{cases} \mathbf{s_{1-1,1-1} + \delta_{1,1} = 0 + 5 = 5} \\ s_{1,1-1} + w = -5 - 5 = -10 \\ s_{1-1,1} + w = -5 - 5 = -10 \end{cases} = 5 $$
$$ s_{2,1} = \max \begin{cases} s_{2-1,1-1} + \delta_{2,1} = -5 - 3 = -8 \\ s_{2,1-1} + w = -10 - 5 = -15 \\ \mathbf{s_{2-1,1} + w = 5 - 5 = 0} \end{cases} = 0 $$
The two computed scores are entered in the matrix:
        0   1   2   3   4   5   6   7   8   9  10
            G   T   C   A   C   A   T   G   C   C
 0      0  -5 -10 -15 -20 -25 -30 -35 -40 -45 -50
 1 G   -5   5
 2 C  -10   0
 3 C  -15
 4 G  -20
 5 A  -25
 6 C  -30
 7 A  -35
 8 G  -40
 9 T  -45
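The fill step can be sketched as follows, assuming the scoring scheme of this example (match +5, mismatch -3, gap -5). The function name `nw_fill` is hypothetical; the initialization is repeated inside it so that the block runs on its own.

```python
def nw_fill(top, side, match=5, mismatch=-3, gap=-5):
    """Initialize and fill the Needleman-Wunsch score matrix (rows: side, columns: top)."""
    rows, cols = len(side) + 1, len(top) + 1
    s = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        s[0][j] = j * gap
    for i in range(1, rows):
        s[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            delta = match if side[i - 1] == top[j - 1] else mismatch
            s[i][j] = max(
                s[i - 1][j - 1] + delta,  # diagonal: match or mismatch
                s[i][j - 1] + gap,        # from the left: gap in the side sequence
                s[i - 1][j] + gap,        # from above: gap in the top sequence
            )
    return s


s = nw_fill("GTCACATGCC", "GCCGACAGT")
print(s[1][1], s[2][1])  # 5 0
```

Every cell depends only on its upper, left, and upper-left neighbours, so filling row by row from the top-left guarantees that those three values are already available.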
40. MATRIX FILL
        0   1   2   3   4   5   6   7   8   9  10
            G   T   C   A   C   A   T   G   C   C
 0      0  -5 -10 -15 -20 -25 -30 -35 -40 -45 -50
 1 G   -5   5   0  -5 -10 -15 -20 -25 -30 -35 -40
 2 C  -10   0   2   5   0  -5 -10 -15 -20 -25 -30
 3 C  -15  -5  -3   7   2   5   0  -5 -10 -15 -20
 4 G  -20 -10  -8   2   4   0   2  -3   0  -5 -10
 5 A  -25 -15 -13  -3   7   2   5   0  -5  -3  -8
 6 C  -30 -20 -18  -8   2  12   7   2  -3   0   2
 7 A  -35 -25 -23 -13  -3   7  17  12   7   2  -3
 8 G  -40 -30 -28 -18  -8   2  12  14  17  12   7
 9 T  -45 -35 -25 -23 -13  -3   7  17  12  14   9
$$ s_{4,2} = \max \begin{cases} \mathbf{s_{4-1,2-1} + \delta_{4,2} = -5 - 3 = -8} \\ s_{4,2-1} + w = -10 - 5 = -15 \\ \mathbf{s_{4-1,2} + w = -3 - 5 = -8} \end{cases} = -8 $$
FINAL POSITION SCORE = ALIGNMENT SCORE
The lower-right cell (9,10) holds 9, which is the score of the optimal global alignment.
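The traceback step (step 3) is not worked through on the slides, so the following is only a sketch of one common way to do it. The function name `needleman_wunsch` and the tie-breaking order (diagonal first, then left, then up) are assumptions. Other tie-breaking orders can return a different alignment with the same score.

```python
def needleman_wunsch(top, side, match=5, mismatch=-3, gap=-5):
    """Return (score, aligned_top, aligned_side) for a global alignment."""
    rows, cols = len(side) + 1, len(top) + 1
    s = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        s[0][j] = j * gap
    for i in range(1, rows):
        s[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            delta = match if side[i - 1] == top[j - 1] else mismatch
            s[i][j] = max(s[i - 1][j - 1] + delta,
                          s[i][j - 1] + gap,
                          s[i - 1][j] + gap)

    # Traceback: walk from the lower-right cell back to (0,0), at each step
    # moving to a neighbour that could have produced the current score.
    i, j = rows - 1, cols - 1
    out_top, out_side = [], []
    while i > 0 or j > 0:
        came_diag = (i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1]
                     + (match if side[i - 1] == top[j - 1] else mismatch))
        if came_diag:
            out_top.append(top[j - 1])
            out_side.append(side[i - 1])
            i, j = i - 1, j - 1
        elif j > 0 and s[i][j] == s[i][j - 1] + gap:
            out_top.append(top[j - 1])
            out_side.append("-")       # gap in the side sequence
            j -= 1
        else:
            out_top.append("-")        # gap in the top sequence
            out_side.append(side[i - 1])
            i -= 1
    return s[-1][-1], "".join(reversed(out_top)), "".join(reversed(out_side))


score, aligned_top, aligned_side = needleman_wunsch("GTCACATGCC", "GCCGACAGT")
print(score)
print(aligned_top)
print(aligned_side)
```

With the parameters of this example the score printed is 9, matching the lower-right cell of the filled matrix.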
42. REFERENCES
• Rogers, A. (2008). Shotgun sequencing. Retrieved from https://www.csus.edu/indiv/r/rogersa/Bio181/SeqShotgun.pdf
• Ansorge, W. J. (2009). Next-generation DNA sequencing techniques. N Biotechnol, 25(4), 195-203. doi:10.1016/j.nbt.2008.12.009
• Anthony, S. (2011). New Silicon Chip Sequences Complete Genome in Three Hours. Retrieved from https://www.extremetech.com/computing/84340-new-silicon-chip-sequences-complete-genome-in-three-hours
• ATDBio. Next generation sequencing. Retrieved from https://www.atdbio.com/content/58/Next-generation-sequencing
• Brown, T. A. (2016). Gene cloning and DNA analysis: an introduction: John Wiley & Sons.
• CeGaT. (2018). Next-Generation Sequencing Services. Retrieved from https://www.cegat.de/en/services/next-generation-sequencing/
• Chial, H. (2008). DNA sequencing technologies key to the Human Genome Project. Nature Education, 1(1), 219.
• Diniz, W. J., & Canduri, F. (2017). REVIEW-ARTICLE Bioinformatics: an overview and its applications. Genet Mol Res, 16(1). doi:10.4238/gmr16019645
• EpiNext. (2017). EpiNext High-Sensitivity DNA Library Preparation Kit (Illumina). Retrieved from https://www.epigentek.com/catalog/epinext-high-sensitivity-dna-library-preparation-kit-illumina-p-3632.html
• Frese, K. S., Katus, H. A., & Meder, B. (2013). Next-generation sequencing: from understanding biology to personalized medicine. Biology (Basel), 2(1), 378-398. doi:10.3390/biology2010378
• Fuller, C. W., Middendorf, L. R., Benner, S. A., Church, G. M., Harris, T., Huang, X., . . . Vezenov, D. V. (2009). The challenges of sequencing by synthesis. Nat Biotechnol, 27(11), 1013-1023. doi:10.1038/nbt.1585
• Head, S. R., Komori, H. K., LaMere, S. A., Whisenant, T., Van Nieuwerburgh, F., Salomon, D. R., & Ordoukhanian, P. (2014). Library construction for next-generation sequencing: overviews and challenges. Biotechniques, 56(2), 61-64, 66, 68, passim. doi:10.2144/000114133
• Heather, J. M., & Chain, B. (2016). The sequence of sequencers: The history of sequencing DNA. Genomics, 107(1), 1-8. doi:10.1016/j.ygeno.2015.11.003
• Hui, P. (2014). Next generation sequencing: chemistry, technology and applications. Top Curr Chem, 336, 1-18. doi:10.1007/128_2012_329
• Illumina. (2018). Patterned Flow Cell Technology. Retrieved from https://www.illumina.com/science/technology/next-generation-sequencing/sequencing-technology/patterned-flow-cells.html
• Illumina, I. (2015). An introduction to next-generation sequencing technology.
• Kchouk, M., Gibrat, J.-F., & Elloumi, M. (2017). Generations of sequencing technologies: From first to next generation. Biology and Medicine, 9(3).
• Kelly, B. T., Baret, J. C., Taly, V., & Griffiths, A. D. (2007). Miniaturizing chemistry and biology in microdroplets. Chem Commun (Camb)(18), 1773-1788. doi:10.1039/b616252e
• Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., . . . Law, M. (2012). Comparison of next-generation sequencing systems. J Biomed Biotechnol, 2012, 251364. doi:10.1155/2012/251364
• Mardis, E. R. (2013). Next-generation sequencing platforms. Annu Rev Anal Chem (Palo Alto Calif), 6, 287-303. doi:10.1146/annurev-anchem-062012-092628
• Mardis, E. R. (2017). DNA sequencing technologies: 2006-2016. Nat Protoc, 12(2), 213-218. doi:10.1038/nprot.2016.182
• Marketa, Z., & Jeremy, O. (2008). Understanding bioinformatics. Garland Science.
• McClean, P. (2005). Genome Sequencing Retrieved from https://www.ndsu.edu/pubweb/~mcclean/plsc411/Genome%20Sequencing%20Complete.pdf
• Metzker, M. L. (2010). Sequencing technologies - the next generation. Nat Rev Genet, 11(1), 31-46. doi:10.1038/nrg2626
• nextgenseek. (2012). Comparing Price and Tech. Specs. of Illumina MiSeq, Ion Torrent PGM, 454 GS Junior, and PacBio RS. Retrieved from http://nextgenseek.com/2012/08/comparing-price-and-tech-specs-of-illumina-miseq-ion-torrent-pgm-454-gs-junior-and-pacbio-rs/
• Pareek, C. S., Smoczynski, R., & Tretyn, A. (2011). Sequencing technologies and genome sequencing. J Appl Genet, 52(4), 413-435. doi:10.1007/s13353-011-0057-x
• Quail, M. A., Smith, M., Coupland, P., Otto, T. D., Harris, S. R., Connor, T. R., . . . Gu, Y. (2012). A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC genomics, 13(1), 341.
• Reinert, K., Langmead, B., Weese, D., & Evers, D. J. (2015). Alignment of Next-Generation Sequencing Reads. Annu Rev Genomics Hum Genet, 16, 133-151. doi:10.1146/annurev-genom-090413-025358
• Shackelford, R. E. (2018, May 30). Pathology Outlines - Pyrosequencing. Retrieved from http://www.pathologyoutlines.com/topic/molecularpathdnaseqpyro.html
• Shendure, J., Balasubramanian, S., Church, G. M., Gilbert, W., Rogers, J., Schloss, J. A., & Waterston, R. H. (2017). DNA sequencing at 40: past, present and future. Nature, 550(7676), 345-353. doi:10.1038/nature24286
• Shendure, J., & Ji, H. (2008). Next-generation DNA sequencing. Nat Biotechnol, 26(10), 1135-1145. doi:10.1038/nbt1486
• Shu, J. J., & Ouw, L. S. (2004). Pairwise alignment of the DNA sequence using hypercomplex number representation. Bull Math Biol, 66(5), 1423-1438. doi:10.1016/j.bulm.2004.01.005
• SnipAcademy. (2017). Bioinformatics Lessons. Retrieved from https://binf.snipcademy.com/lessons/
• Sterky, F., & Lundeberg, J. (2000). Sequence analysis of genes and genomes. Journal of Biotechnology, 76(1), 1-31.
• van Dijk, E. L., Auger, H., Jaszczyszyn, Y., & Thermes, C. (2014). Ten years of next-generation sequencing technology. Trends Genet, 30(9), 418-426. doi:10.1016/j.tig.2014.07.001
• van Dijk, E. L., Jaszczyszyn, Y., & Thermes, C. (2014). Library preparation methods for next-generation sequencing: tone down the bias. Exp Cell Res, 322(1), 12-20. doi:10.1016/j.yexcr.2014.01.008
• Voelkerding, K. V., Dames, S. A., & Durtschi, J. D. (2009). Next-generation sequencing: from basic research to diagnostics. Clin Chem, 55(4), 641-658. doi:10.1373/clinchem.2008.112789
• Yohe, S., & Thyagarajan, B. (2017). Review of Clinical Next-Generation Sequencing. Arch Pathol Lab Med, 141(11), 1544-1557. doi:10.5858/arpa.2016-0501-RA