SAJAN SINGH RATHORE
Roll No. :- MCA/25019/18
- SEQUENCING: Sequencing is the operation of determining the precise order of
  nucleotides in a given DNA molecule. It is used to determine the order of the
  four bases adenine (A), guanine (G), cytosine (C), and thymine (T) in a
  strand of DNA.
- SEQUENCE ASSEMBLY: The process of aligning and merging fragments from a
  longer DNA sequence in order to reconstruct the original sequence is known
  as sequence assembly.
- The shotgun sequencing method, using Sanger sequencing, operates as follows:
  the target DNA molecule is broken into small fragments, each of which is
  sequenced.
- The sequence is assembled by searching for overlaps between the sequences of
  individual fragments.
- Whole-genome "shotgun" sequencing starts by copying and fragmenting the DNA.
- "Shotgun" refers to the random fragmentation of the whole genome.
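The overlap search described above can be sketched as a suffix-prefix comparison. This is only an illustration, not the method of any particular assembler: the function names `overlap` and `merge` and the `min_length` parameter are hypothetical, and the two reads are error-free reads taken from the example later in these slides.

```python
def overlap(a: str, b: str, min_length: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 when no overlap of at least `min_length` bases exists.
    """
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0


def merge(a: str, b: str, min_length: int = 3):
    """Join `a` and `b` on their suffix-prefix overlap, or return None."""
    n = overlap(a, b, min_length)
    return a + b[n:] if n else None


left = "GGCGTCTATATCTCG"
right = "TCTATATCTCGGCTCTAGG"
print(overlap(left, right))  # 11
print(merge(left, right))    # GGCGTCTATATCTCGGCTCTAGG
```

Exact matching like this only works for error-free reads; reads with sequencing errors need an overlap test that tolerates mismatches.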
EXAMPLE:-
Input:    GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT
Copy:     GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT
          GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT
          GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT
          GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT
Fragment: GGCGTCTA TATCTCGG CTCTAGGCCCTC ATTTTTT
          GGC GTCTATAT CTCGGCTCTAGGCCCTCA TTTTTT
          GGCGTC TATATCT CGGCTCTAGGCCCT CATTTTTT
          GGCGTCTAT ATCTCGGCTCTAG GCCCTCA TTTTTT
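The copy-and-fragment step in this example can be imitated with a few lines of Python. This is a rough sketch under stated assumptions: the function name `shotgun_fragments` and the parameters `copies`, `cuts_per_copy`, and `seed` are hypothetical, and real fragmentation is not limited to a fixed number of cuts per copy.

```python
import random


def shotgun_fragments(genome, copies=4, cuts_per_copy=3, seed=0):
    """Cut each copy of `genome` at random positions (illustrative only)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    all_fragments = []
    for _ in range(copies):
        # Choose distinct interior cut positions for this copy.
        cuts = sorted(rng.sample(range(1, len(genome)), cuts_per_copy))
        bounds = [0] + cuts + [len(genome)]
        all_fragments.append(
            [genome[start:end] for start, end in zip(bounds, bounds[1:])]
        )
    return all_fragments


genome = "GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT"
for fragments in shotgun_fragments(genome):
    print(" ".join(fragments))
```

Each printed row is one copy of the genome split into four fragments, so concatenating a row gives back the input.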
- Assume sequencing produces such a large number of fragments that almost all
  genome positions are covered by many fragments.
Reads:                     CTAGGCCCTCAATTTTT
                         CTCTAGGCCCTCAATTTTT
                       GGCTCTAGGCCCTCATTTTTT
                    CTCGGCTCTAGCCCCTCATTTT
                 TATCTCGACTCTAGGCCCTCA
                 TATCTCGACTCTAGGCC
             TCTATATCTCGGCTCTAGG
         GGCGTCTATATCTCG
         GGCGTCGATATCT
         GGCGTCTATATCT
Genome:  GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT   (reconstruct this)
- Overlapping: reads that cover the same stretch of the genome overlap one
  another.
- Coverage at a position is the number of reads that cover it; for example,
  coverage = 6 at a position covered by six reads.
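Per-position coverage for the reads in this example can be counted as sketched below. One assumption is made loudly here: the slide's column alignment is not available in the text, so each read is placed at the genome offset with the fewest mismatches (the helper `best_offset` is hypothetical and is not a real read aligner).

```python
genome = "GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT"
reads = [
    "CTAGGCCCTCAATTTTT",
    "CTCTAGGCCCTCAATTTTT",
    "GGCTCTAGGCCCTCATTTTTT",
    "CTCGGCTCTAGCCCCTCATTTT",
    "TATCTCGACTCTAGGCCCTCA",
    "TATCTCGACTCTAGGCC",
    "TCTATATCTCGGCTCTAGG",
    "GGCGTCTATATCTCG",
    "GGCGTCGATATCT",
    "GGCGTCTATATCT",
]


def best_offset(read, genome):
    """0-based start with the fewest mismatches (first such start on ties)."""
    def mismatches(start):
        window = genome[start:start + len(read)]
        return sum(a != b for a, b in zip(read, window))
    return min(range(len(genome) - len(read) + 1), key=mismatches)


# Count, for every genome position, how many reads cover it.
coverage = [0] * len(genome)
for read in reads:
    start = best_offset(read, genome)
    for pos in range(start, start + len(read)):
        coverage[pos] += 1

print(coverage)
# 1-based positions where exactly six reads overlap
print([i + 1 for i, c in enumerate(coverage) if c == 6])
```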
- Key term: coverage. Usually this is short for average coverage: the average
  number of reads covering a position in the genome.
  Total length of the 10 reads = 177 nucleotides
  Length of the genome = 35 nucleotides
  Average coverage = 177 / 35 ≈ 5.06
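The arithmetic above can be checked directly from the reads and genome given in the example; the variable names are illustrative only.

```python
genome = "GGCGTCTATATCTCGGCTCTAGGCCCTCATTTTTT"
reads = [
    "CTAGGCCCTCAATTTTT",
    "CTCTAGGCCCTCAATTTTT",
    "GGCTCTAGGCCCTCATTTTTT",
    "CTCGGCTCTAGCCCCTCATTTT",
    "TATCTCGACTCTAGGCCCTCA",
    "TATCTCGACTCTAGGCC",
    "TCTATATCTCGGCTCTAGG",
    "GGCGTCTATATCTCG",
    "GGCGTCGATATCT",
    "GGCGTCTATATCT",
]

# Average coverage = total number of sequenced bases / genome length.
total_bases = sum(len(read) for read in reads)
average_coverage = total_bases / len(genome)

print(total_bases, len(genome), round(average_coverage, 2))  # 177 35 5.06
```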
Using long reads :-
- Reads longer than 10,000 bp are common.
- Higher error rate (5-15%).
- Key computational challenge: overcoming the high error rate.
Long read assembly pipeline :-
- Reads → build overlap graph → layout (bundle stretches of the overlap graph
  into contigs) → consensus (pick the most likely nucleotide sequence for each
  contig) → contigs
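The consensus step of the pipeline can be sketched as a column-wise majority vote. This covers only the last step: the overlap graph and layout steps are skipped, and the start offsets of the reads are assumed to be already known from them. The function name `consensus` and the `(start, read)` input format are hypothetical; the four reads come from the earlier example.

```python
from collections import Counter


def consensus(placed_reads):
    """Majority base per column for reads given as (start_offset, read) pairs.

    Ties go to the base seen first in that column; uncovered columns get 'N'.
    """
    length = max(start + len(read) for start, read in placed_reads)
    columns = [Counter() for _ in range(length)]
    for start, read in placed_reads:
        for i, base in enumerate(read):
            columns[start + i][base] += 1
    return "".join(
        column.most_common(1)[0][0] if column else "N" for column in columns
    )


# Offsets are assumed to come from the layout step.
placed_reads = [
    (0, "GGCGTCTATATCTCG"),
    (0, "GGCGTCGATATCT"),   # contains one error (G instead of T)
    (0, "GGCGTCTATATCT"),
    (4, "TCTATATCTCGGCTCTAGG"),
]
print(consensus(placed_reads))  # GGCGTCTATATCTCGGCTCTAGG
```

The single erroneous base is outvoted by the other reads covering that column, which is why sufficient coverage matters for reads with a high error rate.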
Using short reads :-
- High accuracy, very high throughput.
- Short read length limits the ability to resolve repeats.
- Key computational challenge: efficiently assembling large numbers of short
  reads.
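Why short reads struggle with repeats can be shown with a toy case. The two sequences below are hypothetical and not from the slides: they differ only in the number of copies of a tandem repeat. Reads are idealized as the set of all error-free substrings of a fixed length, ignoring how often each read occurs.

```python
def read_set(genome, read_length):
    """All distinct substrings of `read_length` (idealized, error-free reads)."""
    return {
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    }


# Hypothetical genomes that differ only in the length of a TA repeat.
genome_a = "ACGTATATATGCA"
genome_b = "ACGTATATATATGCA"

# Reads shorter than the repeat cannot tell the two genomes apart ...
print(read_set(genome_a, 4) == read_set(genome_b, 4))  # True
# ... while reads long enough to span the repeat can.
print(read_set(genome_a, 8) == read_set(genome_b, 8))  # False
```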
Large genomes :-
- Short reads: ~10,000 bp contigs.
- Long reads: ~1,000,000 bp contigs.
- Long-read data is much more expensive.
Sequence Assembly
