BIRLA INSTITUTE OF TECHNOLOGY MESRA,
JAIPUR CAMPUS
NAME :- NIKHIL AGRAWAL
ROLL NO :- MCA/25004/18
TOPIC:- Sequence Alignment
Sequence Alignment
• Sequence alignment is a way of arranging sequences of DNA, RNA or
protein to identify regions of similarity. The similarity may indicate
functional, structural or evolutionary relationships between the
sequences.
• The known sequence is called the reference sequence. The unknown
sequence is called the query sequence.
Interpretation of sequence alignment
• Sequence alignment is useful for discovering structural, functional and
evolutionary information.
• Sequences that are very much alike may have similar secondary and 3D
structure, similar function and likely a common ancestral sequence. It is
extremely unlikely that such sequences became similar by chance. For two
random DNA sequences of n nucleotides, the probability of an exact match is
very low, $P = 4^{-n}$. For proteins the probability is even lower,
$P = 20^{-n}$, where n is the number of amino acid residues. Both figures
assume that every residue is equally likely at every position.
• Large-scale genome studies have revealed horizontal transfer of genes and
other sequences between species, which may cause similarity between some
sequences in very distant species.
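The chance-match probabilities quoted above can be checked with a few lines of Python. This is only a sketch: the function name `chance_identity_probability` is made up for illustration, and it assumes independent positions with equally likely residues, which real sequences do not strictly follow.

```python
from fractions import Fraction


def chance_identity_probability(alphabet_size: int, n: int) -> Fraction:
    """Probability that two random length-n sequences are identical.

    Assumption: positions are independent and every symbol of the
    alphabet is equally likely (a simplification of real sequences).
    """
    return Fraction(1, alphabet_size) ** n


# Hypothetical example length: n = 10 residues.
p_dna = chance_identity_probability(4, 10)       # 4 nucleotides
p_protein = chance_identity_probability(20, 10)  # 20 amino acids

print(f"DNA:     P = 4^-10  = {float(p_dna):.3e}")
print(f"Protein: P = 20^-10 = {float(p_protein):.3e}")
```

Even for n = 10 the protein probability is many orders of magnitude below the DNA one, which is why a long exact match is taken as evidence of a real relationship.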
Alignment
• Alignment is the task of locating “equivalent” regions of two or more
sequences to maximize their similarity.
• NIKESH NARAYANAN (mismatch: K vs. G at the third position)
• NIGESH NARAYAN-- (gaps: the last two positions)
• Alignment can reveal homology between sequences.
• Similarity is a descriptive term for the degree of match between
two sequences.
• Sequence similarity does not always imply a common function.
• Conserved function does not always imply similarity at the sequence level.
Scoring Alignments: The Main Principles
• Alignments of related sequences are expected to give good
scores compared with alignments of randomly chosen
sequences.
• The correct alignment of two related sequences should ideally
be the one that gives the best score.
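As a rough illustration of these principles, the sketch below scores an alignment that has already been made, column by column. The scoring values (+1 match, -1 mismatch, -2 gap) and the "unrelated" string are arbitrary assumptions chosen for the example; the slides do not specify a scoring scheme.

```python
def score_alignment(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Sum column scores of two already-aligned, equal-length strings.

    The default scores are illustrative assumptions, not standard values.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            score += gap        # a gap in either sequence
        elif x == y:
            score += match      # identical residues
        else:
            score += mismatch   # different residues
    return score


# The name example from the Alignment slide (space removed):
# 12 matches, 1 mismatch, 2 gaps.
related = score_alignment("NIKESHNARAYANAN", "NIGESHNARAYAN--")

# A made-up string sharing no position with the first one.
unrelated = score_alignment("NIKESHNARAYANAN", "QWPTLMVCDFGHKRY")

print(related, unrelated)
```

The related pair scores far higher than the unrelated pair, which is the behaviour the first principle asks of a scoring scheme.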
Classification of sequence alignments
• Based on completeness
  - Global
  - Local
• Based on the number of sequences
  - Pairwise alignment
  - Multiple sequence alignment
Global/local sequence alignment
1. Global alignment
• Input: treat the two sequences as potentially equivalent
• Goal: identify conserved regions and differences
• Algorithm: Needleman-Wunsch dynamic programming
• Applications:
  - Comparing two genes with the same function (e.g., in human vs. mouse).
  - Comparing two proteins with similar function.
Example:
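A minimal sketch of Needleman-Wunsch global alignment follows. It assumes a simple linear scoring scheme (+1 match, -1 mismatch, -1 gap) that is not taken from the slides, and it returns only one of the possibly many optimal alignments. The test sequences are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # b[:j] aligned against gaps only

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),  # align pair
                              score[i - 1][j] + gap,            # gap in b
                              score[i][j - 1] + gap)            # gap in a

    # Trace back from the bottom-right corner to rebuild one alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


total, top, bottom = needleman_wunsch("GATTACA", "GCATGCU")
print(total)
print(top)
print(bottom)
```

Because the trace-back starts in the last cell and ends in the first, every residue of both sequences appears in the result, which is what makes the alignment global.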
Global/local sequence alignment
2. Local alignment
• Input: the two sequences may or may not be related
• Goal: see whether a substring in one sequence aligns well with a substring
in the other
• Algorithm: Smith-Waterman dynamic programming
• Note: for local matching, overhangs at the ends are not treated as gaps
• Applications:
  - Searching for local similarities in large sequences (e.g., newly sequenced
genomes).
  - Looking for conserved domains or motifs in two proteins.
Example:
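A minimal sketch of Smith-Waterman local alignment is given below. The scores (+2 match, -1 mismatch, -2 gap) and the two test strings are assumptions for illustration. The differences from the global algorithm are that cell scores are floored at zero and that the trace-back starts at the highest-scoring cell and stops at the first zero.

```python
def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -2):
    """Best local alignment; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(0,                               # start afresh
                              score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical sequences that share only the short region "ACG".
print(smith_waterman("TTTACGTTT", "GGACGGG"))
```

The unaligned ends of both sequences cost nothing here, which matches the note above that overhangs are not treated as gaps.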
Pairwise/multiple sequence alignment
• Pairwise sequence alignment
  - The process of lining up two sequences to achieve maximal levels of
identity (and conservation, for amino acid sequences) for the purpose
of assessing the degree of similarity and the possibility of homology.
  - A pairwise sequence alignment is an alignment of 2 sequences
obtained by inserting gaps (“-”) such that the resulting sequences
have the same length and each pair of residues represents a
homologous position.
Pairwise/multiple sequence alignment
• Multiple sequence alignment (MSA)
  - Multiple sequence alignment (MSA) can be seen as a generalization of
pairwise sequence alignment: instead of aligning two sequences, n sequences
are aligned simultaneously, where n > 2.
  - Definition: a multiple sequence alignment is an alignment of n > 2 sequences
obtained by inserting gaps (“-”) into the sequences such that the resulting
sequences all have length L and can be arranged in a matrix of n rows and L
columns, where each column represents a homologous position.
  - To construct a multiple alignment, one may have to introduce gaps in
sequences at positions where there were no gaps in the corresponding
pairwise alignment. Multiple alignments therefore typically contain more gaps
than any given pair of aligned sequences.
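The matrix view in the definition above can be sketched as follows. The three aligned rows are invented for illustration, and `msa_columns` and `consensus` are hypothetical helper names. The code only inspects an alignment that already exists; it does not compute one.

```python
from collections import Counter


def msa_columns(rows: list[str]) -> list[str]:
    """Return the columns of an MSA given as n > 2 gapped rows of length L."""
    if len(rows) <= 2:
        raise ValueError("a multiple alignment needs more than two sequences")
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length L")
    return ["".join(col) for col in zip(*rows)]


def consensus(rows: list[str]) -> str:
    """Most common non-gap residue in each column ('-' if only gaps)."""
    out = []
    for col in msa_columns(rows):
        residues = [c for c in col if c != "-"]
        out.append(Counter(residues).most_common(1)[0][0] if residues else "-")
    return "".join(out)


# Hypothetical alignment: n = 3 rows, L = 5 columns.
alignment = ["ACG-T",
             "ACGAT",
             "A-GAT"]

print(msa_columns(alignment))
print(consensus(alignment))
```

Each string returned by `msa_columns` is one column of the n-by-L matrix, that is, one set of positions treated as homologous.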
Which algorithm to use for database similarity search?
• Speed: BLAST > FASTA > Smith-Waterman (which is very slow
and uses a lot of computing power).
• FASTA is more sensitive than BLAST and misses fewer homologues.
• Smith-Waterman is even more sensitive.
• BLAST (Basic Local Alignment Search Tool) calculates
probabilities for the matches it reports.
• FASTA is more accurate than BLAST for DNA-DNA
searches.
Methods of sequence alignment
• Dot matrix method
• The dynamic programming (DP) algorithm
• Word or k-tuple methods
Dot matrix analysis
• A dot matrix is a grid in which matching nucleotides of two DNA sequences
are represented as dots.
• It is also called a dot plot.
• It is a pairwise sequence comparison made on the computer.
• Every cell of the grid starts out blank on the computer screen.
• In a dot matrix, the nucleotides of one sequence are written from left to right
along the top row and those of the other sequence from top to bottom down the
left side (column) of the matrix. At every point where the two nucleotides are the
same, the cell at the intersection of the row and column becomes a dark dot. Taken
together, the dark dots form a graph called a dot plot, which is a type of
recurrence plot. Runs of dots along a diagonal indicate regions where the two
sequences match. Each dot in the plot represents a matching nucleotide or amino
acid.
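The construction described above can be sketched in a few lines. The function names are illustrative, `*` stands in for a dark dot and `.` for an empty cell, and no window or threshold filtering is applied, although practical dot-plot tools often add it to reduce noise.

```python
def dot_matrix(seq_x: str, seq_y: str) -> list[list[bool]]:
    """matrix[row][col] is True where seq_y[row] equals seq_x[col]."""
    return [[x == y for x in seq_x] for y in seq_y]


def render(seq_x: str, seq_y: str) -> str:
    """Text rendering: seq_x along the top, seq_y down the left side."""
    lines = ["  " + " ".join(seq_x)]
    for y, row in zip(seq_y, dot_matrix(seq_x, seq_y)):
        lines.append(y + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Comparing a hypothetical sequence with itself gives a full main diagonal.
print(render("GATTACA", "GATTACA"))
```

Dots off the main diagonal in a self-comparison mark repeated residues, which is how repeats show up in a dot plot.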
Dot matrix analysis
• The dot matrix method is a qualitative and simple way to analyze
sequences; however, it takes a long time to analyze large
sequences.
• The dot matrix method is useful for studying:
  - Sequence similarity between two nucleotide sequences or two
amino acid sequences.
  - Insertion of short stretches in a DNA or amino acid sequence.
  - Deletion of short stretches from a DNA or amino acid sequence.
  - Repeats or inverted repeats in a DNA or amino acid sequence.
Dot matrix analysis: two similar sequences
• Figure: nucleic acid dot plots of genes
Dynamic Programming Method
• Dynamic programming is a method for solving problems in which the best
decisions must be found one after another.
• It was introduced by Richard Bellman in the 1950s.
• The word programming here denotes finding an acceptable plan of action, not
computer programming.
• It is useful in aligning the nucleotide sequences of DNA and the amino acid
sequences of the proteins coded by that DNA.
• Dynamic programming is a three-step process that involves:
1. Breaking the problem into small subproblems.
2. Solving the subproblems using recursive methods.
3. Constructing an optimal solution for the original problem from the optimal
solutions of the subproblems.
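The three steps can be seen in a small related problem, the length of the longest common subsequence of two strings. This example is swapped in for brevity and is not an alignment algorithm from the slides. It uses recursion with a cache so that each subproblem is solved only once.

```python
from functools import lru_cache


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""

    @lru_cache(maxsize=None)
    def solve(i: int, j: int) -> int:
        # Step 1: the subproblem is "LCS of the suffixes a[i:] and b[j:]".
        if i == len(a) or j == len(b):
            return 0
        # Step 2: solve it recursively from smaller subproblems.
        if a[i] == b[j]:
            return 1 + solve(i + 1, j + 1)
        return max(solve(i + 1, j), solve(i, j + 1))

    # Step 3: the answer to the original problem is the largest subproblem.
    return solve(0, 0)


print(lcs_length("AGGTAB", "GXTXAYB"))
```

Without the cache the same suffix pairs would be recomputed many times; storing each result once is what separates dynamic programming from plain recursion.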
Dynamic programming algorithm for sequence alignment
• The method compares every pair of characters in the two sequences and
generates the best (optimal) alignment.
• This is a highly computationally demanding method. However, recent
algorithmic improvements and ever-increasing computer capacity make it
possible to align a query sequence against a large database in a few minutes.
• Each alignment has its own score, and it is essential to recognize that several
different alignments may have identical or nearly identical scores, so
dynamic programming methods may produce more than one optimal
alignment. Careful adjustment of the scoring parameters is therefore important
and may help discriminate between alignments with similar scores.
• Global alignment programs are based on the Needleman-Wunsch algorithm and
local alignment programs on the Smith-Waterman algorithm. Both are derived
from the basic dynamic programming algorithm.
Word Method or K-tuple Method
• It is a heuristic method that is not guaranteed to find an optimal
alignment, but it is much more efficient than dynamic programming.
• This method is useful in large-scale database searches to find
whether there is a significant match to the query
sequence.
• The word method is used in the database search tools FASTA and the
BLAST family.
• They identify a series of short, non-overlapping subsequences
(words) in the query sequence.
• These words are then matched against candidate database sequences to
produce the result.
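The word-matching step can be sketched as below. This is a toy illustration with invented sequences and helper names, not the actual FASTA or BLAST implementation, which add scoring, hit extension and statistics on top of this idea.

```python
from collections import defaultdict


def words(seq: str, k: int, step: int) -> list[tuple[int, str]]:
    """(position, word) pairs of length k, taken every `step` residues."""
    return [(i, seq[i:i + k]) for i in range(0, len(seq) - k + 1, step)]


def word_hits(query: str, subject: str, k: int = 3) -> list[tuple[int, int, str]]:
    """Exact matches of non-overlapping query words in the subject.

    Returns (query_position, subject_position, word) triples.
    """
    # Index every word of the database (subject) sequence by its positions.
    index = defaultdict(list)
    for pos, w in words(subject, k, step=1):
        index[w].append(pos)
    hits = []
    # step=k makes the query words non-overlapping, as described above.
    for qpos, w in words(query, k, step=k):
        for spos in index.get(w, []):
            hits.append((qpos, spos, w))
    return hits


# Hypothetical query and database sequence.
hits = word_hits("ACGTTGCA", "TTACGTTGCATT", k=4)
print(hits)
```

Hits whose subject position minus query position is the same lie on one diagonal, and such runs are the candidates a search tool would go on to extend and score.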
Word Method or K-tuple Method
• In the FASTA method, the user defines a value k to use as the word length
when searching the database. The search is slower but more sensitive at lower
values of k. Lower values are also preferred for searches involving a very
short query sequence.
• BLAST provides a number of algorithms optimized for particular types
of queries, such as searching for distantly related sequence matches.
• It is a good alternative to FASTA. However, its results are not always
very accurate.
THANK YOU.

More Related Content

What's hot

Needleman-Wunsch Algorithm
Needleman-Wunsch AlgorithmNeedleman-Wunsch Algorithm
Needleman-Wunsch AlgorithmProshantaShil
 
Scoring matrices
Scoring matricesScoring matrices
Scoring matricesAshwini
 
Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-naveed ul mushtaq
 
History and scope in bioinformatics
History and scope in bioinformaticsHistory and scope in bioinformatics
History and scope in bioinformaticsKAUSHAL SAHU
 
Blast and fasta
Blast and fastaBlast and fasta
Blast and fastaALLIENU
 
Genome sequencing
Genome sequencingGenome sequencing
Genome sequencingShital Pal
 
MULTIPLE SEQUENCE ALIGNMENT
MULTIPLE  SEQUENCE  ALIGNMENTMULTIPLE  SEQUENCE  ALIGNMENT
MULTIPLE SEQUENCE ALIGNMENTMariya Raju
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignmentRamya S
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence AlignmentRavi Gandham
 
sequence alignment
sequence alignmentsequence alignment
sequence alignmentammar kareem
 
History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)Madan Kumar Ca
 
Sequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsSequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsNikesh Narayanan
 

What's hot (20)

Needleman-Wunsch Algorithm
Needleman-Wunsch AlgorithmNeedleman-Wunsch Algorithm
Needleman-Wunsch Algorithm
 
Scoring matrices
Scoring matricesScoring matrices
Scoring matrices
 
dot plot analysis
dot plot analysisdot plot analysis
dot plot analysis
 
Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-
 
Proteins databases
Proteins databasesProteins databases
Proteins databases
 
History and scope in bioinformatics
History and scope in bioinformaticsHistory and scope in bioinformatics
History and scope in bioinformatics
 
SEQUENCE ANALYSIS
SEQUENCE ANALYSISSEQUENCE ANALYSIS
SEQUENCE ANALYSIS
 
Blast and fasta
Blast and fastaBlast and fasta
Blast and fasta
 
NCBI
NCBINCBI
NCBI
 
Genome sequencing
Genome sequencingGenome sequencing
Genome sequencing
 
MULTIPLE SEQUENCE ALIGNMENT
MULTIPLE  SEQUENCE  ALIGNMENTMULTIPLE  SEQUENCE  ALIGNMENT
MULTIPLE SEQUENCE ALIGNMENT
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence Alignment
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
UPGMA
UPGMAUPGMA
UPGMA
 
sequence alignment
sequence alignmentsequence alignment
sequence alignment
 
History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)History and devolopment of bioinfomatics.ppt (1)
History and devolopment of bioinfomatics.ppt (1)
 
Sequence Alignment In Bioinformatics
Sequence Alignment In BioinformaticsSequence Alignment In Bioinformatics
Sequence Alignment In Bioinformatics
 
Est database
Est databaseEst database
Est database
 
Fasta
FastaFasta
Fasta
 

Similar to Sequence Alignment

Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...journal ijrtem
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...IJRTEMJOURNAL
 
Sequencealignmentinbioinformatics 100204112518-phpapp02
Sequencealignmentinbioinformatics 100204112518-phpapp02Sequencealignmentinbioinformatics 100204112518-phpapp02
Sequencealignmentinbioinformatics 100204112518-phpapp02PILLAI ASWATHY VISWANATH
 
Bioinformatics_Sequence Analysis
Bioinformatics_Sequence AnalysisBioinformatics_Sequence Analysis
Bioinformatics_Sequence AnalysisSangeeta Das
 
International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...IJCSEIT Journal
 
Laboratory 1 sequence_alignments
Laboratory 1 sequence_alignmentsLaboratory 1 sequence_alignments
Laboratory 1 sequence_alignmentsseham15
 
A Comparison of Computation Techniques for DNA Sequence Comparison
A Comparison of Computation Techniques for DNA Sequence Comparison A Comparison of Computation Techniques for DNA Sequence Comparison
A Comparison of Computation Techniques for DNA Sequence Comparison IJORCS
 
Lecture 5.pptx
Lecture 5.pptxLecture 5.pptx
Lecture 5.pptxericndunek
 
multiple sequence and pairwise alignment.pdf
multiple sequence and pairwise alignment.pdfmultiple sequence and pairwise alignment.pdf
multiple sequence and pairwise alignment.pdfsriaisvariyasundar
 
AI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfAI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfH K Yoon
 
Introduction to sequence alignment
Introduction to sequence alignmentIntroduction to sequence alignment
Introduction to sequence alignmentKubuldinho
 
Blast gp assignment
Blast  gp assignmentBlast  gp assignment
Blast gp assignmentbarathvaj
 
Performance Efficient DNA Sequence Detectionalgo
Performance Efficient DNA Sequence DetectionalgoPerformance Efficient DNA Sequence Detectionalgo
Performance Efficient DNA Sequence DetectionalgoRahul Shirude
 

Similar to Sequence Alignment (20)

Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
Sequence Analysis
Sequence AnalysisSequence Analysis
Sequence Analysis
 
Seq alignment
Seq alignment Seq alignment
Seq alignment
 
Automatic Parallelization for Parallel Architectures Using Smith Waterman Alg...
Automatic Parallelization for Parallel Architectures Using Smith Waterman Alg...Automatic Parallelization for Parallel Architectures Using Smith Waterman Alg...
Automatic Parallelization for Parallel Architectures Using Smith Waterman Alg...
 
Sequence alignment.pptx
Sequence alignment.pptxSequence alignment.pptx
Sequence alignment.pptx
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
 
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
Performance Improvement of BLAST with Use of MSA Techniques to Search Ancesto...
 
Sequence database
Sequence databaseSequence database
Sequence database
 
Sequencealignmentinbioinformatics 100204112518-phpapp02
Sequencealignmentinbioinformatics 100204112518-phpapp02Sequencealignmentinbioinformatics 100204112518-phpapp02
Sequencealignmentinbioinformatics 100204112518-phpapp02
 
Bioinformatics_Sequence Analysis
Bioinformatics_Sequence AnalysisBioinformatics_Sequence Analysis
Bioinformatics_Sequence Analysis
 
International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...International Journal of Computer Science, Engineering and Information Techno...
International Journal of Computer Science, Engineering and Information Techno...
 
Laboratory 1 sequence_alignments
Laboratory 1 sequence_alignmentsLaboratory 1 sequence_alignments
Laboratory 1 sequence_alignments
 
A Comparison of Computation Techniques for DNA Sequence Comparison
A Comparison of Computation Techniques for DNA Sequence Comparison A Comparison of Computation Techniques for DNA Sequence Comparison
A Comparison of Computation Techniques for DNA Sequence Comparison
 
Lecture 5.pptx
Lecture 5.pptxLecture 5.pptx
Lecture 5.pptx
 
multiple sequence and pairwise alignment.pdf
multiple sequence and pairwise alignment.pdfmultiple sequence and pairwise alignment.pdf
multiple sequence and pairwise alignment.pdf
 
Parwati sihag
Parwati sihagParwati sihag
Parwati sihag
 
AI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfAI 바이오 (4일차).pdf
AI 바이오 (4일차).pdf
 
Introduction to sequence alignment
Introduction to sequence alignmentIntroduction to sequence alignment
Introduction to sequence alignment
 
Blast gp assignment
Blast  gp assignmentBlast  gp assignment
Blast gp assignment
 
Performance Efficient DNA Sequence Detectionalgo
Performance Efficient DNA Sequence DetectionalgoPerformance Efficient DNA Sequence Detectionalgo
Performance Efficient DNA Sequence Detectionalgo
 

More from Meghaj Mallick

PORTFOLIO BY USING HTML & CSS
PORTFOLIO BY USING HTML & CSSPORTFOLIO BY USING HTML & CSS
PORTFOLIO BY USING HTML & CSSMeghaj Mallick
 
Introduction to Software Testing
Introduction to Software TestingIntroduction to Software Testing
Introduction to Software TestingMeghaj Mallick
 
Introduction to System Programming
Introduction to System ProgrammingIntroduction to System Programming
Introduction to System ProgrammingMeghaj Mallick
 
Icons, Image & Multimedia
Icons, Image & MultimediaIcons, Image & Multimedia
Icons, Image & MultimediaMeghaj Mallick
 
Project Tracking & SPC
Project Tracking & SPCProject Tracking & SPC
Project Tracking & SPCMeghaj Mallick
 
Architecture and security in Vanet PPT
Architecture and security in Vanet PPTArchitecture and security in Vanet PPT
Architecture and security in Vanet PPTMeghaj Mallick
 
Design Model & User Interface Design in Software Engineering
Design Model & User Interface Design in Software EngineeringDesign Model & User Interface Design in Software Engineering
Design Model & User Interface Design in Software EngineeringMeghaj Mallick
 
Text Mining of Twitter in Data Mining
Text Mining of Twitter in Data MiningText Mining of Twitter in Data Mining
Text Mining of Twitter in Data MiningMeghaj Mallick
 
DFS & BFS in Computer Algorithm
DFS & BFS in Computer AlgorithmDFS & BFS in Computer Algorithm
DFS & BFS in Computer AlgorithmMeghaj Mallick
 
Software Development Method
Software Development MethodSoftware Development Method
Software Development MethodMeghaj Mallick
 
Secant method in Numerical & Statistical Method
Secant method in Numerical & Statistical MethodSecant method in Numerical & Statistical Method
Secant method in Numerical & Statistical MethodMeghaj Mallick
 
Motivation in Organization
Motivation in OrganizationMotivation in Organization
Motivation in OrganizationMeghaj Mallick
 
Partial-Orderings in Discrete Mathematics
 Partial-Orderings in Discrete Mathematics Partial-Orderings in Discrete Mathematics
Partial-Orderings in Discrete MathematicsMeghaj Mallick
 
Hashing In Data Structure
Hashing In Data Structure Hashing In Data Structure
Hashing In Data Structure Meghaj Mallick
 

More from Meghaj Mallick (20)

24 partial-orderings
24 partial-orderings24 partial-orderings
24 partial-orderings
 
PORTFOLIO BY USING HTML & CSS
PORTFOLIO BY USING HTML & CSSPORTFOLIO BY USING HTML & CSS
PORTFOLIO BY USING HTML & CSS
 
Introduction to Software Testing
Introduction to Software TestingIntroduction to Software Testing
Introduction to Software Testing
 
Introduction to System Programming
Introduction to System ProgrammingIntroduction to System Programming
Introduction to System Programming
 
MACRO ASSEBLER
MACRO ASSEBLERMACRO ASSEBLER
MACRO ASSEBLER
 
Icons, Image & Multimedia
Icons, Image & MultimediaIcons, Image & Multimedia
Icons, Image & Multimedia
 
Project Tracking & SPC
Project Tracking & SPCProject Tracking & SPC
Project Tracking & SPC
 
Peephole Optimization
Peephole OptimizationPeephole Optimization
Peephole Optimization
 
Routing in MANET
Routing in MANETRouting in MANET
Routing in MANET
 
Macro assembler
 Macro assembler Macro assembler
Macro assembler
 
Architecture and security in Vanet PPT
Architecture and security in Vanet PPTArchitecture and security in Vanet PPT
Architecture and security in Vanet PPT
 
Design Model & User Interface Design in Software Engineering
Design Model & User Interface Design in Software EngineeringDesign Model & User Interface Design in Software Engineering
Design Model & User Interface Design in Software Engineering
 
Text Mining of Twitter in Data Mining
Text Mining of Twitter in Data MiningText Mining of Twitter in Data Mining
Text Mining of Twitter in Data Mining
 
DFS & BFS in Computer Algorithm
DFS & BFS in Computer AlgorithmDFS & BFS in Computer Algorithm
DFS & BFS in Computer Algorithm
 
Software Development Method
Software Development MethodSoftware Development Method
Software Development Method
 
Secant method in Numerical & Statistical Method
Secant method in Numerical & Statistical MethodSecant method in Numerical & Statistical Method
Secant method in Numerical & Statistical Method
 
Motivation in Organization
Motivation in OrganizationMotivation in Organization
Motivation in Organization
 
Communication Skill
Communication SkillCommunication Skill
Communication Skill
 
Partial-Orderings in Discrete Mathematics
 Partial-Orderings in Discrete Mathematics Partial-Orderings in Discrete Mathematics
Partial-Orderings in Discrete Mathematics
 
Hashing In Data Structure
Hashing In Data Structure Hashing In Data Structure
Hashing In Data Structure
 

Recently uploaded

Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoKayode Fayemi
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxraffaeleoman
 
Digital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalDigital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalFabian de Rijk
 
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...amilabibi1
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfSenaatti-kiinteistöt
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatmentnswingard
 
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...Pooja Nehwal
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfSkillCertProExams
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Baileyhlharris
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar TrainingKylaCullinane
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIINhPhngng3
 
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verifiedDelhi Call girls
 
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Delhi Call girls
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaKayode Fayemi
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lodhisaajjda
 
Causes of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCauses of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCamilleBoulbin1
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Vipesco
 

Recently uploaded (18)

Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
Digital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of DrupalDigital collaboration with Microsoft 365 as extension of Drupal
Digital collaboration with Microsoft 365 as extension of Drupal
 
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
Bring back lost lover in USA, Canada ,Uk ,Australia ,London Lost Love Spell C...
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
 
Dreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video TreatmentDreaming Marissa Sánchez Music Video Treatment
Dreaming Marissa Sánchez Music Video Treatment
 
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
Aesthetic Colaba Mumbai Cst Call girls 📞 7738631006 Grant road Call Girls ❤️-...
 
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdfAWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
AWS Data Engineer Associate (DEA-C01) Exam Dumps 2024.pdf
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Noida Escorts | 100% verified
 
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
 
If this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New NigeriaIf this Giant Must Walk: A Manifesto for a New Nigeria
If this Giant Must Walk: A Manifesto for a New Nigeria
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
Causes of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCauses of poverty in France presentation.pptx
Causes of poverty in France presentation.pptx
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 

Sequence Alignment

  • 1. BIRLA INSTITUTE OF TECHNOLOGY MESRA, JAIPUR CAMPUS NAME :- NIKHIL AGRAWAL ROLL NO :- MCA/25004/18 TOPIC:- Sequence Alignment
  • 2. Sequence Alignment  Sequence alignment is a way of arranging sequences of DNA,RNA or protein to identify regions of similarity .The similarity may indicate the functional , structural and evolutionary significance of the sequence.  The known sequence is called reference sequence . The unknown sequence is called query sequence.
  • 3. Interpretation of sequence alignment  Sequence alignment is useful for discovering structural, functional and evolutionary information.  Sequences that are very much alike may have similar secondary and 3D structure, similar function and likely a common ancestral sequence. It is extremely unlikely that such sequences obtained similarity by chance. For DNA molecules with n nucleotides such probability is very low P = 4-n . For proteins the probability even much lower P = 20-n, where n is a number of amino acid residues.  Large scale genome studies revealed existence of horizontal transfer of genes and other sequences between species, which may cause similarity between some sequences in very distant species.
  • 4. Alignment
      ◦ Alignment is the task of locating “equivalent” regions of two or more sequences to maximize their similarity.
      ◦ Example (K/G is a mismatch, shown in red on the slide; “-” marks a gap):
          NIKESH NARAYANAN
          NIGESH NARAYAN--
      ◦ Alignment can reveal homology between sequences.
      ◦ Similarity is a descriptive term for the degree of match between two sequences.
      ◦ Sequence similarity does not always imply a common function.
      ◦ Conserved function does not always imply similarity at the sequence level.
  • 5. Scoring Alignments: The Main Principles
      ◦ Alignments of related sequences are expected to give good scores compared with alignments of randomly chosen sequences.
      ◦ The correct alignment of two related sequences should ideally be the one that gives the best score.
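
To make the idea of an alignment score concrete, the sketch below scores an already-aligned pair of strings column by column. The scoring values (+1 match, -1 mismatch, -2 gap) are illustrative assumptions, not values given in the slides, and the example strings are the names from slide 4 with the space removed.

```python
def score_alignment(a: str, b: str, match: int = 1,
                    mismatch: int = -1, gap: int = -2) -> int:
    """Sum column scores of two aligned strings of equal length ('-' is a gap)."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    total = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            total += gap        # one side is a gap
        elif x == y:
            total += match      # identical residues
        else:
            total += mismatch   # different residues
    return total


# 12 matches, 1 mismatch (K/G), 2 gaps: 12 - 1 - 4 = 7
print(score_alignment("NIKESHNARAYANAN", "NIGESHNARAYAN--"))  # 7
```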
  • 6. Classifications of sequence alignments
      ◦ Based on completeness
          ▪ Global
          ▪ Local
      ◦ Based on the number of sequences
          ▪ Pairwise alignment
          ▪ Multiple sequence alignment
  • 7. Global/local sequence alignment: 1. Global alignment
      ◦ Input: treat the two sequences as potentially equivalent.
      ◦ Goal: identify conserved regions and differences.
      ◦ Algorithm: Needleman-Wunsch dynamic programming.
      ◦ Applications:
          ▪ Comparing two genes with the same function (in human vs. mouse).
          ▪ Comparing two proteins with similar function.
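
A compact sketch of Needleman-Wunsch global alignment follows. It assumes a simple linear gap penalty and fixed match/mismatch scores (+1, -1, -2), which are illustrative choices; practical tools typically use substitution matrices and affine gap penalties. When several alignments tie for the best score, this traceback returns just one of them.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment of a and b."""
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] against nothing: all gaps
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```

Because the alignment is global, every residue of both inputs appears in the output, and end gaps are penalized like any other gap.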
  • 9. Global/local sequence alignment: 2. Local alignment
      ◦ Input: the two sequences may or may not be related.
      ◦ Goal: see whether a substring in one sequence aligns well with a substring in the other.
      ◦ Algorithm: Smith-Waterman dynamic programming.
      ◦ Note: for local matching, overhangs at the ends are not treated as gaps.
      ◦ Applications:
          ▪ Searching for local similarities in large sequences (e.g., newly sequenced genomes).
          ▪ Looking for conserved domains or motifs in two proteins.
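
The local variant can be sketched in the same style. The differences from the global case are that cell scores are never allowed to drop below zero and that the traceback starts at the highest-scoring cell and stops at the first zero, so unaligned overhangs cost nothing. The scoring values (+2, -1, -2) are again assumptions chosen for illustration.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best local alignment."""
    n, m = len(a), len(b)

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] is the best score of a local alignment ending at a[i-1], b[j-1].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                0,                                # start a fresh alignment here
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until the score falls to zero.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        if score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


print(smith_waterman("XXXACGTXXX", "YYACGTYY"))  # (8, 'ACGT', 'ACGT')
```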
  • 11. Pairwise/multiple sequence alignment: pairwise sequence alignment
      ◦ The process of lining up two sequences to achieve maximal levels of identity (and conservation, for amino acid sequences) for the purpose of assessing the degree of similarity and the possibility of homology.
      ◦ A pairwise sequence alignment is an alignment of two sequences obtained by inserting gaps (“-”) such that the resulting sequences have the same length and each pair of residues represents a homologous position.
  • 12. Pairwise/multiple sequence alignment: multiple sequence alignment (MSA)
      ◦ Multiple sequence alignment (MSA) can be seen as a generalization of pairwise sequence alignment: instead of aligning two sequences, n sequences are aligned simultaneously, where n > 2.
      ◦ Definition: a multiple sequence alignment is an alignment of n > 2 sequences obtained by inserting gaps (“-”) into the sequences such that the resulting sequences all have length L and can be arranged in a matrix of n rows and L columns, where each column represents a homologous position.
      ◦ To construct a multiple alignment, one may have to introduce gaps in sequences at positions where there were no gaps in the corresponding pairwise alignment. Multiple alignments therefore typically contain more gaps than any given pair of aligned sequences.
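
The matrix view in the definition above can be illustrated with a small sketch that checks the n-rows-by-L-columns shape and reads the alignment column by column. The three example rows are invented for illustration, and `msa_columns` is a hypothetical helper, not part of any library.

```python
def msa_columns(rows):
    """Validate an MSA given as gapped strings and return its columns."""
    if len(rows) < 3:
        raise ValueError("a multiple alignment needs n > 2 sequences")
    if len({len(r) for r in rows}) != 1:
        raise ValueError("all aligned rows must have the same length L")
    # zip(*rows) transposes the n x L matrix into L columns of n residues.
    return ["".join(col) for col in zip(*rows)]


rows = ["ACG-T",
        "A-GCT",
        "ACGCT"]
print(msa_columns(rows))  # ['AAA', 'C-C', 'GGG', '-CC', 'TTT']
```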
  • 13. Which algorithm to use for database similarity search?
      ◦ Speed: BLAST > FASTA > Smith-Waterman (Smith-Waterman is very slow and uses a lot of computing power).
      ◦ FASTA is more sensitive than BLAST and misses fewer homologues.
      ◦ Smith-Waterman is even more sensitive.
      ◦ BLAST (Basic Local Alignment Search Tool) calculates probabilities.
      ◦ FASTA is more accurate than BLAST for DNA-DNA searches.
  • 14. Methods of sequence alignment
      ◦ Dot matrix method
      ◦ The dynamic programming (DP) algorithm
      ◦ Word or k-tuple methods
  • 15. Dot matrix analysis
      ◦ A dot matrix is a grid in which matching nucleotides of two DNA sequences are represented as dots.
      ◦ It is also called a dot plot.
      ◦ It is a pairwise sequence alignment made on the computer.
      ◦ The dots appear as colorless dots on the computer screen.
      ◦ In a dot matrix, the nucleotides of one sequence are written from left to right along the top row, and those of the other sequence are written from top to bottom down the left side (column) of the matrix. At every point where the two nucleotides are the same, the dot at the intersection of row and column becomes a dark dot. When all the darkened dots are connected, the result is a graph called a dot plot. The line found in the dot plot is called a recurrence plot. Each dot in the plot represents a matching nucleotide or amino acid.
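
The construction described above can be sketched as a small text-mode dot plot. This version marks every identical pair with `*` and applies no window or threshold filtering, which real dot plot tools often add to suppress noise; the function names are illustrative.

```python
def dot_matrix(top: str, left: str):
    """grid[i][j] is True where left[i] and top[j] are the same residue."""
    return [[t == l for t in top] for l in left]


def render(top: str, left: str) -> str:
    """Draw the matrix with `top` along the first row and `left` down the side."""
    lines = ["  " + " ".join(top)]
    for residue, row in zip(left, dot_matrix(top, left)):
        lines.append(residue + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


print(render("GATTACA", "GATCACA"))
```

Similar regions show up as runs of `*` along a diagonal; a shift from one diagonal to a neighboring one indicates an insertion or deletion.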
  • 16. Dot matrix analysis
      ◦ The dot matrix method is a qualitative and simple way to analyze sequences; however, it takes a long time to analyze large sequences.
      ◦ The dot matrix method is useful for studying:
          ▪ Sequence similarity between two nucleotide sequences or two amino acid sequences.
          ▪ Insertion of short stretches in a DNA or amino acid sequence.
          ▪ Deletion of short stretches from a DNA or amino acid sequence.
          ▪ Repeats or inverted repeats in a DNA or amino acid sequence.
  • 17. Dot matrix analysis: two similar sequences
      ◦ [Figure: nucleic acid dot plots of genes]
  • 18. Dynamic Programming Method
      ◦ Dynamic programming is a method for solving problems in which one needs to find the best decisions one after another.
      ◦ It was introduced by Richard Bellman in the 1950s.
      ◦ The word "programming" here denotes finding an acceptable plan of action, not computer programming.
      ◦ It is useful for aligning the nucleotide sequences of DNA and the amino acid sequences of the proteins coded by that DNA.
      ◦ Dynamic programming is a three-step process that involves:
          1. Breaking the problem into small subproblems.
          2. Solving the subproblems using recursive methods.
          3. Constructing an optimal solution for the original problem from the optimal solutions of the subproblems.
  • 19. Dynamic programming algorithm for sequence alignment
      ◦ The method compares every pair of characters in the two sequences and generates the best (optimal) alignment.
      ◦ This is a highly computationally demanding method. However, the latest algorithmic improvements and ever-increasing computer capacity make it possible to align a query sequence against a large database in a few minutes.
      ◦ Each alignment has its own score, and it is essential to recognize that several different alignments may have nearly identical scores, which indicates that dynamic programming methods may produce more than one optimal alignment. Careful choice of the scoring parameters is therefore important and may help discriminate between alignments with similar scores.
      ◦ Global alignment programs are based on the Needleman-Wunsch algorithm and local alignment programs on the Smith-Waterman algorithm. Both are derived from the basic dynamic programming algorithm.
  • 20. Word Method or K-tuple Method
      ◦ It is a heuristic method: it is not guaranteed to find an optimal alignment, but it is much more efficient than dynamic programming.
      ◦ This method is useful in large-scale database searches to find whether there is a significant match to the query sequence.
      ◦ The word method is used in the database search tools FASTA and the BLAST family.
      ◦ These tools identify a series of short, non-overlapping subsequences (words) of the query sequence.
      ◦ The words are then matched against candidate database sequences to produce the result.
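
The word-matching step can be sketched as follows. This is a deliberately simplified illustration that follows the slide's description (non-overlapping query words, exact matches only); it is not how FASTA or BLAST are actually implemented, since those tools add scoring, hit extension, and significance statistics on top of seeding. The names `kmer_index` and `seed_hits` are hypothetical.

```python
from collections import defaultdict


def kmer_index(seq: str, k: int):
    """Map every length-k word in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def seed_hits(query: str, target: str, k: int):
    """Return (query_pos, target_pos) pairs where a query word occurs in target."""
    index = kmer_index(target, k)
    hits = []
    # Step through the query in non-overlapping words of length k.
    for q in range(0, len(query) - k + 1, k):
        for t in index.get(query[q:q + k], []):
            hits.append((q, t))
    return hits


print(seed_hits("ACGTAC", "TTACGTACGG", 3))
# [(0, 2), (0, 6), (3, 1), (3, 5)]
```

Hits that share the same offset (target position minus query position), such as (0, 2) and (3, 5) here, lie on one diagonal and are the natural candidates for extension into a longer alignment.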
  • 21. Word Method or K-tuple Method
      ◦ In the FASTA method, the user defines a value k to use as the word length for searching the database. The method is slower but more sensitive at lower values of k, which are also preferred for searches involving a very short query sequence.
      ◦ BLAST provides a number of algorithms optimized for particular types of queries, such as searching for distantly related sequence matches.
      ◦ It is a good alternative to FASTA; however, as a heuristic method it gives up some accuracy in exchange for speed.