COMPUTATIONAL METHODS
OF
SEQUENCE ALIGNMENT
CREATED AND DESIGNED BY
AMMAR KAREEM ALMAJDI
II YEAR III SEMESTER
Department of Biotechnology
OUTLINE Bioinformatics
• Sequence Alignment
• Types of sequence alignment
• Methods of sequence alignment
• Dot Matrix method
• Dynamic programming method
• Word method or k-tuple method
Definition of sequence alignment
• Sequence alignment is a way of arranging sequences of DNA, RNA or
protein to identify regions of similarity. The alignment may be made over
the entire length of the sequences. The similarity may indicate the
functional, structural and evolutionary significance of the sequences.
• The sequence alignment is made between a known sequence and an
unknown sequence, or between two unknown sequences.
• The known sequence is called the reference sequence; the unknown
sequence is called the query sequence.
Interpretation of sequence alignment
• Sequence alignment is useful for discovering structural, functional
and evolutionary information.
• Sequences that are very much alike may have similar secondary and
3D structure, similar function and, likely, a common ancestral
sequence. It is extremely unlikely that such sequences acquired their
similarity by chance. For two random DNA sequences of n nucleotides, the
probability of a match at every position is very low, $P = 4^{-n}$
(assuming the four bases are equally likely). For proteins the probability
is much lower still, $P = 20^{-n}$, where n is the number of amino acid
residues.
• Large-scale genome studies have revealed horizontal transfer of genes
and other sequences between species, which may cause similarity between
some sequences in very distant species.
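To get a feel for how quickly the chance-match probability shrinks, the two formulas can be evaluated directly. This is a minimal sketch; the function name `chance_identity_probability` is illustrative, and it assumes independent, equally likely residues, which real sequences only approximate.

```python
from fractions import Fraction


def chance_identity_probability(n: int, alphabet_size: int) -> Fraction:
    """Probability that two random sequences of length n agree at every
    position, assuming independent and equally likely symbols."""
    return Fraction(1, alphabet_size) ** n


# DNA has 4 possible bases, proteins have 20 possible amino acids.
p_dna = chance_identity_probability(10, 4)
p_protein = chance_identity_probability(10, 20)

print(float(p_dna))      # roughly 9.5e-07
print(float(p_protein))  # roughly 9.8e-14
```

Even for a stretch of only 10 residues, the protein value is about seven orders of magnitude smaller than the DNA value.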
Types of Sequence Alignment
• Sequence alignment is of two types, namely:
• Global alignment
• Local alignment
• Global alignment: matching the residues of two sequences
across their entire length.
• Global alignment is best suited to sequences that are similar and of
roughly equal length.
• Local alignment: matching the regions of two sequences that
are most similar to each other.
Types of Sequence Alignment
• Global alignment
• Input: treat the two sequences as potentially equivalent
• Goal: identify conserved regions and differences
• Applications:
- Comparing two genes with the same function (in human vs. mouse).
- Comparing two proteins with similar function.
Types of Sequence Alignment
• Local alignment
• Input: the two sequences may or may not be related
• Goal: see whether a substring in one sequence aligns well with a
substring in the other
• Note: for local matching, overhangs at the ends are not treated as gaps
• Applications:
- Searching for local similarities in large sequences
(e.g., newly sequenced genomes).
- Looking for conserved domains or motifs in two proteins
Types of Sequence Alignment
• Global alignment
L G P S S K Q T G K G S - S R I W D N
L N - I T K S A G K G A I M R L G D A
• Local alignment
- - - - - - - T G K G - - - - - - - -
- - - - - - - A G K G - - - - - - - -
Methods of sequence alignment
• Dot matrix method
• The dynamic programming (DP) algorithm
• Word or k-tuple methods
Dot matrix analysis
• A dot matrix is a grid in which matching residues of two sequences are
marked as dots.
• It is also called a dot plot.
• It is a pairwise sequence comparison made on the computer.
• Initially every cell of the grid is blank (colourless) on the screen.
• In a dot matrix, the nucleotides of one sequence are written from left to
right along the top row, and those of the other sequence from top to bottom
down the left side (column) of the matrix. At every point where the two
nucleotides are the same, the intersection of row and column is marked
with a dark dot. Runs of adjacent dots along a diagonal show up as lines,
which indicate regions of similarity. The resulting graph is the dot plot,
which is a kind of recurrence plot. Each dot in the plot represents a
matching nucleotide or amino acid.
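The construction described above can be sketched in a few lines. This is a minimal illustration, not a production dot-plot tool: the function names `dot_matrix` and `render` are made up for this example, and no window or threshold filtering (which real tools commonly apply to reduce noise) is included.

```python
def dot_matrix(top: str, side: str) -> list[list[bool]]:
    """Return grid[i][j] = True where side[i] equals top[j].

    `top` is written left to right along the top row, `side` is written
    top to bottom down the left column.
    """
    return [[s == t for t in top] for s in side]


def render(top: str, side: str) -> str:
    """Draw the grid as text: '*' for a dot, '.' for a blank cell."""
    grid = dot_matrix(top, side)
    lines = ["  " + " ".join(top)]
    for s, row in zip(side, grid):
        lines.append(s + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Comparing a sequence with itself gives an unbroken main diagonal.
print(render("GATTACA", "GATTACA"))
```

Because every cell is examined, the work grows with the product of the two sequence lengths, which is why the method becomes slow for large sequences.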
Dot matrix analysis
• The dot matrix method is a qualitative and simple way to analyse
sequences. However, it takes much time to analyse large sequences.
• The dot matrix method is useful for the following studies:
- Sequence similarity between two nucleotide sequences or two amino
acid sequences.
- Insertion of short stretches in a DNA or amino acid sequence.
- Deletion of short stretches from a DNA or amino acid sequence.
- Repeats or inverted repeats in a DNA or amino acid sequence.
Dot matrix analysis: two identical sequences
[Figure: nucleic acid dot plot]
Dot matrix analysis: two very different sequences
[Figure: nucleic acid dot plot of genes]
Dot matrix analysis: two similar sequences
[Figure: nucleic acid dot plot of genes]
Dynamic Programming Method
• It is a way of solving problems in which the best decisions must be
found one after another.
• It was introduced by Richard Bellman in the 1950s.
• The word "programming" here denotes finding an acceptable plan of
action, not computer programming.
• It is useful in aligning nucleotide sequences of DNA and amino acid
sequences of the proteins coded by that DNA.
• Dynamic programming is a three-step process that involves:
1) Breaking the problem into small subproblems.
2) Solving the subproblems recursively.
3) Constructing the optimal solution to the original problem from the
optimal solutions of the subproblems.
Dynamic programming algorithm for sequence alignment
• The method compares every pair of characters in the two sequences and
generates an alignment that is optimal under the chosen scoring system.
• This is a highly computationally demanding method. However, the latest
algorithmic improvements and ever-increasing computer capacity make it
possible to align a query sequence against a large database in a few minutes.
• Each alignment has its own score, and it is essential to recognise that
several different alignments may have identical or nearly identical scores,
so dynamic programming may produce more than one optimal alignment. Careful
choice of the scoring parameters is important and may help discriminate
between alignments with similar scores.
• Global alignment programs are based on the Needleman-Wunsch algorithm and
local alignment programs on the Smith-Waterman algorithm. Both algorithms
are derived from the basic dynamic programming algorithm.
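The relationship between the two algorithms can be illustrated with one small function. This is a sketch under simplifying assumptions that are not taken from the slides: a flat match/mismatch score instead of a substitution matrix, a linear gap penalty, and only the score (no traceback). The name `align_score` and its default parameter values are illustrative.

```python
def align_score(a: str, b: str, match: int = 1, mismatch: int = -1,
                gap: int = -2, local: bool = False) -> int:
    """Best alignment score found by dynamic programming.

    local=False fills the table in the Needleman-Wunsch (global) manner;
    local=True fills it in the Smith-Waterman (local) manner.
    """
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]

    if not local:
        # Global alignment: leading gaps are penalised.
        for i in range(1, rows):
            S[i][0] = i * gap
        for j in range(1, cols):
            S[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(S[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                       S[i - 1][j] + gap,       # gap in b
                       S[i][j - 1] + gap)       # gap in a
            if local:
                # Local alignment: a score never drops below zero, so an
                # alignment can start afresh at any cell.
                cell = max(cell, 0)
                best = max(best, cell)
            S[i][j] = cell

    # Global: score of aligning both full sequences (bottom-right cell).
    # Local: best score found anywhere in the table.
    return best if local else S[-1][-1]


a, b = "TTTGATTACATTT", "CCGATTACACC"
print(align_score(a, b, local=True))   # 7: the shared GATTACA region
print(align_score(a, b, local=False))  # lower, the unrelated ends are penalised
```

The only differences between the two modes are the initialisation of the first row and column, the floor at zero, and where the final score is read from.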
Description of the dynamic programming algorithm
• The alignment procedure depends on a scoring system, which can be based on
the probability that 1) a particular amino acid pair is found in alignments of
related proteins ($p_{xy}$); 2) the same amino acid pair is aligned by chance
($p_x p_y$); 3) introduction of a gap would be a better choice as it increases
the score.
• The ratio of the first two probabilities is usually provided, as a log-odds
score, in an amino acid substitution matrix. There are many such matrices;
two of them, PAM and BLOSUM, are considered later.
• The scores for introducing a gap and for extending it are set in relation to
the matrix values and represent prior knowledge and some assumptions. One of
them is quite simple: if the penalty for a gap is too high, a reasonable
alignment between slightly different sequences will never be achieved, but if
it is too low, gaps are inserted too freely and a meaningful optimal alignment
is hardly possible. Other assumptions are based on sophisticated statistical
procedures.
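The ratio of the two probabilities is normally turned into a log-odds score before it is stored in a matrix. A minimal sketch follows; the function name `log_odds` and the probabilities used are hypothetical values chosen for illustration, not figures from any published matrix, and the choice of base 2 is an assumption (published matrices also apply a scaling factor and rounding).

```python
import math


def log_odds(p_xy: float, p_x: float, p_y: float) -> float:
    """Log-odds score (base 2) of seeing residues x and y aligned.

    p_xy: probability of the pair in alignments of related proteins.
    p_x * p_y: probability of the pair being aligned by chance.
    """
    return math.log2(p_xy / (p_x * p_y))


# Hypothetical numbers, for illustration only.
print(log_odds(0.020, 0.10, 0.05))   # positive: pair seen more often than chance
print(log_odds(0.005, 0.10, 0.05))   # about zero: pair seen as often as chance
print(log_odds(0.001, 0.10, 0.05))   # negative: pair seen less often than chance
```

Working with logarithms means that the scores of successive aligned pairs can simply be added to give the score of the whole alignment.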
Derivation of the dynamic programming algorithm
1. Score of new alignment = Score of previous alignment (A) + Score of new aligned pair

   V D S - C Y       V D S - C       Y
   V E S L C Y       V E S L C       Y
       15        =       8       +   7

2. Score of alignment (A) = Score of previous alignment (B) + Score of new aligned pair

   V D S - C         V D S -         C
   V E S L C         V E S L         C
       8         =      -1       +   9

3. Repeat removing aligned pairs until the end of the alignments is reached.
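The additive property used in this derivation can be checked by accumulating the score column by column. The slide only states the totals 15, 8 and -1 and the pair scores 7 and 9; the remaining pair scores and the gap score in the sketch below are an assumption, chosen because they reproduce those totals (they agree with BLOSUM62 entries and a penalty of 11 for the single gap). The names `PAIR_SCORE`, `GAP_SCORE`, `column_score` and `running_scores` are illustrative.

```python
# Assumed pair scores (not given on the slide except C/C = 9 and Y/Y = 7).
PAIR_SCORE = {
    ("V", "V"): 4,
    ("D", "E"): 2,
    ("S", "S"): 4,
    ("C", "C"): 9,
    ("Y", "Y"): 7,
}
GAP_SCORE = -11  # assumed score for the single gap column


def column_score(x: str, y: str) -> int:
    """Score of one aligned column: a gap score or a pair score."""
    if x == "-" or y == "-":
        return GAP_SCORE
    return PAIR_SCORE[(x, y)]


def running_scores(top: str, bottom: str) -> list[int]:
    """Cumulative alignment score after each column."""
    total, out = 0, []
    for x, y in zip(top, bottom):
        total += column_score(x, y)
        out.append(total)
    return out


print(running_scores("VDS-CY", "VESLCY"))  # [4, 6, 10, -1, 8, 15]
```

Reading the list from right to left reproduces the steps above: 15 = 8 + 7, then 8 = -1 + 9.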
Scoring matrices: PAM (Percent Accepted Mutation)
Amino acids are grouped according to the chemistry of the side group: (C) sulfhydryl; (STPAG)
small hydrophilic; (NDEQ) acid, acid amide and hydrophilic; (HRK) basic; (MILV) small hydrophobic;
and (FYW) aromatic. Log-odds values: +10 means that the pair is far more likely to arise from a
common ancestor than by chance, 0 means that the two probabilities are equal, and -4 means that the
pairing is more likely to be random. Because log-odds scores add, the score of the alignment YY/YY is
10 + 10 = 20, whereas that of YY/TP is -3 - 5 = -8, a rare and unexpected alignment between
homologous sequences.
Scoring matrices: BLOSUM62
(BLOcks amino acid SUbstitution Matrices)
The idea behind BLOSUM is similar, but it is calculated from a very different and much larger set
of proteins, which are much more similar to each other and form blocks of proteins with a similar
pattern.
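Formal description of dynamic programming algorithm
• The moves that are possible to reach a certain position (i, j) start either from the previous
row and column at position (i-1, j-1) or from any position in the same row or column.
• A diagonal move carries no gap penalty; a move from any other position in column j or row i
carries a gap penalty that depends on the size of the gap.

$$
S_{i,j} = \max
\begin{cases}
S_{i-1,\,j-1} + s(a_i, b_j) \\
\max_{x \ge 1} \left( S_{i-x,\,j} - w_x \right) \\
\max_{y \ge 1} \left( S_{i,\,j-y} - w_y \right)
\end{cases}
$$

Here $S_{i,j}$ is the best score at position (i, j), $s(a_i, b_j)$ is the score for aligning
residue $a_i$ with residue $b_j$, and $w_x$, $w_y$ are the penalties for gaps of length x and y.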
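The recurrence with a gap penalty that depends on gap length can be sketched directly. This is an illustrative global-alignment version; the function name `global_score_general_gaps`, the toy scoring function and the affine penalty used in the example are assumptions made for this sketch, not taken from the slides. Looking back over every possible gap length makes it slower than the linear-gap form.

```python
from typing import Callable


def global_score_general_gaps(a: str, b: str,
                              s: Callable[[str, str], int],
                              w: Callable[[int], int]) -> int:
    """Global alignment score with an arbitrary gap penalty w(length).

    S[i][j] is the maximum of:
      S[i-1][j-1] + s(a_i, b_j)         diagonal move, no gap
      S[i-x][j] - w(x) for x = 1..i     gap of length x, moving down column j
      S[i][j-y] - w(y) for y = 1..j     gap of length y, moving along row i
    """
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = -w(i)
    for j in range(1, m + 1):
        S[0][j] = -w(j)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diagonal = S[i - 1][j - 1] + s(a[i - 1], b[j - 1])
            from_column = max(S[i - x][j] - w(x) for x in range(1, i + 1))
            from_row = max(S[i][j - y] - w(y) for y in range(1, j + 1))
            S[i][j] = max(diagonal, from_column, from_row)
    return S[n][m]


def simple_score(x: str, y: str) -> int:
    """Toy substitution score: +1 for a match, -1 for a mismatch."""
    return 1 if x == y else -1


def affine_gap(length: int) -> int:
    """Toy penalty: 2 to open a gap, 1 for each further position."""
    return 2 + (length - 1)


print(global_score_general_gaps("AAAGGG", "AAA", simple_score, affine_gap))  # -1
```

In the example, three matches (+3) and one gap of length three (penalty 4) give -1, which is better than splitting the same gap into several shorter ones.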
Word Method or K-tuple method
• It is a heuristic method that is not guaranteed to find an optimal alignment
solution, but is much more efficient than dynamic programming.
• This method is useful in large-scale database searches to find whether there
is a significant match with the query sequence.
• The word method is used in the database search tools FASTA and the BLAST
family.
• They identify a series of short, non-overlapping subsequences (words) of
the query sequence.
• These words are then matched against candidate database sequences to obtain
the result.
Word Method or K-tuple method
• In the FASTA method, the user defines a value k to use as the word length
with which to search the database. The method is slower but more sensitive at
lower values of k, which are also preferred for searches involving a very
short query sequence.
• BLAST provides a number of algorithms optimized for particular types
of queries, such as searching for distantly related sequence matches.
• It is a good alternative to FASTA. However, its results are not guaranteed
to be optimal.
• Like FASTA, BLAST uses a word search of length k, but evaluates only
the most significant word matches rather than every word match.
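The core idea of the word method, looking up short words of the query in a database sequence, can be sketched as follows. This is a toy illustration and not the FASTA or BLAST implementation: the function names are made up for this example, the sketch indexes every overlapping word of the query for simplicity, and it omits scoring, extension of hits and statistics.

```python
from collections import defaultdict


def index_words(seq: str, k: int) -> dict[str, list[int]]:
    """Map each word of length k in seq to the positions where it starts."""
    index: dict[str, list[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def word_hits(query: str, subject: str, k: int) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) for every shared word of length k."""
    index = index_words(query, k)
    hits = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            hits.append((i, j))
    return hits


def best_diagonal(hits: list[tuple[int, int]]):
    """Return (offset, count) for the diagonal with the most word hits,
    or None if there are no hits. Hits on one diagonal share the same
    offset subject_pos - query_pos and suggest an ungapped match there."""
    counts: dict[int, int] = defaultdict(int)
    for i, j in hits:
        counts[j - i] += 1
    return max(counts.items(), key=lambda kv: kv[1]) if counts else None


hits = word_hits("GATTACA", "TTGATTACATT", k=3)
print(best_diagonal(hits))  # (2, 5): five word hits at offset 2
```

A smaller k produces more hits to examine, which is the source of the speed versus sensitivity trade-off described for FASTA.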
T H A N K Y O U

More Related Content

What's hot

Blast and fasta
Blast and fastaBlast and fasta
Blast and fasta
ALLIENU
 
Fasta
FastaFasta
Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-
naveed ul mushtaq
 
PIR- Protein Information Resource
PIR- Protein Information ResourcePIR- Protein Information Resource
Secondary protein structure prediction
Secondary protein structure predictionSecondary protein structure prediction
Secondary protein structure prediction
Siva Dharshini R
 
Major databases in bioinformatics
Major databases in bioinformaticsMajor databases in bioinformatics
Major databases in bioinformatics
Vidya Kalaivani Rajkumar
 
Introduction to NCBI
Introduction to NCBIIntroduction to NCBI
Introduction to NCBI
geetikaJethra
 
European molecular biology laboratory (EMBL)
European molecular biology laboratory (EMBL)European molecular biology laboratory (EMBL)
European molecular biology laboratory (EMBL)
Hafiz Muhammad Zeeshan Raza
 
Gen bank databases
Gen bank databasesGen bank databases
Gen bank databases
Hafiz Muhammad Zeeshan Raza
 
Scoring matrices
Scoring matricesScoring matrices
Scoring matrices
Ashwini
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
Ramya S
 
Rasmol
RasmolRasmol
BLAST
BLASTBLAST
Introduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbjIntroduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbj
KAUSHAL SAHU
 
SEQUENCE ANALYSIS
SEQUENCE ANALYSISSEQUENCE ANALYSIS
SEQUENCE ANALYSIS
prashant tripathi
 
Structural databases
Structural databases Structural databases
Structural databases
Priyadharshana
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
PrashantSharma807
 
Prosite
PrositeProsite
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
karamveer prajapat
 
Primary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyanaPrimary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyana
Puneet Kulyana
 

What's hot (20)

Blast and fasta
Blast and fastaBlast and fasta
Blast and fasta
 
Fasta
FastaFasta
Fasta
 
Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-Sequence alig Sequence Alignment Pairwise alignment:-
Sequence alig Sequence Alignment Pairwise alignment:-
 
PIR- Protein Information Resource
PIR- Protein Information ResourcePIR- Protein Information Resource
PIR- Protein Information Resource
 
Secondary protein structure prediction
Secondary protein structure predictionSecondary protein structure prediction
Secondary protein structure prediction
 
Major databases in bioinformatics
Major databases in bioinformaticsMajor databases in bioinformatics
Major databases in bioinformatics
 
Introduction to NCBI
Introduction to NCBIIntroduction to NCBI
Introduction to NCBI
 
European molecular biology laboratory (EMBL)
European molecular biology laboratory (EMBL)European molecular biology laboratory (EMBL)
European molecular biology laboratory (EMBL)
 
Gen bank databases
Gen bank databasesGen bank databases
Gen bank databases
 
Scoring matrices
Scoring matricesScoring matrices
Scoring matrices
 
Multiple sequence alignment
Multiple sequence alignmentMultiple sequence alignment
Multiple sequence alignment
 
Rasmol
RasmolRasmol
Rasmol
 
BLAST
BLASTBLAST
BLAST
 
Introduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbjIntroduction to ncbi, embl, ddbj
Introduction to ncbi, embl, ddbj
 
SEQUENCE ANALYSIS
SEQUENCE ANALYSISSEQUENCE ANALYSIS
SEQUENCE ANALYSIS
 
Structural databases
Structural databases Structural databases
Structural databases
 
Introduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASEIntroduction OF BIOLOGICAL DATABASE
Introduction OF BIOLOGICAL DATABASE
 
Prosite
PrositeProsite
Prosite
 
methods for protein structure prediction
methods for protein structure predictionmethods for protein structure prediction
methods for protein structure prediction
 
Primary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyanaPrimary and secondary databases ppt by puneet kulyana
Primary and secondary databases ppt by puneet kulyana
 

Similar to sequence alignment

Seq alignment
Seq alignment Seq alignment
Seq alignment
Nagendrasahu6
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence Alignment
Meghaj Mallick
 
AI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfAI 바이오 (4일차).pdf
AI 바이오 (4일차).pdf
H K Yoon
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence-analysis-pairwise-alignment.pdf
Sequence-analysis-pairwise-alignment.pdfSequence-analysis-pairwise-alignment.pdf
Sequence-analysis-pairwise-alignment.pdf
sriaisvariyasundar
 
Sequence alignment global vs. local
Sequence alignment  global vs. localSequence alignment  global vs. local
Sequence alignment global vs. local
benazeer fathima
 
lecture4.ppt Sequence Alignmentaldf sdfsadf
lecture4.ppt Sequence Alignmentaldf sdfsadflecture4.ppt Sequence Alignmentaldf sdfsadf
lecture4.ppt Sequence Alignmentaldf sdfsadf
alizain9604
 
Global and Local Sequence Alignment
Global and Local Sequence AlignmentGlobal and Local Sequence Alignment
Global and Local Sequence Alignment
AjayPatil210
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence Alignment
Ravi Gandham
 
B.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastB.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blast
Rai University
 
B.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastB.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastRai University
 
Sequence Analysis
Sequence AnalysisSequence Analysis
Sequence Analysis
Meghaj Mallick
 
Laboratory 1 sequence_alignments
Laboratory 1 sequence_alignmentsLaboratory 1 sequence_alignments
Laboratory 1 sequence_alignments
seham15
 
Introduction to sequence alignment
Introduction to sequence alignmentIntroduction to sequence alignment
Introduction to sequence alignment
Kubuldinho
 
Bioinformatics_Sequence Analysis
Bioinformatics_Sequence AnalysisBioinformatics_Sequence Analysis
Bioinformatics_Sequence Analysis
Sangeeta Das
 
Ijetr042111
Ijetr042111Ijetr042111
Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)
AnkitTiwari354
 
Parwati sihag
Parwati sihagParwati sihag
Parwati sihag
parwati sihag
 
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformaticsAbhishek Vatsa
 
Bioinformaatics for M.Sc. Biotecchnology.pptx
Bioinformaatics for M.Sc. Biotecchnology.pptxBioinformaatics for M.Sc. Biotecchnology.pptx
Bioinformaatics for M.Sc. Biotecchnology.pptx
Ranjan Jyoti Sarma
 

Similar to sequence alignment (20)

Seq alignment
Seq alignment Seq alignment
Seq alignment
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence Alignment
 
AI 바이오 (4일차).pdf
AI 바이오 (4일차).pdfAI 바이오 (4일차).pdf
AI 바이오 (4일차).pdf
 
Sequence alignment
Sequence alignmentSequence alignment
Sequence alignment
 
Sequence-analysis-pairwise-alignment.pdf
Sequence-analysis-pairwise-alignment.pdfSequence-analysis-pairwise-alignment.pdf
Sequence-analysis-pairwise-alignment.pdf
 
Sequence alignment global vs. local
Sequence alignment  global vs. localSequence alignment  global vs. local
Sequence alignment global vs. local
 
lecture4.ppt Sequence Alignmentaldf sdfsadf
lecture4.ppt Sequence Alignmentaldf sdfsadflecture4.ppt Sequence Alignmentaldf sdfsadf
lecture4.ppt Sequence Alignmentaldf sdfsadf
 
Global and Local Sequence Alignment
Global and Local Sequence AlignmentGlobal and Local Sequence Alignment
Global and Local Sequence Alignment
 
Sequence Alignment
Sequence AlignmentSequence Alignment
Sequence Alignment
 
B.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastB.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blast
 
B.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blastB.sc biochem i bobi u 3.2 algorithm + blast
B.sc biochem i bobi u 3.2 algorithm + blast
 
Sequence Analysis
Sequence AnalysisSequence Analysis
Sequence Analysis
 
Laboratory 1 sequence_alignments
Laboratory 1 sequence_alignmentsLaboratory 1 sequence_alignments
Laboratory 1 sequence_alignments
 
Introduction to sequence alignment
Introduction to sequence alignmentIntroduction to sequence alignment
Introduction to sequence alignment
 
Bioinformatics_Sequence Analysis
Bioinformatics_Sequence AnalysisBioinformatics_Sequence Analysis
Bioinformatics_Sequence Analysis
 
Ijetr042111
Ijetr042111Ijetr042111
Ijetr042111
 
Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)Sequence homology search and multiple sequence alignment(1)
Sequence homology search and multiple sequence alignment(1)
 
Parwati sihag
Parwati sihagParwati sihag
Parwati sihag
 
Basics of bioinformatics
Basics of bioinformaticsBasics of bioinformatics
Basics of bioinformatics
 
Bioinformaatics for M.Sc. Biotecchnology.pptx
Bioinformaatics for M.Sc. Biotecchnology.pptxBioinformaatics for M.Sc. Biotecchnology.pptx
Bioinformaatics for M.Sc. Biotecchnology.pptx
 

Recently uploaded

Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
Pavel ( NSTU)
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
Anna Sz.
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
beazzy04
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
Balvir Singh
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
AzmatAli747758
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
Jisc
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
bennyroshan06
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
Celine George
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
EduSkills OECD
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
Jheel Barad
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 

Recently uploaded (20)

Synthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptxSynthetic Fiber Construction in lab .pptx
Synthetic Fiber Construction in lab .pptx
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
Operation Blue Star - Saka Neela Tara
Operation Blue Star   -  Saka Neela TaraOperation Blue Star   -  Saka Neela Tara
Operation Blue Star - Saka Neela Tara
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
 
How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...How libraries can support authors with open access requirements for UKRI fund...
How libraries can support authors with open access requirements for UKRI fund...
 
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptxMARUTI SUZUKI- A Successful Joint Venture in India.pptx
MARUTI SUZUKI- A Successful Joint Venture in India.pptx
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
How to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERPHow to Create Map Views in the Odoo 17 ERP
How to Create Map Views in the Odoo 17 ERP
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptxStudents, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
Students, digital devices and success - Andreas Schleicher - 27 May 2024..pptx
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......Ethnobotany and Ethnopharmacology ......
Ethnobotany and Ethnopharmacology ......
 

sequence alignment

  • 1. COMPUTATIONAL METHODS OF SEQUENCE ALIGNMENT CREATED AND DESIGNED BY AMMAR KAREEM ALMAJDI II YEAR III SEMESTER Department of Biotechnology
  • 2. OUTLINE Bioinformatics • Sequence Alignment • Types of a sequence alignment • Methods of sequence alignment • Dot Matrix method • Dynamic programming method • Word method or k-tuple method
  • 3.  Sequence alignment is a way of arranging sequences of DNA,RNA or protein to identifyidentify regions of similarity is made to align the entire sequence. the similarity may indicate the funcutional,structural and evolutionary significance of the sequence.  The sequence alignment is made between a known sequence and unknown sequence or between two unknown sequences.  The known sequence is called reference sequence.the unknown sequence is called query sequenc. Definition of sequence alignment
  • 4. Interpretation of sequence alignment • Sequence alignment is useful for discovering structural, functional and evolutionary information. • Sequences that are very much alike may have similar secondary and 3D structure, similar function and likely a common ancestral sequence. It is extremely unlikely that such sequences obtained similarity by chance. For DNA molecules with n nucleotides such probability is very low P = 4-n. For proteins the probability even much lower P = 20 –n, where n is a number of amino acid residues • Large scale genome studies revealed existence of horizontal transfer of genes and other sequences between species, which may cause similarity between some sequences in very distant species.
  • 5. Types of Sequence Alignment  Sequence Alignment is of two types , namely :  Global Alignment  Local Alignment  Global Alignment : is a matching the residues of two sequences across their entire length.  global alignment matches the identical sequences .  Local Alignment : is a matching two sequence from regions which have more similarity with each other.
  • 6. Types of Sequence Alignment  Global alignment  Input: treat the two sequences as potentially equivalent  Goal: identify conserved regions and differences  Applications: - Comparing two genes with same function (in human vs. mouse). - Comparing two proteins with similar function.
  • 7. Types of Sequence Alignment  Local alignment  Input: The two sequences may or may not be related  Goal: see whether a substring in one sequence aligns well with a substring in the other  Note: for local matching, overhangs at the ends are not treated as gaps  Applications: - Searching for local similarities in large sequences (e.g., newly sequenced genomes). - Looking for conserved domains or motifs in two proteins
  • 8. Types of Sequence Alignmentu • L G P S S K Q T G K G S - S R I W D N • Globalalignment • L N - I T K S A G K G A I M R L G D A • - - - - - - - T G K G - - - - - - - - • Localalignment • - - - - - - - A G K G - - - - - - - -
  • 9. • Dot matrix method • The dynamic programming (DP) algorithm • Word or k-tuple methods Method of sequence alignment
  • 10. • A dot matrix is a grid system where the similar nucleotides of two DNA sequences are represented as dots. • It also called dot plots. • It is a pairwise sequence alignment made in the computer. • The dots appear as colourless dots in the computer screen. • In dot matrix , nucleotides of one sequence are written from the left to right on the top row and those of the other sequence are written from the top to bottom on the left side (column) of the matrix.At every point, where the two nucleotides are the same , a dot in the intersection of row and column becomes a dark dot. when all these darken dots are connected, it gives a graph called dot plot. the line found in the dot plot is called recurrence plot. Each dot in the plot represents a matching nucleotide or amino acid. Dot matrix analysis
  • 11. Dot matrix analysis • The dot matrix method is qualitative and simple to interpret; however, it takes a long time to analyze large sequences. • The dot matrix method is useful for detecting: • Sequence similarity between two nucleotide sequences or two amino acid sequences. • Insertions of short stretches in a DNA or amino acid sequence. • Deletions of short stretches from a DNA or amino acid sequence. • Repeats or inverted repeats in a DNA or amino acid sequence.
  • 12. Dot matrix analysis: two identical sequences • Nucleic acid dot plot
  • 13. Dot matrix analysis: two very different sequences • Nucleic acid dot plot of genes
  • 14. Dot matrix analysis: two similar sequences • Nucleic acid dot plot of genes
  • 15. Dynamic Programming Method • It is a way of solving problems in which the best decisions have to be found one after another. • It was introduced by Richard Bellman in the 1950s. • The word "programming" here denotes finding an acceptable plan of action, not computer programming. • It is useful for aligning the nucleotide sequences of DNA and the amino acid sequences of the proteins encoded by that DNA. • Dynamic programming is a three-step process: 1) Breaking the problem into smaller subproblems. 2) Solving the subproblems recursively. 3) Constructing the optimal solution to the original problem from the optimal solutions of the subproblems.
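The three steps can be illustrated on a small stand-in problem: the length of the longest common subsequence of two strings. This problem is not taken from the slides; it is chosen here only because it is a simplified relative of alignment scoring. The function name `lcs_length` is an assumption.

```python
from functools import lru_cache


def lcs_length(a, b):
    """Length of the longest common subsequence of strings a and b."""

    @lru_cache(maxsize=None)
    def solve(i, j):
        # Step 1: the subproblem is the answer for the prefixes a[:i] and b[:j].
        if i == 0 or j == 0:
            return 0
        # Step 2: each subproblem is solved recursively from smaller ones;
        # the cache ensures every (i, j) is computed only once.
        if a[i - 1] == b[j - 1]:
            return solve(i - 1, j - 1) + 1
        return max(solve(i - 1, j), solve(i, j - 1))

    # Step 3: the answer to the original problem is the largest subproblem.
    return solve(len(a), len(b))


print(lcs_length("AGGTAB", "GXTXAYB"))  # 4 (the subsequence GTAB)
```

The recursion is fine for short strings; alignment programs normally fill a table iteratively instead, which avoids deep recursion on long sequences.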
  • 16. Dynamic programming algorithm for sequence alignment • The method compares every pair of characters in the two sequences and generates an alignment that is optimal under the chosen scoring system. • This is a computationally demanding method. However, algorithmic improvements and ever-increasing computer capacity make it possible to align a query sequence against a large database in a few minutes. • Each alignment has its own score, and several different alignments may have identical or nearly identical scores, so dynamic programming may produce more than one optimal alignment. Careful choice of the scoring parameters is therefore important and may help discriminate between alignments with similar scores. • Global alignment programs are based on the Needleman-Wunsch algorithm and local alignment programs on the Smith-Waterman algorithm. Both are derived from the basic dynamic programming algorithm.
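A minimal sketch of Needleman-Wunsch global alignment follows. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions for illustration; real programs use a substitution matrix and usually separate gap-opening and gap-extension penalties. When several alignments tie for the best score, this sketch returns only one of them.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Scoring values are illustrative assumptions, not taken from the slides.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # S[i][j] = best score for aligning the prefixes a[:i] and b[:j].
    S = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        S[i][0] = i * gap  # prefix of a aligned against gaps only
    for j in range(1, cols):
        S[0][j] = j * gap  # prefix of b aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + s, S[i - 1][j] + gap, S[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal path.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        s = (match if a[i - 1] == b[j - 1] else mismatch) if i > 0 and j > 0 else None
        if s is not None and S[i][j] == S[i - 1][j - 1] + s:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and S[i][j] == S[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return S[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("ACGT", "AGT"))  # (1, 'ACGT', 'A-GT')
```

The table has (len(a)+1) x (len(b)+1) cells and each cell looks at three neighbours, which is why the slide calls the method computationally demanding for long sequences.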
  • 17. Description of the dynamic programming algorithm • The alignment procedure depends on a scoring system, which can be based on the probability that 1) a particular amino acid pair is found in alignments of related proteins (p_xy); 2) the same amino acid pair is aligned by chance (p_x p_y); 3) introducing a gap would be a better choice because it increases the score. • The ratio of the first two probabilities, usually as its logarithm (a log-odds score), is provided in an amino acid substitution matrix. There are many such matrices; two of them, PAM and BLOSUM, are considered later. • The scores for introducing a gap and for extending it are chosen to work with the matrix and represent prior knowledge and some assumptions. One of them is quite simple: if the gap penalty is too high, a reasonable alignment between slightly different sequences will never be achieved, but if it is too low, gaps are placed so freely that a meaningful optimal alignment is hardly possible. Other assumptions are based on sophisticated statistical procedures.
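The ratio of the two probabilities on this slide is normally stored as a scaled logarithm. The sketch below shows that calculation; the function name `log_odds`, the scale factor, and the probabilities in the example are hypothetical values picked for illustration, not data from any published matrix.

```python
import math


def log_odds(p_xy, p_x, p_y, scale=2.0):
    """Scaled log-odds score: scale * log2(p_xy / (p_x * p_y)).

    p_xy : probability of seeing the pair x/y in alignments of related proteins
    p_x, p_y : background probabilities of x and y (their product is the
               probability of the pair arising by chance)
    scale : assumed scaling factor; published matrices pick their own scale
            and round the result to an integer
    """
    return scale * math.log2(p_xy / (p_x * p_y))


# Hypothetical probabilities, for illustration only.
print(log_odds(0.25, 0.25, 0.25))      # pair seen 4x more often than chance: positive
print(log_odds(0.0625, 0.25, 0.25))    # pair seen exactly as often as chance: zero
print(log_odds(0.015625, 0.25, 0.25))  # pair seen 4x less often than chance: negative
```

Working with logarithms means the scores of successive aligned pairs can simply be added, which corresponds to multiplying the underlying odds ratios.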
  • 18. Derivation of the dynamic programming algorithm
    1. Score of new alignment = Score of previous alignment (A) + Score of new aligned pair
         V D S - C Y      V D S - C      Y
         V E S L C Y      V E S L C      Y
              15      =       8       +  7
    2. Score of alignment (A) = Score of previous alignment (B) + Score of new aligned pair
         V D S - C        V D S -        C
         V E S L C        V E S L        C
              8       =      -1       +  9
    3. Repeat removing aligned pairs until the end of the alignments is reached
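The additive rule on this slide can be checked by accumulating pair scores from left to right. In the sketch below, the pair scores are the BLOSUM62 entries for the pairs that occur in the example, and the gap penalty of -11 is an assumption inferred from the slide's totals (it is the value that makes the partial scores come out as -1, 8 and 15). The names `PAIR_SCORES`, `GAP`, `pair_score` and `running_scores` are hypothetical.

```python
from itertools import accumulate

# BLOSUM62 entries for the pairs used in the slide's example.
PAIR_SCORES = {("V", "V"): 4, ("D", "E"): 2, ("S", "S"): 4, ("C", "C"): 9, ("Y", "Y"): 7}
GAP = -11  # assumption: inferred from the totals shown on the slide


def pair_score(x, y):
    """Score one aligned column; a gap in either row costs GAP."""
    if x == "-" or y == "-":
        return GAP
    if (x, y) in PAIR_SCORES:
        return PAIR_SCORES[(x, y)]
    return PAIR_SCORES[(y, x)]  # the matrix is symmetric


def running_scores(top, bottom):
    """Score of every prefix of the alignment: previous score + new pair."""
    return list(accumulate(pair_score(x, y) for x, y in zip(top, bottom)))


print(running_scores("VDS-CY", "VESLCY"))  # [4, 6, 10, -1, 8, 15]
```

The last three values are the slide's alignment (B), alignment (A) and the full alignment: -1 + 9 = 8 and 8 + 7 = 15.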
  • 19. Scoring matrices: PAM (Percent Accepted Mutation) • Amino acids are grouped according to the chemistry of the side group: (C) sulfhydryl; (STPAG) small hydrophilic; (NDEQ) acid, acid amide and hydrophilic; (HRK) basic; (MILV) small hydrophobic; and (FYW) aromatic. • Log-odds values: +10 means that the pair is more likely to derive from a common ancestor, 0 means that the two probabilities are equal, and -4 means that the pair is more likely to be a chance change. • Thus the score of the alignment YY/YY is 10+10=20, whereas that of YY/TP is -3-5=-8, a rare and unexpected alignment between homologous sequences.
  • 20. Scoring matrices: BLOSUM62 (BLOcks amino acid SUbstitution Matrices) • The idea behind BLOSUM is similar, but it is calculated from a very different and much larger set of proteins, which are much more similar to each other and form blocks of proteins with a similar pattern.
  • 21. Formal description of dynamic programming algorithm • The diagram indicates the moves that are possible to reach a position (i, j): from the previous row and column at position (i-1, j-1), or from any position in the same row or column. • A diagonal move carries no gap penalty; a move from any other position in column j or row i carries a gap penalty that depends on the size of the gap. • With s(a_i, b_j) the score for aligning residues a_i and b_j, and w_x, w_y the penalties for gaps of length x and y, the score at (i, j) is the best of the three kinds of move:
    $$S_{i,j} = \max\left\{ S_{i-1,j-1} + s(a_i, b_j),\ \max_{x \ge 1}\left(S_{i-x,j} - w_x\right),\ \max_{y \ge 1}\left(S_{i,j-y} - w_y\right) \right\}$$
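The moves described on this slide can be written directly as a table-filling loop in which the gap penalty is an arbitrary function of the gap length. This is a sketch under stated assumptions: the names `global_score`, `simple_score` and `affine_gap` are hypothetical, and the scoring values (match +1, mismatch -1, gap cost 3 for the first position plus 1 for each further position) are chosen only for illustration.

```python
def simple_score(x, y):
    """Assumed substitution score s(a_i, b_j): +1 for a match, -1 otherwise."""
    return 1 if x == y else -1


def affine_gap(length):
    """Assumed gap penalty w: 3 for the first gap position, 1 for each further one."""
    return 3 + (length - 1)


def global_score(a, b, s=simple_score, w=affine_gap):
    """Best global alignment score with a length-dependent gap penalty."""
    n, m = len(a), len(b)
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        S[i][0] = -w(i)  # a[:i] aligned against one gap of length i
    for j in range(1, m + 1):
        S[0][j] = -w(j)  # b[:j] aligned against one gap of length j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Diagonal move: align a_i with b_j, no gap penalty.
            diag = S[i - 1][j - 1] + s(a[i - 1], b[j - 1])
            # Move within column j: a gap of length x ending at row i.
            up = max(S[i - x][j] - w(x) for x in range(1, i + 1))
            # Move within row i: a gap of length y ending at column j.
            left = max(S[i][j - y] - w(y) for y in range(1, j + 1))
            S[i][j] = max(diag, up, left)
    return S[n][m]


print(global_score("ACGGT", "ACT"))  # -1: three matches (+3) and one gap of length 2 (-4)
```

Scanning every possible gap length makes each cell cost time proportional to the sequence length, so this general form is slower than versions restricted to linear or affine gap penalties.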
  • 22. Word Method or K-tuple method • It is a heuristic method: it is not guaranteed to find an optimal alignment, but it is much more efficient than dynamic programming. • This method is useful in large-scale database searches to find whether there is a significant match to the query sequence. • The word method is used in the database search tools FASTA and the BLAST family. • They identify a series of short, non-overlapping subsequences (words) in the query sequence. • These words are then matched against candidate database sequences to produce the result.
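The word-matching step can be sketched as an index lookup. This is a simplified illustration, not the FASTA or BLAST implementation: the function names `word_hits` and `best_diagonal` are hypothetical, only exact word matches are counted, and the sketch slides the window one position at a time (overlapping words), which is an assumption of the sketch.

```python
from collections import defaultdict


def word_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) for every exact match of a k-letter word."""
    # Index every k-letter word of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    # Look up every k-letter word of the subject in that index.
    hits = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], ()):
            hits.append((i, j))
    return hits


def best_diagonal(hits):
    """Return (offset, hit_count) for the diagonal carrying the most word hits."""
    counts = defaultdict(int)
    for i, j in hits:
        counts[j - i] += 1  # hits on one diagonal share the same offset
    return max(counts.items(), key=lambda kv: kv[1]) if counts else None


hits = word_hits("ACGTAC", "TTACGT", k=3)
print(hits)                 # [(3, 1), (0, 2), (1, 3)]
print(best_diagonal(hits))  # (2, 2)
```

Running the same pair with `k=4` leaves a single hit, `[(0, 2)]`: longer words give fewer candidate matches to examine, which makes the search faster but less sensitive.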
  • 23. Word Method or K-tuple method • In the FASTA method, the user defines a value k to use as the word length with which to search the database. The method is slower but more sensitive at lower values of k, which are also preferred for searches involving a very short query sequence. • The BLAST family provides a number of algorithms optimized for particular types of queries, such as searching for distantly related sequence matches. • It was developed as a faster alternative to FASTA without sacrificing much accuracy. • Like FASTA, BLAST uses a word search of length k, but evaluates only the most significant word matches rather than every word match.
  • 24. Thank you