Sequence Alignment in Bioinformatics:
Introduction:
In bioinformatics, a sequence alignment is a way of arranging DNA, RNA, or protein sequences
to identify regions of similarity that may indicate functional, structural, or evolutionary
relationships between them. It is an important first step toward the structural and
functional analysis of newly determined sequences. An alignment is made either between
a known sequence and an unknown sequence or between two unknown sequences. The known
sequence is called the reference sequence and the unknown sequence is called the query sequence.
As new biological sequences are being generated at an exponential rate, sequence comparison is
becoming increasingly important for drawing functional and evolutionary inferences.
Types of Sequence Alignment:
Sequence Alignment is of two types, namely:
1. Global Alignment, and 2. Local Alignment
1. Global Alignment:
Global alignment matches the residues of two sequences across their entire length. It is
best suited to sequences that are closely related and of similar length. Global alignment
programs are based on the Needleman-Wunsch algorithm.
In global alignment, two sequences to be aligned are assumed to be generally similar over their
entire length. Alignment is carried out from beginning to end of both sequences to find the best
possible alignment across the entire length between the two sequences.
Applications of global sequence alignment are:
• Comparing two genes with the same function (e.g., in human vs. mouse).
• Comparing two proteins with similar function.
BOTMT:604
Bioinformatics and Biophysics
Prepared By-
Dr. Sangeeta Das.
Assistant Professor, Department of Botany, Bahona College, Jorhat, Assam, India.
2. Local Alignment:
Local alignment matches the regions of two sequences that are most similar to each
other. Local alignment programs are based on the Smith-Waterman algorithm.
Unlike global alignment, local alignment does not assume that the two sequences in question
have similarity over the entire length. It only finds local regions with the highest level of
similarity between the two sequences and aligns these regions without regard for the alignment
of the rest of the sequence regions.
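As a rough illustration of this idea, the following minimal Python sketch implements a basic Smith-Waterman local alignment. The function name smith_waterman and the scoring values (match = 2, mismatch = -1, gap = -1) are assumptions chosen for illustration and do not come from these notes. Scores are never allowed to drop below 0, which is what lets the alignment start and stop anywhere inside the two sequences.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best score, aligned part of a, aligned part of b)."""
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1]
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,                    # start a fresh local alignment
                          H[i - 1][j - 1] + s,  # align the two residues
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a 0 is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


print(smith_waterman("TTTTGATTACATTTT", "CCCGATTACACCC"))
```

In this made-up example only the shared stretch GATTACA is aligned; the unrelated flanking residues are ignored, which a global alignment would not do.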
Applications of local sequence alignment are:
• Searching for local similarities in large sequences (e.g., newly sequenced genomes).
• Looking for conserved domains or motifs in two proteins.
Methods of Sequence Alignment:
There are two methods of sequence alignment:
A. Pairwise Sequence Alignment method, and B. Multiple Sequence Alignment Method.
A. Pairwise Sequence Alignment method:
Pairwise sequence alignment methods are used to find the best-matching piecewise (local or
global) alignments of two query sequences.
Pairwise alignments can only be used between two sequences at a time, but they are efficient
to calculate.
The three primary methods of producing pairwise alignments are:
1. Dot matrix method (old method)
2. The dynamic programming (DP) algorithm (advanced method)
3. Word or k-tuple methods
1. Dot Matrix Method:
A dot matrix is a grid in which the matching nucleotides of two DNA sequences are
represented as dots. It is also known as a dot plot.
In a dot matrix, the nucleotides of one sequence are written from left to right along the top row and
those of the other sequence are written from top to bottom down the left side (column) of the
matrix. All cells start out blank. At every point where the two nucleotides are the same, a dark dot
is placed at the intersection of the row and column. Runs of dots along a diagonal appear as lines
that mark regions of similarity between the two sequences. A dot plot is a type of recurrence plot.
Each dot in the plot represents a matching nucleotide or amino acid.
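The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names dot_matrix and render and the two short sequences are made up for the example, and the plot is drawn as text, with "*" for a dot and "." for a blank cell.

```python
def dot_matrix(seq_top, seq_side):
    """Return a grid holding 1 where the residues match and 0 elsewhere.

    Columns follow seq_top (left to right); rows follow seq_side (top to bottom).
    """
    return [[1 if top == side else 0 for top in seq_top] for side in seq_side]


def render(seq_top, seq_side):
    """Draw the dot matrix as text, with the two sequences as labels."""
    grid = dot_matrix(seq_top, seq_side)
    lines = ["  " + " ".join(seq_top)]
    for residue, row in zip(seq_side, grid):
        lines.append(residue + " " + " ".join("*" if cell else "." for cell in row))
    return "\n".join(lines)


print(render("GAATTC", "GATTC"))
```

Diagonal runs of "*" in the printed grid correspond to stretches that the two sequences share.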
The dot matrix method is a qualitative method. It makes sequences very simple to analyze visually.
However, it takes a long time to analyze large sequences.
Applications of the dot matrix method include detecting:
• Sequence similarity between two nucleotide sequences or two amino acid sequences.
• Insertions of short stretches in a DNA or amino acid sequence.
• Deletions of short stretches from a DNA or amino acid sequence.
• Repeats or inverted repeats in a DNA or amino acid sequence.
Fig.1: Nucleic acid dot plots.
2. Dynamic Programming Method:
Dynamic programming is a way of solving problems in which the best decisions must be found one after another.
The method was introduced by Richard Bellman in the 1950s. The word programming here denotes
finding an acceptable plan of action, not computer programming. The method compares every
pair of characters in the two sequences and generates an alignment that is the best, or optimal,
under the chosen scoring scheme.
It is useful for aligning the nucleotide sequences of DNA and the amino acid sequences of the proteins coded
by that DNA. However, it is computationally demanding, because the running time and memory grow
with the product of the two sequence lengths. Each alignment
has its own score, and it is essential to recognize that several different alignments may have
identical or nearly identical scores, so dynamic programming methods may
produce more than one optimal alignment. Careful adjustment of the scoring
parameters is therefore important and may help discriminate between alignments with similar scores.
Global alignment programs are based on the Needleman-Wunsch algorithm and local alignment programs on
the Smith-Waterman algorithm. Both algorithms are derived from the basic dynamic programming
algorithm.
Dynamic programming is a three-step process that involves:
1) Breaking the problem into small sub-problems.
2) Solving the sub-problems using recursive methods.
3) Constructing an optimal solution to the original problem from the optimal solutions of the sub-problems.
Example:
Sequence 1: G A A T T C A G T T A
Sequence 2: G G A T C G A
So M = 11 and N = 7 (the lengths of sequence #1 and sequence #2, respectively).
A simple scoring scheme is assumed where
• Si,j = 1 if the residue at position i of sequence #1 is the same as the residue at position
j of sequence #2 (match score); otherwise
• Si,j = 0 (mismatch score)
• w = 0 (gap penalty).
There are three steps in dynamic programming methods:
1. Initialization 2. Matrix fill (scoring), and 3. Traceback (alignment).
1. Initialization Step:
The first step in the global alignment dynamic programming approach is to create a matrix with
M + 1 columns and N + 1 rows, where M and N correspond to the lengths of the sequences to be
aligned.
Since this example assumes no gap penalty (w = 0), the first row and first column of the matrix
can be initially filled with 0.
2. Matrix Fill Step:
One possible (inefficient) solution of the matrix fill step finds the maximum global alignment
score by starting in the upper left hand corner in the matrix and finding the maximal score Mi,j
for each position in the matrix.
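To make the fill step concrete, the sketch below assumes the usual Needleman-Wunsch recurrence, in which each cell takes the best of three neighbouring values:

Mi,j = max( Mi-1,j-1 + Si,j , Mi,j-1 + w , Mi-1,j + w )

The Python function name fill_matrix and its parameter names are illustrative assumptions; the sequences and the scoring scheme (match = 1, mismatch = 0, w = 0) are those of the example above. Sequence #1 runs along the columns and sequence #2 along the rows, giving M + 1 columns and N + 1 rows.

```python
def fill_matrix(seq1, seq2, match=1, mismatch=0, gap=0):
    """Fill the global alignment score matrix, one cell at a time."""
    cols, rows = len(seq1) + 1, len(seq2) + 1
    M = [[0] * cols for _ in range(rows)]

    # Initialization: first row and column accumulate the gap penalty
    # (all zeros when gap = 0, as in the example).
    for c in range(1, cols):
        M[0][c] = M[0][c - 1] + gap
    for r in range(1, rows):
        M[r][0] = M[r - 1][0] + gap

    # Matrix fill: every cell depends on its diagonal, left and upper neighbours.
    for r in range(1, rows):
        for c in range(1, cols):
            s = match if seq1[c - 1] == seq2[r - 1] else mismatch
            M[r][c] = max(M[r - 1][c - 1] + s,  # match or mismatch
                          M[r][c - 1] + gap,    # gap in sequence #2
                          M[r - 1][c] + gap)    # gap in sequence #1
    return M


M = fill_matrix("GAATTCAGTTA", "GGATCGA")
for row in M:
    print(" ".join(str(value) for value in row))
print("maximum global alignment score:", M[-1][-1])
```

The value in the bottom-right cell is the maximum global alignment score for the two sequences under this scoring scheme.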
After filling in all the values the score matrix is as follows:
3. Traceback Step:
The traceback step determines the actual alignment(s) that result in the maximum score.
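A minimal Python sketch of the traceback is given below, again under the example's scoring scheme (match = 1, mismatch = 0, w = 0). The function name global_align is an illustrative assumption. The sketch refills the score matrix and then walks back from the bottom-right cell to the top-left cell. When several moves are equally good it prefers the diagonal, so it returns only one of the possibly many optimal alignments.

```python
def global_align(seq1, seq2, match=1, mismatch=0, gap=0):
    """Return (score, aligned seq1, aligned seq2) for one optimal alignment."""
    cols, rows = len(seq1) + 1, len(seq2) + 1
    M = [[0] * cols for _ in range(rows)]
    for c in range(1, cols):
        M[0][c] = M[0][c - 1] + gap
    for r in range(1, rows):
        M[r][0] = M[r - 1][0] + gap
    for r in range(1, rows):
        for c in range(1, cols):
            s = match if seq1[c - 1] == seq2[r - 1] else mismatch
            M[r][c] = max(M[r - 1][c - 1] + s, M[r][c - 1] + gap, M[r - 1][c] + gap)

    # Traceback: at each cell, step to a neighbour that explains its score.
    r, c = rows - 1, cols - 1
    top, bottom = [], []
    while r > 0 or c > 0:
        diagonal = False
        if r > 0 and c > 0:
            s = match if seq1[c - 1] == seq2[r - 1] else mismatch
            diagonal = M[r][c] == M[r - 1][c - 1] + s
        if diagonal:                                   # residues aligned
            top.append(seq1[c - 1]); bottom.append(seq2[r - 1]); r -= 1; c -= 1
        elif c > 0 and M[r][c] == M[r][c - 1] + gap:   # gap in sequence #2
            top.append(seq1[c - 1]); bottom.append("-"); c -= 1
        else:                                          # gap in sequence #1
            top.append("-"); bottom.append(seq2[r - 1]); r -= 1
    return M[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


score, aligned1, aligned2 = global_align("GAATTCAGTTA", "GGATCGA")
print(score)
print(aligned1)
print(aligned2)
```

Changing the order in which the three moves are tested can yield a different alignment with the same score.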
Giving an alignment of:
3. Word Method or K-Tuple Method:
Word methods, also known as k-tuple methods, are heuristic: they are not guaranteed to find an
optimal alignment, but they are much more efficient than dynamic programming. This
makes them useful in large-scale database searches to find whether there is a significant match
to the query sequence. The word method is used in the database search tools FASTA
and BLAST. These tools identify a series of short, non-overlapping subsequences (words) in the
query sequence, which are then matched against candidate database sequences.
In the FASTA method, the user defines a value k to use as the word length with which to search the
database. The search is slower but more sensitive at lower values of k, which are also preferred for
searches involving a very short query sequence. BLAST provides a number of algorithms
optimized for particular types of queries, such as searching for distantly related sequence matches.
It was developed as a faster alternative to FASTA. However, being a heuristic, it may miss matches
that an exhaustive search would find. Similar to FASTA, BLAST
uses a word search of length k, but evaluates only the most significant word matches rather
than every word match.
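The word-matching idea can be illustrated with a small Python sketch. It is not the FASTA or BLAST implementation: the function names index_words and seed_hits are assumptions, and for simplicity the sketch slides a window one residue at a time, so the words it extracts overlap. It only finds exact word matches (seeds); real tools go on to extend and score them.

```python
from collections import defaultdict


def index_words(seq, k):
    """Map every word of length k in seq to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(seq) - k + 1):
        index[seq[pos:pos + k]].append(pos)
    return index


def seed_hits(query, target, k):
    """Return (query position, target position) pairs of exact k-word matches."""
    index = index_words(target, k)
    hits = []
    for q_pos in range(len(query) - k + 1):
        word = query[q_pos:q_pos + k]
        for t_pos in index.get(word, []):
            hits.append((q_pos, t_pos))
    return hits


print(seed_hits("GATC", "GGATCGA", 3))  # word length k = 3
```

A smaller k produces more seeds to examine, which is one way to see why lower values of k make the search slower but more sensitive.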
B. Multiple Sequence Alignment Method:
In a multiple sequence alignment, homologous residues among a set of sequences are aligned
together in columns. Here, homologous is meant in both the structural and evolutionary sense.
Multiple sequence alignment (MSA) is generally the alignment of three or more biological
sequences (protein or nucleic acid) of similar length. From the output, homology can be inferred
and the evolutionary relationships between the sequences can be studied.
Types of MSA methods: The following are the multiple sequence alignment methods:
1. Dynamic Programming approach, 2. Progressive method and 3. Iterative method.
1. Dynamic Programming approach:
In principle, dynamic programming can be applied to align any number of sequences. It computes an optimal
alignment for a given scoring function. However, because its running time grows exponentially with
the number of sequences, it is not typically used in practice.
2. Progressive method:
In this method, pairwise global alignment is first performed for all possible pairs of sequences. The
sequences are then aligned together on the basis of their similarity.
The most similar sequences are aligned first, and then less related sequences are added
progressively, one by one, until the whole query set has been incorporated into a complete
multiple alignment. This method is also called the hierarchical method or tree method.
The progressive method is one of the fastest approaches, considerably faster than adapting
pairwise dynamic programming to multiple sequences, which becomes very slow for
more than a few sequences.
One major disadvantage of this method is its reliance on a good alignment of the first two
sequences. Errors there can propagate throughout the rest of the process. An alternative
approach is the iterative method.
Steps involved in Multiple Sequence alignment are as follows:
A. Pairwise sequence alignment:
B. Multiple Sequence Alignment following the tree from A.
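Step A can be sketched in Python as follows. This is a simplified illustration, not the procedure of any particular program: the function names global_score and addition_order, the sequence names and the sequences themselves are all assumptions, and the scoring (match = 1, mismatch = 0, gap = 0) is borrowed from the dynamic programming example. Instead of building a full guide tree, the sketch only decides the order in which sequences would be added: the most similar pair first, then whichever remaining sequence is most similar to one already chosen.

```python
from itertools import combinations


def global_score(a, b, match=1, mismatch=0, gap=0):
    """Global alignment score of a and b, keeping one matrix row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s, cur[j - 1] + gap, prev[j] + gap))
        prev = cur
    return prev[-1]


def addition_order(seqs):
    """Return sequence names in the order a simple progressive scheme adds them."""
    names = list(seqs)
    pairs = list(combinations(names, 2))
    # Step A: score every pair of sequences.
    scores = {frozenset(p): global_score(seqs[p[0]], seqs[p[1]]) for p in pairs}
    # Start from the most similar pair.
    order = list(max(pairs, key=lambda p: scores[frozenset(p)]))
    remaining = [n for n in names if n not in order]
    # Add the remaining sequences, closest to the growing set first.
    while remaining:
        best = max(remaining,
                   key=lambda n: max(scores[frozenset((n, m))] for m in order))
        order.append(best)
        remaining.remove(best)
    return order


example = {"s4": "CCCCGGG", "s3": "GTTTAAA", "s1": "GATTACA", "s2": "GATTACC"}
print(addition_order(example))
```

Practical programs usually cluster the pairwise scores into a guide tree and align along its branches; the greedy ordering here only conveys the "most similar first" principle.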
3. Iterative Method:
This method performs a series of steps that produce successively better approximations of the
alignment of many sequences. Unlike the progressive method, it does not treat the initial
pairwise alignments as fixed. Instead, the multiple sequence alignment is repeatedly revised,
starting with the pairwise re-alignment of sequences within subgroups and then the re-alignment
of the subgroups. The
choice of subgroups can be made via sequence relations on the guide tree, random selection,
and so on.
Iterative methods attempt to improve on the weak point of the progressive methods: the heavy
dependence on the accuracy of the initial pairwise alignment. The iterative method is an
optimization method and may use approaches such as genetic algorithms and
Hidden Markov Models. Its disadvantages are inherited from optimization
methods, i.e., the process can get trapped in local optima and can be much slower.
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx1.4 modern child centered education - mahatma gandhi-2.pptx
1.4 modern child centered education - mahatma gandhi-2.pptx
 
Instructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptxInstructions for Submissions thorugh G- Classroom.pptx
Instructions for Submissions thorugh G- Classroom.pptx
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
Home assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdfHome assignment II on Spectroscopy 2024 Answers.pdf
Home assignment II on Spectroscopy 2024 Answers.pdf
 
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...Cambridge International AS  A Level Biology Coursebook - EBook (MaryFosbery J...
Cambridge International AS A Level Biology Coursebook - EBook (MaryFosbery J...
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 

Bioinformatics_Sequence Analysis

  • 1. Sequence Alignment in Bioinformatics:
Introduction:
In bioinformatics, a sequence alignment is a way of arranging sequences of DNA, RNA, or protein to identify regions of similarity that may indicate functional, structural, or evolutionary relationships between the sequences. It is an important first step toward the structural and functional analysis of newly determined sequences. An alignment is made between a known sequence and an unknown sequence, or between two unknown sequences. The known sequence is called the reference sequence and the unknown sequence is called the query sequence.
As new biological sequences are being generated at an exponential rate, sequence comparison is becoming increasingly important for drawing functional and evolutionary inferences.
Types of Sequence Alignment:
Sequence alignment is of two types, namely:
1. Global Alignment, and 2. Local Alignment
1. Global Alignment:
Global alignment is a matching of the residues of two sequences across their entire length. It works best for sequences that are closely related and of similar length. Global alignment programs are based on the Needleman-Wunsch algorithm.
In global alignment, the two sequences to be aligned are assumed to be generally similar over their entire length. Alignment is carried out from the beginning to the end of both sequences to find the best possible alignment across their entire length.
Applications of global sequence alignment are:
• Comparing two genes with the same function (e.g., in human vs. mouse).
• Comparing two proteins with similar function.
BOTMT:604 Bioinformatics and Biophysics
Prepared By- Dr. Sangeeta Das. Assistant Professor, Department of Botany, Bahona College, Jorhat, Assam, India.
  • 2. 2. Local Alignment:
Local alignment is a matching between the regions of two sequences that are most similar to each other. Local alignment programs are based on the Smith-Waterman algorithm.
Unlike global alignment, local alignment does not assume that the two sequences in question are similar over their entire length. It finds only the local regions with the highest level of similarity between the two sequences and aligns these regions without regard for the alignment of the rest of the sequences.
Applications of local sequence alignment are:
• Searching for local similarities in large sequences (e.g., newly sequenced genomes).
• Looking for conserved domains or motifs in two proteins.
Methods of Sequence Alignment:
There are two methods of sequence alignment:
A. Pairwise Sequence Alignment method, and B. Multiple Sequence Alignment method.
A. Pairwise Sequence Alignment method:
Pairwise sequence alignment methods are used to find the best-matching local or global alignment of two query sequences. Pairwise alignments can only be made between two sequences at a time, but they are efficient to calculate.
The three primary methods of producing pairwise alignments are:
1. Dot matrix method (the oldest method)
2. Dynamic programming (DP) algorithm (the more advanced method)
3. Word or k-tuple methods
1. Dot Matrix Method:
A dot matrix is a grid in which the matching nucleotides of two DNA sequences are represented as dots. It is also known as a dot plot.
In a dot matrix, the nucleotides of one sequence are written from left to right along the top row and those of the other sequence are written from top to bottom along the left side (column) of the matrix. At every point where the two nucleotides are the same, a dark dot is placed at the intersection of the row and column; all other cells are left blank.
Together, these dots form a graph called a dot plot, which is a type of recurrence plot. Each dot in the plot represents a matching nucleotide or amino acid, and runs of dots along a diagonal indicate regions of similarity between the two sequences.
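The construction of the dot matrix described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `dot_matrix` and `render` are hypothetical, and every single matching residue is marked, with no window or threshold filtering of the kind real dot plot tools usually apply.

```python
def dot_matrix(seq_top, seq_left):
    """Return a grid of booleans: True where the two residues match.

    Rows follow seq_left (written top to bottom), columns follow
    seq_top (written left to right), as in the description above.
    """
    return [[a == b for b in seq_top] for a in seq_left]


def render(seq_top, seq_left):
    """Draw the grid as text: '*' is a dot, '.' is an empty cell."""
    grid = dot_matrix(seq_top, seq_left)
    lines = ["  " + " ".join(seq_top)]          # header row with sequence 1
    for residue, row in zip(seq_left, grid):
        cells = " ".join("*" if hit else "." for hit in row)
        lines.append(residue + " " + cells)     # sequence 2 residue, then its row
    return "\n".join(lines)


if __name__ == "__main__":
    # Example sequences are illustrative only.
    print(render("GAATTCAGTTA", "GGATCGA"))
```

In the printed grid, a run of `*` characters along a diagonal corresponds to a stretch that the two sequences share.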
  • 3. The dot matrix method is a qualitative method. It makes sequences very simple to analyze visually. However, it takes a long time to analyze large sequences.
Applications of the dot matrix method are:
• Detecting sequence similarity between two nucleotide sequences or two amino acid sequences.
• Detecting the insertion of short stretches in a DNA or amino acid sequence.
• Detecting the deletion of short stretches from a DNA or amino acid sequence.
• Detecting repeats or inverted repeats in a DNA or amino acid sequence.
Fig.1: Nucleic acid dot plots.
2. Dynamic Programming Method:
Dynamic programming is a way of solving problems in which the best decisions must be found one after another. The method was introduced by Richard Bellman in the 1950s. The word "programming" here denotes finding an acceptable plan of action, not computer programming.
The method compares every pair of characters in the two sequences and generates an alignment that is the best, or optimal, under the chosen scoring scheme. It is useful for aligning the nucleotide sequences of DNA and the amino acid sequences of the proteins coded by that DNA. However, it is a highly computationally demanding method.
Each alignment has its own score, and several different alignments may have identical or nearly identical scores, so dynamic programming methods may produce more than one optimal alignment. Careful choice of the scoring parameters is therefore important and may help to discriminate between alignments with similar scores.
  • 4. Global alignment programs are based on the Needleman-Wunsch algorithm and local alignment programs on the Smith-Waterman algorithm. Both algorithms are derived from the basic dynamic programming algorithm.
Dynamic programming is a three-step process that involves:
1) Breaking the problem into small sub-problems.
2) Solving the sub-problems using recursive methods.
3) Constructing the optimal solution to the original problem from the optimal solutions to the sub-problems.
Example:
Alignment:
Sequence 1: G A A T T C A G T T A
Sequence 2: G G A T C G A
So M = 11 and N = 7 (the lengths of sequence #1 and sequence #2, respectively).
A simple scoring scheme is assumed where
• S(i,j) = 1 if the residue at position i of sequence #1 is the same as the residue at position j of sequence #2 (match score); otherwise
• S(i,j) = 0 (mismatch score)
• w = 0 (gap penalty).
There are three steps in dynamic programming methods:
1. Initialization
2. Matrix fill (scoring), and
3. Traceback (alignment).
1. Initialization Step:
The first step in the global alignment dynamic programming approach is to create a matrix with M + 1 columns and N + 1 rows, where M and N correspond to the lengths of the sequences to be aligned. The matrix can initially be filled with 0.
2. Matrix Fill Step:
One possible (inefficient) solution of the matrix fill step finds the maximum global alignment score by starting in the upper left-hand corner of the matrix and finding the maximal score M(i,j) for each position in the matrix.
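The initialization and matrix fill steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the exact procedure used on the slides: the function name `nw_fill` is hypothetical, the gap value w is added to the score for every gap (with w = 0 as in the example), and each cell is taken to be the maximum of the three usual Needleman-Wunsch candidates: the diagonal neighbour plus S(i,j), the cell above plus w, and the cell to the left plus w.

```python
def nw_fill(seq1, seq2, match=1, mismatch=0, gap=0):
    """Build the global alignment score matrix.

    The matrix has len(seq2) + 1 rows and len(seq1) + 1 columns,
    i.e. N + 1 rows and M + 1 columns in the notation of the example.
    """
    m, n = len(seq1), len(seq2)

    # Initialization: first row and first column hold cumulative gap scores
    # (all zeros when gap == 0).
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        score[i][0] = i * gap

    # Matrix fill: work from the upper left corner, row by row.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,   # align the two residues
                score[i - 1][j] + gap,     # gap in sequence 1
                score[i][j - 1] + gap,     # gap in sequence 2
            )
    return score


if __name__ == "__main__":
    matrix = nw_fill("GAATTCAGTTA", "GGATCGA")
    for row in matrix:
        print(" ".join(str(v) for v in row))
    print("maximum global alignment score:", matrix[-1][-1])
```

The bottom right cell of the returned matrix holds the maximum global alignment score for the two sequences.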
  • 5. After all the values have been filled in, the score matrix is as follows:
[Figure: completed score matrix]
3. Traceback Step:
The traceback step determines the actual alignment(s) that result in the maximum score.
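A traceback over the filled matrix can be sketched as below. The code repeats the matrix fill so that it runs on its own. The function name `global_align` and the tie-breaking order (diagonal first, then up, then left) are assumptions made for illustration; because several paths can share the maximum score, this sketch returns only one of the possible optimal alignments.

```python
def global_align(seq1, seq2, match=1, mismatch=0, gap=0):
    """Return (aligned_seq1, aligned_seq2, score) for one optimal global alignment."""
    m, n = len(seq1), len(seq2)

    # Initialization and matrix fill (rows follow seq2, columns follow seq1).
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Traceback: start at the bottom right cell and walk back to the top left,
    # at each step choosing a neighbour that explains the current score.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                top.append(seq1[j - 1])
                bottom.append(seq2[i - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append("-")                 # gap in sequence 1
            bottom.append(seq2[i - 1])
            i -= 1
        else:
            top.append(seq1[j - 1])
            bottom.append("-")              # gap in sequence 2
            j -= 1

    # The alignment was built back to front, so reverse it.
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


if __name__ == "__main__":
    a, b, total = global_align("GAATTCAGTTA", "GGATCGA")
    print(a)
    print(b)
    print("score:", total)
```

With the example scoring scheme (match 1, mismatch 0, gap 0), the score equals the number of columns in which the two aligned residues are identical.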
  • 6. Giving an alignment of:
[Figure: resulting alignment of sequence 1 and sequence 2]
3. Word Method or K-Tuple Method:
Word methods are heuristic: they are not guaranteed to find an optimal alignment, but they are much more efficient than dynamic programming. They are useful in large-scale database searches to find out whether there is a significant match to the query sequence.
Word methods are used in the database search tools FASTA and BLAST. They identify a series of short, non-overlapping subsequences (words) of the query sequence, which are then matched against the database sequences.
In the FASTA method, the user defines a value k to use as the word length with which to search the database. The method is slower but more sensitive at lower values of k, which are also preferred for searches involving a very short query sequence.
BLAST provides a number of algorithms optimized for particular types of queries, such as searching for distantly related sequence matches. It is a good alternative to FASTA. However, because it is a heuristic, its results are not guaranteed to be optimal. Like FASTA, BLAST uses a word search of length k, but it evaluates only the most significant word matches rather than every word match.
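The core idea of a word search, looking up words of length k from the query in an index of the database sequence, can be sketched as follows. This is not the FASTA or BLAST algorithm: the function names `word_hits` and `best_diagonal` are hypothetical, only exact word matches are counted, and for simplicity the sketch slides a window over every position of both sequences instead of using non-overlapping words.

```python
from collections import defaultdict


def word_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a k-letter word matches exactly."""
    # Index every k-letter word of the subject by its start position.
    index = defaultdict(list)
    for pos in range(len(subject) - k + 1):
        index[subject[pos:pos + k]].append(pos)

    # Look up every k-letter word of the query in that index.
    hits = []
    for qpos in range(len(query) - k + 1):
        for spos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, spos))
    return hits


def best_diagonal(hits):
    """Return (offset, hit_count) for the diagonal holding the most word hits.

    Hits with the same subject_pos - query_pos lie on one diagonal of the
    dot matrix, so a crowded diagonal suggests a region worth aligning.
    """
    counts = defaultdict(int)
    for qpos, spos in hits:
        counts[spos - qpos] += 1
    if not counts:
        return None
    return max(counts.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Illustrative sequences only.
    hits = word_hits("GATC", "TTGATCAA", k=3)
    print(hits)
    print(best_diagonal(hits))
```

A smaller k produces more hits to examine, which is one way to see why lower values of k are slower but more sensitive.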
  • 7. B. Multiple Sequence Alignment Method:
In a multiple sequence alignment, homologous residues among a set of sequences are aligned together in columns. Here, homologous is meant in both the structural and the evolutionary sense. Multiple sequence alignment (MSA) is generally the alignment of three or more biological sequences (protein or nucleic acid) of similar length. From the output, homology can be inferred and the evolutionary relationships between the sequences can be studied.
Types of MSA methods:
The following are the multiple sequence alignment methods:
1. Dynamic programming approach,
2. Progressive method, and
3. Iterative method.
1. Dynamic Programming approach:
Dynamic programming can in principle be applied to align any number of sequences. It computes an optimal alignment for a given score function. However, because its running time grows very rapidly with the number of sequences, it is not typically used in practice.
2. Progressive method:
In this method, pairwise global alignment is performed for all possible pairs of sequences. The sequences are then aligned together on the basis of their similarity: the most similar sequences are aligned first, and less related sequences are added progressively, one by one, until all the sequences in the query set have been incorporated. This method is also called the hierarchical method or tree method.
The progressive method is one of the fastest approaches, considerably faster than adapting pairwise alignments to multiple sequences. However, it can still become slow as the number of sequences grows. One major disadvantage of this method is its reliance on a good alignment of the first two sequences: errors there can propagate throughout the rest of the process. An alternative approach is the iterative method.
Steps involved in multiple sequence alignment are as follows:
A. Pairwise sequence alignment:
  • 8. B. Multiple sequence alignment following the tree from A.
3. Iterative Method:
This method performs a series of steps that produce successively better approximations to the alignment of many sequences. Unlike the progressive method, it does not rely on a fixed initial pairwise alignment. Instead, the multiple sequence alignment is re-iterated, starting with the pairwise re-alignment of sequences within subgroups and followed by the re-alignment of the subgroups. The choice of subgroups can be made via sequence relations on the guide tree, random selection, and so on.
Iterative methods attempt to improve on the weak point of the progressive methods: the heavy dependence on the accuracy of the initial pairwise alignment. The iterative method is an optimization method and may use machine learning approaches such as genetic algorithms and Hidden Markov Models.
The disadvantages of the iterative method are inherited from optimization methods, i.e., the process can get trapped in local minima and can be much slower.
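The first stage of the progressive method described earlier, comparing all pairs and deciding the order in which sequences are joined, can be sketched as below. This is only the ordering stage, not a full multiple sequence aligner. The function names `pair_score` and `joining_order` are hypothetical, similarity is measured with the simple match 1, mismatch 0, gap 0 score divided by the longer sequence length, and a greedy order stands in for a proper guide tree.

```python
from itertools import combinations


def pair_score(a, b):
    """Global alignment score with match=1, mismatch=0, gap=0 (two-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(max(prev[j - 1] + (x == y), prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def joining_order(seqs):
    """Return sequence names in the order they would be added to the alignment.

    seqs maps a name to its sequence and must hold at least two entries.
    The most similar pair comes first; each later sequence is the one most
    similar to any sequence already chosen.
    """
    names = list(seqs)
    sim = {}
    for p, q in combinations(names, 2):
        longest = max(len(seqs[p]), len(seqs[q]))
        sim[frozenset((p, q))] = pair_score(seqs[p], seqs[q]) / longest

    first = max(combinations(names, 2), key=lambda pair: sim[frozenset(pair)])
    order = list(first)
    remaining = [name for name in names if name not in order]
    while remaining:
        nxt = max(remaining,
                  key=lambda name: max(sim[frozenset((name, o))] for o in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


if __name__ == "__main__":
    # Illustrative sequences only.
    example = {"a": "GATTACA", "b": "GATTACA", "c": "GATCACA", "d": "TTTTTTT"}
    print(joining_order(example))
```

Because everything that follows is built on the first pair chosen here, a poor choice at this stage carries through the rest of the alignment, which is the weakness the iterative method tries to address.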