SEQUENCE ALIGNMENT
Presented by,
Poornima. M.S
18TUSLE014
4th semester
Life science
Guided by,
Ms. Priyanka Swamy
Faculty
Dept. of Life Science
CONTENTS
1. Sequence alignment
2. Introduction
3. Why is alignment needed?
4. Principles
5. Goals
6. Example Alignment
7. Tools
8. Sequence alignment Problems
9. Types in Sequence alignment
10. Issues
11. Advantages
12. References.
INTRODUCTION
• A sequence alignment is a way of arranging the primary sequences of
DNA, RNA or proteins to identify regions of similarity that may be a
consequence of FUNCTIONAL, STRUCTURAL or EVOLUTIONARY relationships
between the sequences.
• Sequence alignment is the procedure of comparing two sequences (pairwise
alignment) or more (multiple sequence alignment) by searching for a series of
individual characters/patterns that are in the same order in the sequences.
• It can also be defined as “an alignment made between a known
sequence & an unknown sequence, or between 2 unknown sequences”.
Contd..
• Here, the known sequence is called the “REFERENCE SEQUENCE” and the
unknown sequence is called the “QUERY SEQUENCE”.
SEQUENCE ALIGNMENT
• Motivation: Assess the similarity of
sequences & learn about their
evolutionary relationship.
• Sequence homology: 2 or more
sequences are homologous if they
evolved from a common ancestor.
• A good alignment is one with few
substitutions & indels (insertions/deletions).
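To make "few substitutions & indels" concrete, the columns of a gapped pairwise alignment can be counted by type. The following is a minimal sketch, not part of the original slides; the function name `alignment_stats` and the use of "-" as the gap character are assumptions made for illustration. It counts gap columns, not separate gap events.

```python
def alignment_stats(row_a: str, row_b: str, gap: str = "-") -> dict:
    """Count matches, substitutions and gap columns in a gapped pairwise alignment.

    row_a and row_b are the two aligned rows (same length, gaps written as `gap`).
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    stats = {"matches": 0, "substitutions": 0, "indels": 0}
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            stats["indels"] += 1          # a residue facing a gap
        elif a == b:
            stats["matches"] += 1         # identical residues
        else:
            stats["substitutions"] += 1   # two different residues
    return stats


# Hypothetical example rows, chosen only for illustration.
print(alignment_stats("GATTACA-", "GCTTA-AT"))
```

Under this view, an alignment with more matches and fewer substitution or gap columns is the better one.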
WHY IS ALIGNMENT NEEDED?
• We need to be able to compare
sequences for similarities &
differences.
• Often what we are looking for are
not exact matches, but similarities.
• Homology: similarity due to
descent from a common ancestor.
• We can sometimes infer structure/
function from sequence similarity.
PRINCIPLES
• Alignment can reveal HOMOLOGY between sequences.
• Similarity is a descriptive term that tells about the degree of
match between 2 sequences.
• Sequence similarity doesn’t always imply a common function.
• Conserved function doesn’t always imply similarity at the
sequence level.
• Convergent evolution: sequences are highly similar, but are not
homologous.
GOALS OF SEQUENCE ALIGNMENT
• To identify conserved regions & differences.
• To see whether a substring in one sequence
aligns well with a substring in the other.
EXAMPLE ALIGNMENT: GLOBINS
• The figure at right shows the prototypical structure
of globins.
• The figure below shows part of an alignment of
8 globins.
TOOLS INVOLVED IN ALIGNMENT
1. MUSCLE
2. MView
3. T-Coffee
4. ClustalW
5. PyMOL
6. SABERTOOTH
7. Satsuma
SEQUENCE ALIGNMENT PROBLEMS
• Number of sequences:
✓ 2 sequences: pairwise alignment
✓ >2 sequences: multiple sequence alignment (MSA)
• Which part to align?
✓ Whole sequence: global alignment
✓ Parts of a sequence: local alignment
• How to compute similarity?
✓ Ways to compute substitution scores
✓ Ways to compute gap penalties
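One common way to combine substitution scores and gap penalties is to sum a score over the columns of an alignment, charging more for opening a gap than for extending it (an affine gap penalty). The sketch below is illustrative only: the function name `score_alignment` and all numeric values are assumptions, and real tools usually take substitution scores from a matrix such as PAM or BLOSUM instead of a flat match/mismatch pair. Here a gap of length k costs gap_open + (k - 1) * gap_extend.

```python
def score_alignment(row_a: str, row_b: str, match: int = 1, mismatch: int = -1,
                    gap_open: int = -2, gap_extend: int = -1, gap: str = "-") -> int:
    """Score a gapped pairwise alignment with simple substitution scores
    and an affine gap penalty (all default values are illustrative)."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    prev_gap_row = None  # which row carried a gap in the previous column, if any
    for a, b in zip(row_a, row_b):
        if a == gap or b == gap:
            gap_row = "a" if a == gap else "b"
            # Continuing a gap in the same row is cheaper than opening a new one.
            total += gap_extend if gap_row == prev_gap_row else gap_open
            prev_gap_row = gap_row
        else:
            total += match if a == b else mismatch
            prev_gap_row = None
    return total


# Hypothetical example: one gap of length 2 costs -2 + -1 = -3.
print(score_alignment("AC--GT", "ACTTGT"))
```

Setting gap_extend equal to gap_open gives a linear gap penalty, where every gap column costs the same.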
TYPES OF SEQUENCE ALIGNMENT
There are 2 types of sequence alignment:
1. PAIRWISE ALIGNMENT: a method used
to find the best-matching (global or local)
alignment of 2 sequences at
a time.
2. MULTIPLE SEQUENCE ALIGNMENT: an
extension of pairwise alignment that incorporates
3 or more sequences of similar length at a time.
PAIRWISE ALIGNMENT TYPES
There are 2 major types of pairwise alignment:
1. Global alignment.
2. Local alignment.
GLOBAL ALIGNMENT
• The alignment is stretched over the entire sequence length
to include as many matching amino acids as possible, up to
and including the sequence ends. Vertical lines between the
sequences indicate identical amino acids.
• Tool: EMBOSS Needle.
• Example algorithm: Needleman-Wunsch.
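The Needleman-Wunsch idea can be sketched in a few lines of dynamic programming. This is a teaching sketch under stated assumptions, not the EMBOSS Needle implementation: it uses a flat match/mismatch score and a linear gap penalty (the values 1, -1, -1 are illustrative), and when several alignments are optimal it returns only one of them.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap            # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap            # b[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Traceback from the bottom-right corner to the top-left corner,
    # so the alignment always covers both sequences end to end.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


# Hypothetical example sequences.
print(needleman_wunsch("ACGT", "AGT"))
```

The table has (len(a) + 1) x (len(b) + 1) cells, so time and memory both grow with the product of the two sequence lengths.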
LOCAL ALIGNMENT
• The alignment tends to stop at the ends of regions of
identity or strong similarity. A much higher priority is given to
finding these local regions than to extending the alignment to
include more neighboring amino acid pairs.
• Tool: BLAST (a fast heuristic for local alignment).
• Example algorithm: Smith-Waterman.
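A minimal Smith-Waterman sketch shows how the local version differs from the global one: scores are never allowed to drop below zero, and the traceback starts at the highest-scoring cell and stops at the first zero. This is an illustrative sketch, not how BLAST works internally (BLAST uses a seed-and-extend heuristic); the scoring values 2, -1, -2 are assumptions, and only one best local alignment is returned.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Local alignment of a and b; returns (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,                           # start a fresh local alignment
                              score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Traceback from the best cell until the score falls to zero.
    row_a, row_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(row_a)), "".join(reversed(row_b))


# Hypothetical example: only the shared core "ACGT" is reported.
print(smith_waterman("TTTACGTAAA", "GGGACGTCCC"))
```

The dissimilar flanks are simply left out of the result, which is the behavior described in the first bullet of this slide.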
PAIRWISE ALIGNMENT IN MSA
• The most practical & widely used methods in multiple sequence
alignment are hierarchical extensions of pairwise alignment
methods.
• The principle is that multiple alignments are achieved by
successive application of pairwise methods.
Alignments help to analyze
sequence data:
organize & visualize.
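The idea of building a multiple alignment by repeated pairwise alignment can be sketched with a star alignment, a simpler relative of the guide-tree (progressive) methods used by tools such as ClustalW. Everything here is an assumption made for illustration: the first sequence is taken as the center, scoring is a flat match/mismatch with a linear gap penalty, and the helper names `global_align`, `add_to_msa` and `star_align` are hypothetical.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped rows."""
    n, m = len(a), len(b)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        s[i][0] = i * gap
    for j in range(1, m + 1):
        s[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            s[i][j] = max(s[i - 1][j - 1] + sub, s[i - 1][j] + gap, s[i][j - 1] + gap)
    row_a, row_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1] + sub:
            row_a.append(a[i - 1]); row_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and s[i][j] == s[i - 1][j] + gap:
            row_a.append(a[i - 1]); row_b.append("-"); i -= 1
        else:
            row_a.append("-"); row_b.append(b[j - 1]); j -= 1
    return "".join(reversed(row_a)), "".join(reversed(row_b))


def add_to_msa(msa, center_row, new_row):
    """Merge one pairwise alignment (center vs. new sequence) into the MSA.

    msa[0] is the center as currently gapped; gaps already placed are kept
    ("once a gap, always a gap").
    """
    old_center = msa[0]
    merged = [""] * (len(msa) + 1)
    i = j = 0
    while i < len(old_center) or j < len(center_row):
        if i < len(old_center) and old_center[i] == "-":
            # Existing gap column in the center: the new sequence gets a gap too.
            for k, row in enumerate(msa):
                merged[k] += row[i]
            merged[-1] += "-"
            i += 1
        elif j < len(center_row) and center_row[j] == "-":
            # New insertion relative to the center: all existing rows get a gap.
            for k in range(len(msa)):
                merged[k] += "-"
            merged[-1] += new_row[j]
            j += 1
        else:
            # Same center residue in both alignments: copy the column across.
            for k, row in enumerate(msa):
                merged[k] += row[i]
            merged[-1] += new_row[j]
            i += 1
            j += 1
    return merged


def star_align(seqs):
    """Align every sequence to seqs[0] and merge the pairwise results."""
    center = seqs[0]
    msa = [center]
    for seq in seqs[1:]:
        center_row, new_row = global_align(center, seq)
        msa = add_to_msa(msa, center_row, new_row)
    return msa


# Hypothetical input sequences.
for row in star_align(["ACGT", "AGT", "ACGGT"]):
    print(row)
```

Progressive tools refine this idea by first estimating a guide tree from pairwise distances and then aligning the most similar sequences first.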
ISSUES IN SEQUENCE ALIGNMENT
• The sequences we are comparing probably differ in length.
• There may be only relatively small regions in the sequences that
match.
• Variable-length regions may have been inserted into or deleted from the
common ancestral sequence.
ADVANTAGES OF SEQUENCE ALIGNMENT
• Sequences of different lengths can be compared.
• Long sequences containing both coding and non-coding
regions can be compared.
• Proteins from different protein families can be compared
to find conserved domains.
• E-values can be determined.
• Minor differences between 2 sequences can be checked.
• The complete sequence is easy to read in the output.
• Functional orthologs can be detected.
REFERENCES
• J. Daugelaite, A. O'Driscoll, and R. D. Sleator, “An overview of multiple
sequence alignments and cloud computing in bioinformatics,” 2013,
DOI: 10.1155/2013/6156300.
• C. B. Do and K. Katoh, “Protein multiple sequence alignment,” Methods in
Molecular Biology, vol. 484, pp. 379–413, 2008.
• R. Chenna, H. Sugawara, T. Koike, R. Lopez, T. J. Gibson, D. G. Higgins, and
J. D. Thompson, “Multiple sequence alignment with the Clustal series of
programs,” Nucleic Acids Research, vol. 31, pp. 3497–3500, 2003.
• J. D. Thompson, F. Plewniak, and O. Poch, “A comprehensive comparison of
multiple sequence alignment programs,” Nucleic Acids Research, vol. 27, no. 13,
pp. 2682–2690, 1999.
• https://www.ncbi.nlm.nih.gov/protein/?term=acidic+ribosomal+protein+po+l10e
“How can WORDS not matter
the foremost, when our DNA is a sequence of
LETTERS?”
