FASTA
Amandeep Singh
Assistant Professor
Department of Biotechnology
GSSDGS Khalsa College Patiala
Introduction
FASTA searches a biological database for sequences similar to a
nucleotide or protein query sequence.
Nucleotide Sequence (Query)
Protein Sequence (Query)
Nucleotide Sequence (Database)
Protein Sequence (Database)
FASTA Algorithm
It starts from a dot plot (dot matrix).
[Figure: dot plot of the first sequence (query) against the second sequence (database); the axes carry the residues A B C D E F and A B M D L F. Regions of similarity between the 2 sequences are represented as diagonals.]
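As a rough illustration of the dot plot idea, the sketch below marks every position where a residue of one sequence equals a residue of the other. The function names `dot_plot` and `render` are made up for this example, and assigning the query to the rows is an assumption, not something the slide specifies.

```python
def dot_plot(query, database):
    """Return a matrix with 1 where query[i] == database[j], else 0."""
    return [[1 if q == d else 0 for d in database] for q in query]


def render(query, database):
    """Draw the dot plot as text: '*' for a match, '.' for a mismatch."""
    matrix = dot_plot(query, database)
    lines = ["  " + " ".join(database)]  # header row: database residues
    for residue, row in zip(query, matrix):
        cells = " ".join("*" if cell else "." for cell in row)
        lines.append(residue + " " + cells)
    return "\n".join(lines)


# The two sequences shown on the slide's axes.
print(render("ABMDLF", "ABCDEF"))
```

Runs of `*` along a diagonal correspond to stretches where the two sequences agree without gaps.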
FASTA Algorithm
• FASTA goes a step beyond the dot plot.
• It calculates the sum of dots along each diagonal.
• It is a “word”-based method.
• It looks for matching “words”, short sequence patterns called “k-tuples”.
Tuple: a finite ordered list of elements
Sequence patterns: 1 or 2 amino acids, or 5 or 6 nucleotides
• Build local alignments using these “words” or “k-tuples”:
• Match identical “words”.
• Create diagonals by joining adjacent matches.
• Rescore the highest-scoring regions using a PAM or BLOSUM matrix.
• The best of these scores is called init1.
• Join segments using gaps; the best score from this step is called initn.
• Use dynamic programming (Smith-Waterman algorithm) to create the optimal alignment.
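The word-matching and diagonal-summing steps can be sketched as follows. This is a minimal illustration, not the FASTA source code: the helper names (`ktuple_index`, `diagonal_hits`, `best_diagonal`) and the example sequences are assumptions, and the sketch stops before the PAM/BLOSUM rescoring that produces init1.

```python
from collections import defaultdict


def ktuple_index(seq, k):
    """Map every k-tuple (word) in seq to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_hits(query, subject, k):
    """Count identical k-tuple matches on each diagonal.

    A diagonal is identified by the offset (subject position - query position).
    """
    index = ktuple_index(query, k)
    hits = defaultdict(int)
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            hits[j - i] += 1
    return dict(hits)


def best_diagonal(query, subject, k):
    """Return (offset, count) for the diagonal with the most word matches."""
    hits = diagonal_hits(query, subject, k)
    if not hits:
        return None
    return max(hits.items(), key=lambda item: item[1])


# Hypothetical protein fragments, k-tuple size 2.
print(best_diagonal("HARFYAAQIVL", "VDMAAQIA", 2))
```

Indexing the query once means each database word is looked up in constant time, which is why the word-based step is much cheaper than filling a full dot matrix.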
FASTA Algorithm
FASTA Implementation
FASTA3 (https://www.ebi.ac.uk/Tools/sss/fasta/) at the EBI is one of
the most popular FASTA implementations.
FASTA Output
• The Histogram
• The Sequence listing
• The Local alignments
FASTA Output
The Histogram
• The first part of the FASTA output is the histogram.
• The predicted extreme-value distribution is represented by the asterisk (*) symbol.
• The actual numbers obtained are represented by the equals (=) sign.
• First column: z-opt score
• Second column: number of sequences with these z-opt scores
• Third column: expected number of alignments
The histogram is used to determine whether the statistical theory is valid.
• If the equals signs follow the predicted values → valid
• If the equals signs do not follow the predicted values → invalid
FASTA Output: The Histogram
FASTA Output: The Sequence listing
• Listing of the best scoring sequences in the database.
• Best sequence: reported first
• Worst sequence: reported last
• First column: database, database accession number, and database identifier
• Second column: total length of the database sequence
• Opt column: final score
• Last column: E-value
FASTA Output: The Sequence listing
FASTA Output: The Local alignments
Display:
• The local alignment
• init1 and initn scores
• E-value
• Opt score
• Z-score
• Percent identity
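To make the idea of an optimized local alignment score more concrete, here is a minimal Smith-Waterman scoring sketch. It assumes a simple match/mismatch scheme and a linear gap penalty instead of a PAM or BLOSUM matrix. The parameter values are illustrative and the function name is hypothetical; it is not how FASTA itself is implemented.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Uses two rows of the dynamic programming matrix, so memory is O(len(b)).
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("HARFYAAQIVL", "VDMAAQIA"))
```

Only the score is computed here; recovering the alignment itself would require keeping the full matrix and tracing back from the best cell.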
Significance of E-Value
• The E-value (expected value) is the number of alignments with an
equal or better score that are expected to occur by chance.
• The smaller the E-value, the less likely it is that a given alignment
occurred by chance.
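The relationship between a score, the database size, and the E-value can be sketched under explicit assumptions. Here the normalized score `z` is assumed to follow an extreme value (Gumbel) distribution with mean 0 and variance 1, and the E-value is taken as the tail probability multiplied by the number of database sequences. The function names are hypothetical, and FASTA's own score scaling may differ.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def tail_probability(z):
    """P(Z > z) for a Gumbel distribution with mean 0 and variance 1."""
    inner = math.exp(-(math.pi / math.sqrt(6)) * z - EULER_GAMMA)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-inner)


def e_value(z, database_size):
    """Expected number of database sequences scoring above z by chance."""
    return database_size * tail_probability(z)


# The same normalized score is less significant in a larger database.
for size in (1_000, 100_000):
    print(size, e_value(6.0, size))
```

Because the E-value scales with database size, an alignment that looks significant in a small database may not be significant in a large one.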
Variants of FASTA
• FastA - Compares a DNA query sequence to a DNA database, or a
protein query to a protein database, detecting the sequence type
automatically.
• FASTX - Compares a DNA query to a protein database. It may
introduce gaps only between codons.
• FASTY - Compares a DNA query to a protein database, optimizing
gap location, even within codons.
• TFASTA - Compares a protein query to a DNA database.
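The choice among these variants depends only on the query type and the database type, which can be written as a small lookup. The table below encodes just the four programs listed above; the function name and the `"dna"`/`"protein"` labels are assumptions made for this sketch.

```python
# (query type, database type) -> programs from the list above
PROGRAMS = {
    ("dna", "dna"): ["FASTA"],
    ("protein", "protein"): ["FASTA"],
    ("dna", "protein"): ["FASTX", "FASTY"],
    ("protein", "dna"): ["TFASTA"],
}


def choose_programs(query_type, database_type):
    """Return the FASTA variants suited to a query/database combination."""
    key = (query_type.lower(), database_type.lower())
    if key not in PROGRAMS:
        raise ValueError(f"unsupported combination: {key}")
    return PROGRAMS[key]


print(choose_programs("DNA", "protein"))
```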
