Lecture 5: Bioinformatic tools for DNA technologies
Dr. Naulikha Kituyi
Department of Biological Sciences
University of Embu
Biochemistry, 2020
Bioinformatics tools
SELF-TEST QUESTIONS
• Describe the variants of BLAST and their applications
• Discuss the relevance of sequence alignment
• Discuss the differences between homology and similarity
Bioinformatic tools are computer programs that analyze one or more
sequences. A wide array of such tools is available: some analyze
sequences to find protein domains (Pfam), some search databases of
millions of sequences for similar ones (BLAST), and some find
potential protein-coding regions (ORF-Finder).
BLAST
• BLAST (Basic Local Alignment Search Tool) is one of the most widely
used tools for obtaining information about a sequence. Searching a DNA
or protein sequence against a database for similar sequences is one of
the first things people do when they want quick information about a
sequence of interest. Such searches can suggest the likely function of
a gene. BLAST finds regions of similarity between the input (query)
sequence and the sequences in its databases.
• Sequence alignment is a way of arranging two or more sequences (DNA,
RNA, or amino acid/protein) to identify regions of similar character
patterns.
• Sequence similarity may result from functional, structural, or
evolutionary relationships between the sequences.
• The procedure searches for series of identical or similar characters
or patterns that occur in the same order in the sequences.
• Non-identical characters are aligned either as mismatches or
opposite a gap in the other sequence.
• An alignment can be made between a known sequence and an unknown
sequence, or between two unknown sequences.
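The bullets above describe alignment columns as matches, mismatches, or gaps. A minimal Python sketch of that bookkeeping follows; the function name `classify_columns`, the use of `-` as the gap character, and the toy sequences are assumptions made for illustration, not part of the lecture.

```python
def classify_columns(aligned_a: str, aligned_b: str) -> dict:
    """Count match, mismatch and gap columns in a pairwise alignment.

    Both strings must already be aligned: same length, '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            counts["gap"] += 1        # residue opposite a gap
        elif a == b:
            counts["match"] += 1      # identical characters
        else:
            counts["mismatch"] += 1   # non-identical characters
    return counts


# Hypothetical toy alignment, not taken from the lecture.
print(classify_columns("GATTACA-T", "GACTA-AGT"))
```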
Why Sequence Alignment? (Uses)
• Useful in DNA and Protein sequences for:
– Discovering functional information
– Predicting molecular structure
– Discovering evolutionary relationships
• Sequences that are very much alike probably have:
– The same or a similar function
– Similar secondary and 3-D structure (if proteins)
– A shared ancestral sequence (though not always)
• Sequence alignment enables the following:
– Annotation of new sequences
– Modeling of protein structures
– Phylogenetic analysis
The BLAST Algorithm
BLAST - Basic Local Alignment Search Tool
BLAST programs use a heuristic search algorithm. They were designed
for fast database searching with minimal sacrifice of sensitivity for
distantly related sequences. The programs search databases stored in a
special compressed format.
Variants of BLAST
BLASTN - Compares a DNA query to a DNA database. Searches both strands
automatically. It is optimized for speed rather than sensitivity.
BLASTP - Compares a protein query to a protein database.
BLASTX - Compares a DNA query to a protein database by translating the
query sequence in all 6 possible reading frames (3 from each strand of
the DNA) and comparing each translation against the database.
TBLASTN - Compares a protein query to a DNA database translated in all
6 possible reading frames.
TBLASTX - Compares the 6-frame translations of a DNA query to the
6-frame translations of a DNA database.
BLAST2 - Also called advanced BLAST. It can perform gapped
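The translated variants (BLASTX, TBLASTN, TBLASTX) rely on translating DNA in six reading frames. The sketch below shows what that means, assuming the standard genetic code; the helper names (`six_frames`, `translate`, `reverse_complement`) and the frame labels `+1` to `-3` are illustrative choices, and this is not how BLAST itself is implemented.

```python
# Standard genetic code, written in the compact TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X', '*' is a stop."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> dict:
    """Translate a DNA string in 3 forward and 3 reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Hypothetical 9-base example, not taken from the lecture.
for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```

Each of the six translations would then be compared against a protein database (BLASTX) or against a protein query (TBLASTN).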
General View of How the BLAST Program Works
BLAST compares the query to each sequence in the database, using
heuristic rules to speed up the pairwise comparison. It creates an
abstraction of each sequence by listing exact and similar words. This
is done in advance for each sequence in the database and on the run for
a given query. BLAST finds similar words between the query and each
database sequence, then extends these word matches to obtain
high-scoring segment pairs (HSPs). It also calculates statistics
analytically, as FastA does.
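The word-matching and extension idea described above can be sketched in a few lines of Python. This is a deliberately simplified illustration: it uses exact words only, ungapped extension, and a simple drop-off rule, and it omits the similar-word lists, gapped extension, and statistics of the real program. The function names, the word size `k=4`, and the scoring values are all assumptions.

```python
def find_seeds(query: str, subject: str, k: int = 4) -> list:
    """Return (query_pos, subject_pos) pairs where an exact k-letter word is shared."""
    index = {}
    for i in range(len(query) - k + 1):
        index.setdefault(query[i:i + k], []).append(i)
    seeds = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, qi, sj, k=4, match=1, mismatch=-1, x_drop=2):
    """Extend a word hit without gaps in both directions.

    Extension in a direction stops once the running score falls x_drop
    below the best score seen. Returns (score, query_start, query_end).
    """
    def walk(q, s, step):
        score = best = best_len = length = 0
        while 0 <= q < len(query) and 0 <= s < len(subject):
            score += match if query[q] == subject[s] else mismatch
            length += 1
            if score > best:
                best, best_len = score, length
            elif best - score >= x_drop:
                break
            q += step
            s += step
        return best, best_len

    left_score, left_len = walk(qi - 1, sj - 1, -1)
    right_score, right_len = walk(qi + k, sj + k, 1)
    total = k * match + left_score + right_score
    return total, qi - left_len, qi + k + right_len


# Hypothetical toy sequences, not taken from the lecture.
query, subject = "AAACGTCC", "TTACGTCA"
for qi, sj in find_seeds(query, subject):
    print((qi, sj), extend_seed(query, subject, qi, sj))
```

Building the word index once and only extending around word hits is what lets a heuristic search avoid a full pairwise comparison against every database sequence.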
Using Blast For Sequence Alignment
Please attempt the following:
Go to www.ncbi.nlm.nih.gov and BLAST the Dengue virus sequence that we
examined in Lecture 2 (NC_001477) using blastn. Copy and paste the
sequence accession number into the search box and click Search to
confirm the identity of the sequence, then click the blastn option to
see the search results listing similar sequences. To see the alignment
and the percent identity for a hit, click on any of the sequences, like
the one highlighted above, and go to the alignment.
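The same search can also be requested programmatically. The sketch below only builds a submission URL and does not send it; the parameter names (`CMD`, `PROGRAM`, `DATABASE`, `QUERY`) follow NCBI's BLAST URL API as commonly documented, and the database name `nt` is an assumption, so check both against the current NCBI documentation before relying on them.

```python
from urllib.parse import urlencode

# Endpoint of the NCBI BLAST web service.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def blast_submit_url(accession: str, program: str = "blastn", database: str = "nt") -> str:
    """Build (but do not send) a BLAST submission URL for one accession.

    Parameter names are assumed from the NCBI BLAST URL API.
    """
    params = {
        "CMD": "Put",          # submit a new search
        "PROGRAM": program,    # which BLAST variant to run
        "DATABASE": database,  # assumed database name
        "QUERY": accession,    # accession number or raw sequence
    }
    return f"{BLAST_URL}?{urlencode(params)}"


print(blast_submit_url("NC_001477"))
```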
Terms used in Sequence Comparisons
Homologous - Two sequences that are related by descent from a common
ancestor are termed homologous to each other. Homologs can be either
orthologs or paralogs. Homologous proteins from two different
organisms, which typically have similar functions, are termed
orthologs, whereas homologous proteins within one organism, which
arise by gene duplication and often have different functions, are
termed paralogs.
Identity and similarity - Identity is the ratio of identical amino acid
residues to the total number of amino acids over the entire length of
the sequence, whereas similarity is the ratio of similar amino acids to
that total. The extent of similarity between two amino acids is
calculated with a similarity matrix.
• Identity & Similarity:
• Sequence identity: exactly the same amino acid or nucleotide in the
same position
• Sequence similarity: counts identities plus substitutions (amino
acid residues) with similar chemical properties
• Similarity is a quantifiable property: two sequences are similar if
the order of their characters is recognizably the same and they can be
aligned
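The two ratios defined above can be computed directly from an aligned pair. In the sketch below, the grouping of amino acids in `SIMILAR_GROUPS` is an assumption chosen for illustration (real tools score similarity with a substitution matrix), and the denominator is taken to be the number of alignment columns, which is one of several conventions.

```python
# Assumed groups of chemically similar amino acids, for illustration only.
SIMILAR_GROUPS = [set("AVLIM"), set("FYW"), set("KRH"), set("DE"), set("STNQ")]


def are_similar(a: str, b: str) -> bool:
    """True if two residues are identical or fall in the same assumed group."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple:
    """Return (% identity, % similarity) over all alignment columns.

    Columns containing a gap ('-') count as neither identical nor similar.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_a)
    identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
        if are_similar(a, b):
            similar += 1
    return 100 * identical / columns, 100 * similar / columns


# Hypothetical toy peptides, not taken from the lecture.
print(identity_and_similarity("MKVLAE", "MRVIAD"))
```

Because every identical position is also counted as similar, the similarity percentage can never be lower than the identity percentage.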
Global and Local Alignment
The alignment of two sequences can be global or local (Figure 4). In
global alignment, the complete length of one protein sequence is
compared with the other, whereas in local alignment only part of each
sequence is compared. Global alignment is used to classify proteins
into different classes, whereas local alignment is used to identify a
motif or domain.
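The difference between the two modes shows up clearly in the classic dynamic-programming formulations, Needleman-Wunsch for global and Smith-Waterman for local alignment. The lecture does not name these algorithms; the sketch below brings them in as standard examples, with assumed scores (match +1, mismatch -1, gap -2) and a linear gap penalty.

```python
def alignment_score(a: str, b: str, local: bool = False,
                    match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Best alignment score by dynamic programming.

    local=False: global alignment (Needleman-Wunsch), end-to-end.
    local=True:  local alignment (Smith-Waterman), best-scoring sub-region.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # In a global alignment, leading gaps are penalized.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diagonal, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


# Hypothetical toy sequences: a short motif inside a longer sequence.
print(alignment_score("ACGT", "TTACGTTT"))              # global score
print(alignment_score("ACGT", "TTACGTTT", local=True))  # local score
```

With these toy sequences the global score is dragged down by the gaps needed to span the longer sequence, while the local score reflects only the shared motif, which is why local alignment suits motif and domain searches.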
Sequence Alignment Example- Homology
               10        20        30        40        50        60
HUMAN  MNPLLILTFVAAALAAPFDDDDKIVGGYNCEENSVPYQVSLNSGYHFCGGSLINEQWVVS
       :. ::::..:.::.: :..:::::::::.: :.:::::::::::::::::::::.:::::
RAT    MSALLILALVGAAVAFPLEDDDKIVGGYTCPEHSVPYQVSLNSGYHFCGGSLINDQWVVS
               10        20        30        40        50        60

               70        80        90       100       110       120
HUMAN  AGHCYKSRIQVRLGEHNIEVLEGNEQFINAAKIIRHPQYDRKTLNNDIMLIKLSSRAVIN
       :.::::::::::::::::.::::.::::::::::.::.:.  ::::::::::::: . .:
RAT    AAHCYKSRIQVRLGEHNINVLEGDEQFINAAKIIKHPNYSSWTLNNDIMLIKLSSPVKLN
               70        80        90       100       110       120

              130       140       150       160       170       180
HUMAN  ARVSTISLPTAPPATGTKCLISGWGNTASSGADYPDELQCLDAPVLSQAKCEASYPGKIT
       :::. ..::.:   .::.::::::::: :.:.. :: :::.:::::::: :::.:::.::
RAT    ARVAPVALPSACAPAGTQCLISGWGNTLSNGVNNPDLLQCVDAPVLSQADCEAAYPGEIT
              130       140       150       160       170       180

              190       200       210       220       230       240
HUMAN  SNMFCVGFLEGGKDSCQGDSGGPVVCNGQLQGVVSWGDGCAQKNKPGVYTKVYNYVKWIK
       :.:.::::::::::::::::::::::::::::.:::: :::  ..::::::: :.: ::.
RAT    SSMICVGFLEGGKDSCQGDSGGPVVCNGQLQGIVSWGYGCALPDNPGVYTKVCNFVGWIQ
              190       200       210       220       230       240

HUMAN  NTIAAN
       .:::::
RAT    DTIAAN
Human (247 aa) vs rat (246 aa) trypsin: 76.4% identity (91.9% similarity) in a 246 aa overlap (1-246:1-246), E(1) < 2e-86
The similarity is statistically significant (greater than expected by chance), so the sequences can be considered homologous.
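As a worked check of the reported identity, the two aligned sequences can be compared position by position. The sketch below copies the 246 aligned residues of each sequence from the alignment above (there are no gaps in this overlap) and counts identical positions; the variable names are illustrative.

```python
# The 246 aligned residues of each sequence, copied from the alignment above.
HUMAN = (
    "MNPLLILTFVAAALAAPFDDDDKIVGGYNCEENSVPYQVSLNSGYHFCGGSLINEQWVVS"
    "AGHCYKSRIQVRLGEHNIEVLEGNEQFINAAKIIRHPQYDRKTLNNDIMLIKLSSRAVIN"
    "ARVSTISLPTAPPATGTKCLISGWGNTASSGADYPDELQCLDAPVLSQAKCEASYPGKIT"
    "SNMFCVGFLEGGKDSCQGDSGGPVVCNGQLQGVVSWGDGCAQKNKPGVYTKVYNYVKWIK"
    "NTIAAN"
)
RAT = (
    "MSALLILALVGAAVAFPLEDDDKIVGGYTCPEHSVPYQVSLNSGYHFCGGSLINDQWVVS"
    "AAHCYKSRIQVRLGEHNINVLEGDEQFINAAKIIKHPNYSSWTLNNDIMLIKLSSPVKLN"
    "ARVAPVALPSACAPAGTQCLISGWGNTLSNGVNNPDLLQCVDAPVLSQADCEAAYPGEIT"
    "SSMICVGFLEGGKDSCQGDSGGPVVCNGQLQGIVSWGYGCALPDNPGVYTKVCNFVGWIQ"
    "DTIAAN"
)

# Count positions where both sequences carry the same residue.
identical = sum(h == r for h, r in zip(HUMAN, RAT))
overlap = len(RAT)
percent_identity = round(100 * identical / overlap, 1)

print(f"{identical} identical positions in a {overlap} aa overlap "
      f"= {percent_identity}% identity")
```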
Detection of CpG Islands
Detection of methylated CpG islands in easily accessible biological materials such as
serum has the potential to be useful for the early diagnosis of cancer. Most currently used
methods for detecting methylated CpG islands are based on sodium bisulfite conversion
of genomic DNA followed by PCR.
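Bisulfite-based detection works because the treatment converts unmethylated cytosine to uracil, which is read as thymine after PCR, while 5-methylcytosine is left unchanged. A minimal single-strand simulation is sketched below; the function name and the 0-based `methylated_positions` argument are assumptions made for illustration.

```python
def bisulfite_convert(dna: str, methylated_positions=frozenset()) -> str:
    """Simulate bisulfite treatment followed by PCR on one DNA strand.

    Unmethylated C is read as T; C at a methylated (0-based) position stays C.
    """
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(dna.upper())
    )


# Hypothetical toy sequence with CpG sites at positions 1-2 and 4-5.
sequence = "ACGTCGAC"
print(bisulfite_convert(sequence))                            # no methylation
print(bisulfite_convert(sequence, methylated_positions={1}))  # first CpG methylated
```

Comparing the converted read with the original sequence then reveals which cytosines were protected, and therefore methylated.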
CpG Islands
CpG sites (or CG sites) are regions of DNA where a cytosine nucleotide is followed by
a guanine nucleotide in the linear sequence of bases along its 5' → 3' direction. CpG sites
occur with high frequency in genomic regions called CpG islands (or CG islands).
Cytosines in CpG dinucleotides can be methylated to form 5-methylcytosines. Enzymes
that add a methyl group are called DNA methyltransferases. In mammals, 70% to 80% of
CpG cytosines are methylated.[1] Methylating the cytosine within a gene can change its
expression; this mechanism is part of epigenetics, a field within the broader study of
gene regulation.
In humans, about 70% of promoters located near the transcription start site of a gene
(proximal promoters) contain a CpG island.
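Computationally, candidate CpG islands are usually screened with two simple statistics: GC content and the ratio of observed to expected CpG dinucleotides. The sketch below computes both for a single sequence window. The thresholds mentioned afterwards come from commonly cited criteria rather than from this lecture, and the function name is an assumption.

```python
def cpg_stats(dna: str) -> tuple:
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string.

    Expected CpG count is (number of C * number of G) / length.
    """
    dna = dna.upper()
    length = len(dna)
    c_count, g_count = dna.count("C"), dna.count("G")
    cpg_count = dna.count("CG")  # CpG dinucleotides, read 5' to 3'
    gc_fraction = (c_count + g_count) / length if length else 0.0
    if c_count and g_count:
        obs_exp = cpg_count * length / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_fraction, obs_exp


# Hypothetical 10-base toy sequence; real windows are hundreds of bases long.
gc, ratio = cpg_stats("ACGCGTCGAT")
print(f"GC fraction = {gc:.2f}, observed/expected CpG = {ratio:.2f}")
```

A widely used rule of thumb treats a window of at least 200 bp with GC content above 50% and an observed/expected ratio above 0.6 as a candidate CpG island.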
