Genome Annotation
      Delivered by
  Muhammad Tajammal Khan
      M.Phil (Botany)
      11-arid-3759
Definition: Genome annotation is the process of
interpreting raw sequence data into useful biological
information. Annotations describe the genome and
transform raw genome sequences into biological
information by integrating computational analyses,
other biological data and biological expertise.
[Figure: unannotated DNA (5' to 3') compared with annotated DNA.
Legend: exon (protein coding), intron, intergenic sequence.]
Annotation may be
Structural annotation
ORFs and their localisation (http://www.ncbi.nlm.nih.gov/gorf/gorf.html)
gene structure
coding regions
location of regulatory motifs


Functional annotation
biochemical function
biological function
regulation and interactions involved
expression
What are we looking to annotate?

   CDS
   mRNA
   Promoter and Poly-A Signal
   Pseudogenes
   ncRNA
Tools
   ORF detectors
    ◦ NCBI: http://www.ncbi.nih.gov/gorf/gorf.html
   Promoter predictors
    ◦ CSHL: http://rulai.cshl.org/software/index1.htm
    ◦ BDGP: fruitfly.org/seq_tools/promoter.html
    ◦ ICG: TATA-Box predictor
   PolyA signal predictors
    ◦ CSHL: argon.cshl.org/tabaska/polyadq_form.html
   Splice site predictors
    ◦ BDGP:
      http://www.fruitfly.org/seq_tools/splice.html
   Start-/stop-codon identifiers
    ◦ DNALC: Translator/ORF-Finder
    ◦ BCM: Searchlauncher
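
What an ORF detector or start-/stop-codon identifier does can be sketched in a few lines of Python. This is a minimal illustration, not the method used by any of the tools listed above: it assumes an uppercase A/C/G/T sequence, treats ATG as the only start codon, and the function names and the `min_codons` parameter are hypothetical.

```python
# Minimal ORF scan: ATG ... stop codon, in all six reading frames.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Yield (strand, frame, start, end, orf) for each ATG-to-stop ORF.

    start and end are 0-based offsets on the scanned strand; end is
    exclusive and includes the stop codon. min_codons counts the start
    and stop codons as well.
    """
    seq = seq.upper()
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            # Walk the strand codon by codon in this frame.
            for i in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3, strand_seq[start:i + 3]
                    start = None
```

Calling `list(find_orfs("CCATGAAATGATAGGC"))` reports a single short ORF on the forward strand. Real ORF finders add alternative start codons, genetic-code tables and much larger minimum lengths.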
Overview of genome analysis
Two approaches to genome sequencing
Whole Genome Shotgun
An approach used to decode an organism's genome
by shredding it into smaller fragments of DNA which
can be sequenced individually. The sequences of these
fragments are then ordered, based on overlaps between the
fragment sequences, and finally reassembled into the complete
sequence. The 'whole genome shotgun' (WGS) method is
applied to the entire genome all at once, while the
'hierarchical shotgun' method is applied to large,
overlapping DNA fragments of known location in
the genome.
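
The idea of ordering fragments by their overlaps can be illustrated with a toy greedy merge. This is a sketch under strong assumptions (error-free fragments, no repeats, a single strand); production assemblers use far more elaborate graph-based methods, and the function names and `min_len` parameter here are hypothetical.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when the best overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            break  # no overlaps left: the remaining pieces stay separate
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags
```

For example, `greedy_assemble(["ATGCGTAC", "GTACGTTA", "GTTAGC"])` rebuilds the single sequence the three fragments were cut from.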

Hierarchical shotgun method
Build a map of large, overlapping clones of known location on each chromosome,
then sequence and assemble each clone. A contig is a set of overlapping clones
or sequences from which a continuous sequence can be obtained.

A contig map thus shows the locations of those regions of a chromosome where
contiguous DNA segments overlap. Contig maps are important because they make it
possible to study a complete, and often large, segment of the genome by
examining a series of overlapping clones, which together provide an unbroken
succession of information about that region.
Hierarchical vs. Whole Genome
Sequencing technology in 12 steps
1. Prepare genomic DNA
   Randomly fragment genomic DNA and ligate adapters to both ends of the fragments.

2. Attach DNA to surface
   Bind single-stranded fragments randomly to the inside surface of the flow cell channels
   (figure labels: adapter, DNA fragment, dense lawn of primers).

3. Bridge amplification
   Add unlabeled nucleotides and enzyme to initiate solid-phase bridge amplification.

4. Fragments become double stranded
   The enzyme incorporates nucleotides to build double-stranded bridges on the
   solid-phase substrate (figure labels: attached terminus, free terminus).

5. Denature the double-stranded molecules
   Denaturation leaves single-stranded templates anchored to the substrate.

6. Complete amplification
   Several million dense clusters of double-stranded DNA are generated in
   each channel of the flow cell.

7. Determine first base
   The first sequencing cycle begins by adding four labeled reversible terminators,
   primers, and DNA polymerase.

8. Image first base
   After laser excitation, the emitted fluorescence from each cluster is captured
   and the first base is identified.

9. Determine second base
   The next cycle repeats the incorporation of four labeled reversible terminators,
   primers, and DNA polymerase.

10. Image second chemistry cycle
    After laser excitation the image is captured as before, and the identity of
    the second base is recorded.

11. Sequencing over multiple chemistry cycles
    The sequencing cycles are repeated to determine the sequence of bases in a
    fragment, one base at a time.

12. Align data
    The data are aligned and compared to a reference sequence, and sequencing
    differences are identified.
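
The final "Align data" step can be illustrated with a deliberately simple sketch: each read is placed at the gap-free position with the fewest mismatches, and any mismatching bases are reported as differences from the reference. Real read aligners use indexed, gapped alignment and base qualities; the function names and the `max_mismatches` threshold below are assumptions made for illustration only.

```python
def best_ungapped_hit(read, reference):
    """Return (position, mismatches) of the best gap-free placement of read.

    Ties go to the leftmost position; (None, len(read) + 1) is returned
    when the read is longer than the reference.
    """
    best_pos, best_mm = None, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mm = sum(1 for r, w in zip(read, window) if r != w)
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


def call_differences(reads, reference, max_mismatches=1):
    """Collect (ref_position, ref_base, read_base) from well-placed reads."""
    differences = set()
    for read in reads:
        pos, mm = best_ungapped_hit(read, reference)
        if pos is None or mm > max_mismatches:
            continue  # the read does not place confidently; skip it
        for offset, base in enumerate(read):
            if base != reference[pos + offset]:
                differences.add((pos + offset, reference[pos + offset], base))
    return sorted(differences)
```

With the reference `"ACGTTGCAAGGCTA"` and the reads `["GTTGCA", "TGCTAG", "AAGGCT"]`, only the second read disagrees with the reference, at 0-based position 7 (A in the reference, T in the read).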
The generic structure of an automatic genome annotation pipeline and
delivery system
The Annotation Process




[Figure labels: DNA sequence, analysis software, annotator, useful information.]
Annotation Process


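DNA sequence

Tools run on the DNA sequence: RepeatMasker, Blastn, gene finders, Blastx, Halfwise, tRNA scan

Features identified: repeats, promoters, rRNA, pseudogenes, genes, tRNA

Tools run on the predicted genes: Fasta, BlastP, Pfam, Prosite, Psort, SignalP, TMHMM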
Genome Browsers




   Generic Genome Browser (CSHL): www.wormbase.org/db/seq/gbrowse
   NCBI Map Viewer: www.ncbi.nlm.nih.gov/mapview/
   Ensembl Genome Browser: www.ensembl.org/
   UCSC Genome Browser: genome.ucsc.edu/cgi-bin/hgGateway?org=human
   Apollo Genome Browser: www.bdgp.org/annot/apollo/
What is gene prediction?

Detecting meaningful signals in uncharacterised DNA sequences.
Gaining knowledge of the interesting information in DNA.




 GATCGGTCGAGCGTAAGCTAGCTAG
 ATCGATGATCGATCGGCCATATATC
 ACTAGAGCTAGAATCGATAATCGAT
 CGATATAGCTATAGCTATAGCCTAT



Gene prediction is 'recognising protein-coding regions in genomic sequence'
Basic Gene Prediction Flow Chart

Obtain new genomic DNA sequence

1. Translate in all six reading frames and compare to protein
sequence databases
2. Perform a database similarity search against an expressed sequence tag
(EST) database of the same organism, or cDNA sequences if available

Use gene prediction program to locate genes

Analyze regulatory sequences in the gene
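
Step 1 of the flow chart, translating in all six reading frames, can be sketched as follows. This assumes the standard genetic code and an A/C/G/T-only input; the frame labels (`+1` to `-3`) and the function names are hypothetical, and the database comparison itself is not shown.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame label to the translated protein string."""
    seq = seq.upper()
    rev = seq.translate(COMPLEMENT)[::-1]  # reverse complement
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames
```

For `"ATGGCCTAA"` the `+1` frame reads `MA*` (methionine, alanine, stop). Each of the six protein strings could then be compared against a protein database, which is essentially what a translated search such as Blastx does.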
Approaches to gene prediction

Ab Initio Gene Finding
        http://exon.gatech.edu/GenMark/eukhmm.cgi
        http://sun1.softberry.com/berry.phtml=fgenesh&group=programs&subgroup=gfind
Repeat Masking
      http://www.repeatmasker.org
      http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker
Transcript based prediction
       http://plantta.tigr.org
       http://harvest.ucr.edu/
Gene function CDNA
       http://au.expasy.org/sprot/
       http://www.pir.uniprot.org/
Gene Ontologies
       http://www.geneontology.org
Visualization Tools
         http://www.gmod.org/?q=node/4
         http://www.gmod.org/?q=node/71
Prediction of Secondary Structure and Folding Classes
• nnpredict      http://www.cmpharm.ucsf.edu/~nomi/nnpredict.html
• PredictProtein    http://www.embl-heidelberg.de/predictprotein/
• SOPMA              http://pbil.ibcp.fr/
• Jpred              http://jura.ebi.ac.uk:8888/
• PSIPRED            http://insulin.brunel.ac.uk/psipred
• PREDATOR           http://www.embl-heidelberg.de/predator.html
Prediction of Specialized Structures or Features
• COILS          http://www.ch.embnet.org/software/COILSform.html
• MacStripe          www.york.ac.uk/depts/biol/units/coils/mstr2.html
• PHDtopology        http://www.embl-heidelberg.de/predictprotein
• SignalP            http://www.cbs.dtu.dk/services/SignalP/
• TMpred             http://www.isrec.isb-sib.ch/ftp-server/tmpred/www/TMPREDform.html
Structure Prediction
• DALI               http://www2.ebi.ac.uk/dali/
• Bryant-Lawrence ftp://ncbi.nlm.nih.gov/pub/pkb/
• FSSP                http://www2.ebi.ac.uk/dali/fssp/
• UCLA-DOE           http://fold.doe-mbi.ucla.edu/Home
• SWISS-MODEL         http://www.expasy.ch/swissmod/SWISS-MODEL
Search using the gene name
Click the name for transcript info
Click the contig to see detailed information
Click here to export the contig
Select output format, then click Continue
Save or copy the data for further analysis
Genomic region
Coding sequence
Click here for pairwise BLAST
Paste in genomic sequence
Paste in CDS sequence
Click 'Align' to proceed
This is a protein BLAST (BLASTP)
Some concluding remarks

• Trust but verify.
• Beware of gene prediction tools!
• Always use more than one gene prediction tool and more than one
  genome when possible.
• This is an active area of bioinformatics research, so be mindful of
  the new literature in this field.
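
One way to act on the advice to use more than one gene prediction tool is to keep only the predictions that two tools agree on. The sketch below is illustrative: it assumes each tool's output has already been reduced to (start, end) intervals with an exclusive end, and the function name and the `min_fraction` threshold are hypothetical.

```python
def overlapping_predictions(calls_a, calls_b, min_fraction=0.5):
    """Pair up intervals from two predictors that overlap substantially.

    A pair is kept when the shared length is at least min_fraction of
    the shorter of the two intervals.
    """
    supported = []
    for a_start, a_end in calls_a:
        for b_start, b_end in calls_b:
            shared = min(a_end, b_end) - max(a_start, b_start)
            shorter = min(a_end - a_start, b_end - b_start)
            if shorter > 0 and shared >= min_fraction * shorter:
                supported.append(((a_start, a_end), (b_start, b_end)))
    return supported
```

Predictions that appear in only one tool's output are not necessarily wrong, but they are the ones most worth checking against transcript or protein evidence.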
Concept of Vouching. B.Com(Hons) /B.Compdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 

Genome annotation

  • 1. Genome Annotation Delivered by Muhammad Tajammal Khan M.Phil (Botany) 11-arid-3759
  • 2. Definition: Genome annotation is the process of interpreting raw sequence data into useful biological information. Annotations describe the genome and transform raw genome sequences into biological information by integrating computational analyses, other biological data, and biological expertise.
  • 3. [Figure: unannotated DNA (5' to 3') compared with annotated DNA. Legend: exon (protein coding), intron, intergenic sequence.]
  • 4. Annotation may be structural or functional. Structural annotation: ORFs and their localisation (http://www.ncbi.nlm.nih.gov/gorf/gorf.html); gene structure; coding regions; location of regulatory motifs. Functional annotation: biochemical function; biological function; regulation and interactions involved; expression.
  • 5. Things we are looking to annotate: CDS; mRNA; promoter and poly-A signal; pseudogenes; ncRNA.
  • 6. Tools. ORF detectors: NCBI (http://www.ncbi.nih.gov/gorf/gorf.html). Promoter predictors: CSHL (http://rulai.cshl.org/software/index1.htm); BDGP (fruitfly.org/seq_tools/promoter.html); ICG (TATA-Box predictor). PolyA signal predictors: CSHL (argon.cshl.org/tabaska/polyadq_form.html). Splice site predictors: BDGP (http://www.fruitfly.org/seq_tools/splice.html). Start-/stop-codon identifiers: DNALC (Translator/ORF-Finder); BCM (Searchlauncher).
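To make the idea of an ORF detector concrete, here is a minimal Python sketch. It is not the algorithm used by any of the tools listed; the function name `find_orfs` and the `min_codons` parameter are hypothetical, and only the forward strand and the standard start and stop codons are considered.

```python
# Minimal ORF scan on the forward strand (illustrative only).
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    Coordinates are 0-based and end-exclusive, and the end includes the stop
    codon. min_codons counts codons from ATG up to, but not including, the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

print(find_orfs("CCATGAAATTTGGGTAACC"))  # [(2, 17, 2)]
```

A fuller detector would also scan the reverse complement and apply a much larger minimum length, since short ORFs occur by chance.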
  • 8. Two approaches to genome sequencing: Whole Genome Shotgun. An approach used to decode an organism's genome by shredding it into smaller fragments of DNA that can be sequenced individually. The sequences of these fragments are then ordered, based on overlaps between them, and finally reassembled into the complete sequence. The 'whole genome shotgun' (WGS) method is applied to the entire genome all at once, while the 'hierarchical shotgun' method is applied to large, overlapping DNA fragments of known location in the genome.
  • 9. Two approaches to genome sequencing: Hierarchical shotgun method. Assemble contigs from various chromosomes, then sequence and assemble them. A contig is a set of overlapping clones or sequences from which a sequence can be obtained. A contig is thus a chromosome map showing the locations of those regions of a chromosome where contiguous DNA segments overlap. Contig maps are important because they make it possible to study a complete, and often large, segment of the genome by examining a series of overlapping clones, which together provide an unbroken succession of information about that region.
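The ordering of fragments by their overlaps can be sketched as a greedy merge. This is a toy illustration under strong assumptions (error-free reads, a single strand, no repeats); the names `overlap` and `greedy_assemble` are hypothetical, and real assemblers use far more elaborate methods.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(fragments)
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no overlaps left: the remaining pieces are separate contigs
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)] + [merged]
    return frags

reads = ["ATGGCGT", "GCGTACG", "TACGTTA"]
print(greedy_assemble(reads))  # ['ATGGCGTACGTTA']
```

When no further overlaps exist, the function returns several pieces, which corresponds to an assembly left as multiple contigs.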
  • 12. Steps: 1. Prepare genomic DNA. 2. Attach DNA to surface. 3. Bridge amplification. 4. Fragments become double stranded. 5. Denature the double-stranded molecules. 6. Complete amplification. Step 1: Randomly fragment genomic DNA and ligate adapters to both ends of the fragments. [Figure labels: DNA, adapters]
  • 13. Step 2: Bind single-stranded fragments randomly to the inside surface of the flow cell channels. [Figure labels: adapter, DNA fragment, dense lawn of primers]
  • 14. Step 3: Add unlabeled nucleotides and enzyme to initiate solid-phase bridge amplification.
  • 15. Step 4: The enzyme incorporates nucleotides to build double-stranded bridges on the solid-phase substrate. [Figure labels: attached terminus, free terminus]
  • 16. Step 5: Denaturation leaves single-stranded templates anchored to the substrate. [Figure label: attached]
  • 17. Step 6: Several million dense clusters of double-stranded DNA are generated in each channel of the flow cell. [Figure label: clusters]
  • 18. Steps: 7. Determine first base. 8. Image first base. 9. Determine second base. 10. Image second chemistry cycle. 11. Sequencing over multiple chemistry cycles. 12. Align data. Step 7: The first sequencing cycle begins by adding four labeled reversible terminators, primers, and DNA polymerase. [Figure label: laser]
  • 19. Step 8: After laser excitation, the emitted fluorescence from each cluster is captured and the first base is identified.
  • 20. Step 9: The next cycle repeats the incorporation of four labeled reversible terminators, primers, and DNA polymerase. [Figure label: laser]
  • 21. Step 10: After laser excitation, the image is captured as before, and the identity of the second base is recorded.
  • 22. Step 11: The sequencing cycles are repeated to determine the sequence of bases in a fragment, one base at a time.
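The one-base-per-cycle readout can be illustrated with a deliberately simplified Python sketch: for each cycle, the brightest of four fluorescence channels is taken as the base. The channel order, the intensity values, and the function name `call_bases` are assumptions made for illustration; production base callers also correct for effects such as signal crosstalk and assign quality scores.

```python
CHANNELS = "ACGT"  # assumed order of the four fluorescent channels

def call_bases(intensities):
    """Call one base per cycle by picking the brightest of four channels."""
    read = []
    for cycle in intensities:
        brightest = max(range(4), key=lambda ch: cycle[ch])
        read.append(CHANNELS[brightest])
    return "".join(read)

# Hypothetical intensities for one cluster over four cycles (A, C, G, T channels).
cluster = [
    (0.9, 0.1, 0.0, 0.1),
    (0.1, 0.0, 0.8, 0.2),
    (0.0, 0.1, 0.1, 0.7),
    (0.2, 0.9, 0.1, 0.0),
]
print(call_bases(cluster))  # AGTC
```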
  • 23. Step 12: The data are aligned and compared to a reference, and sequencing differences are identified. [Figure label: reference sequence]
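As a rough sketch of comparing a read to a reference, the Python below places a read at the gap-free position with the fewest mismatches and then lists the differences. The function names `find_mismatches` and `best_offset` are hypothetical, the read is a made-up example, and real aligners handle gaps, both strands, and base qualities, none of which is modelled here.

```python
def find_mismatches(reference, read, offset):
    """List (position, ref_base, read_base) where an aligned read differs from the reference.

    offset is the 0-based reference position where the read starts; the
    alignment is assumed to be gap-free and to lie fully inside the reference.
    """
    diffs = []
    for i, base in enumerate(read):
        ref_base = reference[offset + i]
        if base != ref_base:
            diffs.append((offset + i, ref_base, base))
    return diffs

def best_offset(reference, read):
    """Return the gap-free placement with the fewest mismatches (first one on ties)."""
    return min(range(len(reference) - len(read) + 1),
               key=lambda off: len(find_mismatches(reference, read, off)))

reference = "GATCGGTCGAGCGTAAGC"
read = "GTCGTGCG"  # hypothetical read with one difference from the reference
offset = best_offset(reference, read)
print(offset, find_mismatches(reference, read, offset))  # 5 [(9, 'A', 'T')]
```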
  • 24. The generic structure of an automatic genome annotation pipeline and delivery system
  • 25. The Annotation Process. [Diagram labels: DNA sequence, analysis software, annotator, useful information]
  • 26. Annotation Process. [Diagram labels] Input: DNA sequence. Sequence-level tools: RepeatMasker, Blastn, gene finders, Blastx, Halfwise, tRNA scan. Features: repeats, promoters, rRNA, pseudogenes, tRNA, genes. Protein-level tools: Fasta, BlastP, Pfam, Prosite, Psort, SignalP, TMHMM.
  • 27. Genome Browsers. Generic Genome Browser (CSHL): www.wormbase.org/db/seq/gbrowse; NCBI Map Viewer: www.ncbi.nlm.nih.gov/mapview/; Ensembl Genome Browser: www.ensembl.org/; UCSC Genome Browser: genome.ucsc.edu/cgi-bin/hgGateway?org=human; Apollo Genome Browser: www.bdgp.org/annot/apollo/
  • 28. What is gene prediction? Detecting meaningful signals in uncharacterised DNA sequences. It draws on knowledge of the interesting information in DNA. Example sequence: GATCGGTCGAGCGTAAGCTAGCTAG ATCGATGATCGATCGGCCATATATC ACTAGAGCTAGAATCGATAATCGAT CGATATAGCTATAGCTATAGCCTAT. Gene prediction is ‘recognising protein-coding regions in genomic sequence’.
  • 29. Basic Gene Prediction Flow Chart: Obtain new genomic DNA sequence. Then: 1. Translate in all six reading frames and compare to protein sequence databases. 2. Perform a database similarity search of the expressed sequence tag (EST) database of the same organism, or of cDNA sequences if available. Next, use a gene prediction program to locate genes. Finally, analyze regulatory sequences in the gene.
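The first step of the flow chart, translation in all six reading frames, can be sketched in Python as follows. The sketch assumes the standard genetic code and uses hypothetical helper names (`translate`, `six_frame_translation`); the subsequent comparison against protein databases needs a search tool such as BLAST and is not shown.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq):
    """Translate complete codons; codons with unexpected letters become 'X'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frame_translation(seq):
    """Translate three frames on the forward strand and three on the reverse."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```

For the example sequence, frame +1 gives `MA*`, where `*` marks a stop codon.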
  • 30. Approaches to gene prediction. Ab initio gene finding (http://exon.gatech.edu/GenMark/eukhmm.cgi, http://sun1.softberry.com/berry.phtml=fgenesh&group=programs&subgroup=gfind); repeat masking (http://www.repeatmasker.org, http://www.repeatmasker.org/cgi-bin/WEBRepeatMasker); transcript-based prediction (http://plantta.tigr.org, http://harvest.ucr.edu/); gene function, cDNA (http://au.expasy.org/sprot/, http://www.pir.uniprot.org/); gene ontologies (http://www.geneontology.org)
  • 31. Visualization Tools: http://www.gmod.org/?q=node/4, http://www.gmod.org/?q=node/71
  • 32. Prediction of Secondary Structure and Folding Classes: nnpredict (http://www.cmpharm.ucsf.edu/_nomi/nnpredict.html); PredictProtein (http://www.embl-heidelberg.de/predictprotein/); SOPMA (http://pbil.ibcp.fr/); Jpred (http://jura.ebi.ac.uk:8888/); PSIPRED (http://insulin.brunel.ac.uk/psipred); PREDATOR (http://www.embl-heidelberg.de/predator.html). Prediction of Specialized Structures or Features: COILS (http://www.ch.embnet.org/software/COILSform.html); MacStripe (www.york.ac.uk/depts/biol/units/coils/mstr2.html); PHDtopology (http://www.embl-heidelberg.de/predictprotein); SignalP (http://www.cbs.dtu.dk/services/SignalP/); TMpred (http://www.isrec.isb-sib.ch/ftp-erver/tmpred www/TMPREDform.html). Structure Prediction: DALI (http://www2.ebi.ac.uk/dali/); Bryant-Lawrence (ftp://ncbi.nlm.nih.gov/pub/pkb/); FSSP (http://www2.ebi.ac.uk/dali/fssp/); UCLA-DOE (http://fold.doe-mbi.ucla.edu/Home); SWISS-MODEL (http://www.expasy.ch/swissmod/SWISS-MODEL)
  • 33. Search using the gene name
  • 34. Click the name for transcript info
  • 35. Click the contig to see detailed information
  • 36. Click the contig to see detailed information. Click here to export the contig.
  • 37.
  • 38. Select output format, then click Continue
  • 39. Save or copy the data for further analysis
  • 42. Click here for pairwise BLAST
  • 43. Paste in genomic sequence. Paste in CDS sequence. Click ‘Align’ to proceed.
  • 44.
  • 45.
  • 46.
  • 47. This is a protein BLAST (BLASTP)
  • 48.
  • 49. Some concluding remarks: Trust but verify. Beware of gene prediction tools! Always use more than one gene prediction tool and more than one genome when possible. This is an active area of bioinformatics research, so be mindful of the new literature in this area.
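One simple way to act on the advice to use more than one gene prediction tool is to keep only the regions that two tools agree on. The Python sketch below intersects two lists of predicted intervals; the function name `consensus_regions`, the coordinates, and the tool names are hypothetical, and agreement between tools raises confidence but does not prove a prediction is correct.

```python
def consensus_regions(pred_a, pred_b):
    """Return the regions supported by both predictors.

    Each input is a list of (start, end) intervals, 0-based and end-exclusive,
    sorted by start and non-overlapping within a list.
    """
    shared = []
    i = j = 0
    while i < len(pred_a) and j < len(pred_b):
        start = max(pred_a[i][0], pred_b[j][0])
        end = min(pred_a[i][1], pred_b[j][1])
        if start < end:
            shared.append((start, end))
        # Advance whichever interval finishes first.
        if pred_a[i][1] < pred_b[j][1]:
            i += 1
        else:
            j += 1
    return shared

# Hypothetical exon predictions from two different tools.
tool_a = [(100, 250), (400, 520), (900, 1000)]
tool_b = [(120, 260), (410, 500), (700, 800)]
print(consensus_regions(tool_a, tool_b))  # [(120, 250), (410, 500)]
```

Regions reported by only one tool, such as (900, 1000) here, are the ones most worth checking against transcript or protein evidence.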