Gene and protein prediction
Course: B.Sc Biochemistry
Subject: Basic of Bioinformatics
Unit: IV
• Gene: A sequence of nucleotides coding
for protein
• Gene Prediction Problem: Determine the
beginning and end positions of genes in a
genome
Gene Prediction: Computational Challenge
aatgcatgcggctatgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctatgc
taatgcatgcggctatgcaagctgggatccgatgactatgctaagctgggatccgatgacaatgcatgcg
gctatgctaatgaatggtcttgggatttaccttggaatgctaagctgggatccgatgacaatgcatgcggct
atgctaatgaatggtcttgggatttaccttggaatatgctaatgcatgcggctatgctaagctgggatccga
tgacaatgcatgcggctatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcg
gctatgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctatgctaatgcatgc
ggctatgcaagctgggatcctgcggctatgctaatgaatggtcttgggatttaccttggaatgctaagctg
ggatccgatgacaatgcatgcggctatgctaatgaatggtcttgggatttaccttggaatatgctaatgcat
gcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctat
gctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgctaatgcatgcgg
ctatgctaagctcatgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgaca
atgcatgcggctatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctat
gctaatgcatgcggctatgctaagctcggctatgctaatgaatggtcttgggatttaccttggaatgctaag
ctgggatccgatgacaatgcatgcggctatgctaatgaatggtcttgggatttaccttggaatatgctaatg
catgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggc
tatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgctaatgcatg
cggctatgctaagctcatgcgg
Gene!
Central Dogma: DNA -> RNA -> Protein
DNA: CCTGAGCCAACTATTGATGAA
(transcription)
RNA: CCUGAGCCAACUAUUGAUGAA
(translation)
Protein: PEPTIDE
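The DNA -> RNA -> Protein example on this slide can be reproduced in a few lines. This is a minimal sketch, assuming the DNA shown is the coding strand and using the standard genetic code; the names `CODON_TABLE`, `transcribe` and `translate` are illustrative and not part of the original slides.

```python
# Standard genetic code, written compactly: codons are enumerated with
# bases in the order T, C, A, G and '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def transcribe(dna):
    """Coding-strand DNA -> mRNA (replace T with U)."""
    return dna.upper().replace("T", "U")

def translate(mrna):
    """Translate mRNA codon by codon; stops at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

mrna = transcribe("CCTGAGCCAACTATTGATGAA")
print(mrna)             # CCUGAGCCAACUAUUGAUGAA
print(translate(mrna))  # PEPTIDE
```

The one-letter amino acid codes happen to spell PEPTIDE, which is why the slide uses this particular sequence.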
1.
Central Dogma: Doubts
• Central Dogma was proposed in 1958 by Francis Crick
• Crick had very little supporting evidence in the late 1950s
• Before Crick’s seminal paper
all possible information transfers
were considered viable
• Crick postulated that some
of them are not viable
(missing arrows)
• In 1970 Crick published a paper defending the Central Dogma.
Codons
• In 1961 Sydney Brenner and Francis Crick
discovered frameshift mutations
• Systematically deleted nucleotides from DNA
– Single and double deletions dramatically
altered protein product
– Effects of triple deletions were minor
– Conclusion: every triplet of nucleotides, each
codon, codes for exactly one amino acid in a
protein
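The frameshift argument can be illustrated by deleting 1, 2 or 3 nucleotides and comparing the codons read afterwards. This is only a sketch: the sequence `gene` is a made-up 7-codon example, and `shared_tail` is a hypothetical helper that counts how many codons at the 3' end are unchanged.

```python
def codons(seq):
    """Split seq into consecutive triplets, dropping any incomplete tail."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

def delete(seq, start, k):
    """Remove k nucleotides beginning at index start."""
    return seq[:start] + seq[start + k:]

def shared_tail(a, b):
    """Number of codons that still agree, counted from the 3' end."""
    n = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        n += 1
    return n

gene = "ATGGCTAAACGTGATTTCTGA"  # hypothetical 7-codon sequence
for k in (1, 2, 3):
    mutant = delete(gene, 3, k)
    print(k, " ".join(codons(mutant)), shared_tail(codons(gene), codons(mutant)))
```

With this toy sequence, deleting one or two nucleotides changes every downstream codon, while deleting three removes a single codon and leaves the rest of the reading frame intact.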
Translating Nucleotides into Amino Acids
• Codon: 3 consecutive nucleotides
• 4^3 = 64 possible codons
• Genetic code is degenerate and redundant
– Includes start and stop codons
– An amino acid may be coded by more than
one codon
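The count of 64 codons can be checked by enumeration. The sketch below also removes the three stop codons of the standard code, leaving 61 codons for the 20 standard amino acids, so at least some amino acids must be encoded by more than one codon.

```python
from itertools import product

# Every ordered triple over the four DNA bases.
codons = ["".join(p) for p in product("ACGT", repeat=3)]
stop_codons = {"TAA", "TAG", "TGA"}
sense_codons = [c for c in codons if c not in stop_codons]

print(len(codons))        # 64
print(len(sense_codons))  # 61
```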
Discovery of Split Genes
• In 1977, Phillip Sharp and
Richard Roberts
experimented with mRNA of
hexon, a viral protein.
– Mapped hexon mRNA in
the viral genome by
hybridization to
adenovirus DNA and
electron microscopy
– mRNA-DNA hybrids
formed three curious
loop structures instead
of contiguous duplex
segments
Exons and Introns
• In eukaryotes, the gene is a combination
of coding segments (exons) that are
interrupted by non-coding segments
(introns)
• This makes computational gene prediction
in eukaryotes even more difficult
• Prokaryotes generally lack introns: genes in
prokaryotes are continuous
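The exon/intron structure described above can be sketched as joining exon intervals and discarding what lies between them. The transcript and coordinates below are hypothetical, written with exons in upper case and introns in lower case so the result is easy to read.

```python
def splice(pre_mrna, exons):
    """Join exon intervals (0-based, end-exclusive) in order, dropping introns."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical transcript: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "AAACGU" + "guaugucccuag" + "GAUUGA"
exons = [(0, 6), (18, 24), (36, 42)]
mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCAAACGUGAUUGA
```

A gene predictor faces the inverse problem: the exon coordinates are unknown and must be inferred from the sequence.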
Central Dogma and Splicing
[Figure: a gene with exon1, exon2 and exon3 separated by intron1 and intron2; steps shown are transcription, splicing and translation; exon = coding, intron = non-coding. Credit: Batzoglou]
2.
Gene Structure
3.
Splicing Signals
Exons are interspersed with introns; introns
typically begin with GT and end with AG
4.
Consensus splice sites
Donor: 7.9 bits
Acceptor: 9.4 bits
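The "bits" figures are presumably the information content of the splice-site profiles; that reading is an assumption, since the slide does not say how they were computed. Under that assumption, the sketch below sums 2 - H over the columns of an alignment, where H is the Shannon entropy of the column. The alignment `toy_sites` is invented for illustration and no small-sample correction is applied.

```python
import math
from collections import Counter

def information_content(sites):
    """Sum over columns of 2 - H(column), in bits, for aligned DNA sites."""
    total = 0.0
    for column in zip(*sites):
        counts = Counter(column)
        entropy = -sum((c / len(column)) * math.log2(c / len(column))
                       for c in counts.values())
        total += 2.0 - entropy
    return total

# Hypothetical toy alignment of donor-like sites (not real data).
toy_sites = ["AGGTAAGT", "AGGTGAGT", "CGGTAAGA", "AGGTAAGT"]
print(round(information_content(toy_sites), 2))
```

A fully conserved position contributes 2 bits and a position where all four bases are equally likely contributes 0 bits.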
5.
Promoters
• Promoters are DNA segments upstream
of transcripts that initiate transcription
• Promoter attracts RNA Polymerase to the
transcription start site
5’Promoter 3’
6.
Splicing mechanism
(http://genes.mit.edu/chris/)
7.
Splicing mechanism
• Adenine recognition site marks intron
• snRNPs bind around adenine recognition
site
• The spliceosome thus forms
• Spliceosome excises introns in the mRNA
Two Approaches to Gene Prediction
• Statistical: coding segments (exons) have typical
sequences on either end and use different
subwords than non-coding segments (introns).
• Similarity-based: many human genes are similar
to genes in mice, chickens, or even bacteria.
Therefore, already known mouse, chicken, and
bacterial genes may help to find human genes.
Genetic Code and Stop Codons
UAA, UAG and UGA are the 3 stop codons that
(together with the start codon AUG, written ATG in DNA)
delineate Open Reading Frames
8.
Six Frames in a DNA Sequence
• stop codons – TAA, TAG, TGA
• start codons - ATG
GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG
CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC
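The six frames are the three codon offsets on the given strand plus the three offsets on its reverse complement. A minimal sketch, using a prefix of the sequence shown on the slide; the function names and the frame labels +1..+3 and -1..-3 are illustrative choices.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Complement each base, then reverse, to read the opposite strand 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]

def six_frames(dna):
    """Return {frame_label: list of codons} for frames +1..+3 and -1..-3."""
    frames = {}
    for sign, strand in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            # Last index that still leaves room for a whole codon.
            end = len(strand) - (len(strand) - offset) % 3
            frames[f"{sign}{offset + 1}"] = [
                strand[i:i + 3] for i in range(offset, end, 3)
            ]
    return frames

seq = "GACGTCTGCTTTGGAGAACTACATCAACCGG"  # prefix of the slide's sequence
for label, codons in six_frames(seq).items():
    print(label, " ".join(codons))
```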
9.
Open Reading Frames (ORFs)
• Detect potential coding regions by looking at ORFs
– A genome of length n comprises about n/3 codons in each reading frame
– Stop codons break the genome into segments between consecutive
stop codons
– The subsegments of these that start from the Start codon (ATG)
are ORFs
• ORFs in different frames may overlap
[Figure: a genomic sequence read in three frames of n/3 codons each; an open reading frame runs from ATG to TGA]
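The ORF definition above translates fairly directly into a scan over the three frames of one strand. This is a sketch under simplifying assumptions: only the given strand is scanned, the first ATG after a stop opens the ORF, and the `min_codons` parameter is a hypothetical length threshold.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=0):
    """Yield (start, end, frame) for each ORF: ATG up to the first in-frame stop.

    Coordinates are 0-based, end-exclusive, and include the stop codon.
    Only the given strand is scanned.
    """
    dna = dna.upper()
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                # Length counted in codons, excluding the stop codon.
                if (i - start) // 3 >= min_codons:
                    yield (start, i + 3, frame)
                start = None

print(list(find_orfs("CCATGAAATGATT")))  # [(2, 11, 2)]
```

Running the same function on the reverse complement would cover the other three frames.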
Long vs. Short ORFs
• A long open reading frame may be a gene
– At random, we should expect one stop codon
every (64/3) ~= 21 codons
– However, genes are usually much longer
than this
• A basic approach is to scan for ORFs whose
length exceeds a certain threshold
– This is naïve because some genes (e.g. some
neural and immune system genes) are
relatively short
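The 64/3 estimate, and the reason long ORFs are unlikely by chance, can be worked out numerically. The sketch assumes all 64 codons are equally likely and independent, which real genomes only approximate; the function names are illustrative.

```python
STOP_FRACTION = 3 / 64  # 3 stop codons among 64, assuming uniform random codons

def expected_codons_between_stops():
    """Mean spacing between stop codons under the uniform model."""
    return 1 / STOP_FRACTION

def prob_no_stop(k):
    """Probability that k consecutive random codons contain no stop."""
    return (1 - STOP_FRACTION) ** k

print(round(expected_codons_between_stops(), 1))  # 21.3
for k in (21, 100, 300):
    print(k, prob_no_stop(k))
```

Under this model the chance of a stop-free run falls off geometrically with length, which is what makes a length threshold informative, although it will miss genuinely short genes.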
Testing ORFs: Codon Usage
• Create a 64-element hash table and count
the frequencies of codons in an ORF
• Amino acids typically have more than one
codon, but in nature certain codons are
used more often than others
• Uneven use of the codons may
characterize a real gene
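The codon counting step might look like the following, with `collections.Counter` playing the role of the 64-element hash table. The ORF string is a made-up example; a real test would compare these frequencies against a reference codon usage table for the organism.

```python
from collections import Counter

def codon_usage(orf):
    """Relative frequency of each codon in an ORF read in frame from position 0."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - len(orf) % 3, 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}

usage = codon_usage("ATGCTGCTGCTTGCCTAA")  # hypothetical ORF
for codon, freq in sorted(usage.items()):
    print(codon, round(freq, 3))
```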
Codon Usage in Human Genome
10.
Gene Prediction and Motifs
• Upstream regions of genes often contain
motifs that can be used for gene prediction
Promoter Structure in Prokaryotes (E. coli)
Transcription starts
at offset 0.
• Pribnow Box (-10)
• Gilbert Box (-30)
• Ribosomal
Binding Site (+10)
[Figure: promoter diagram with axis positions -35, -10, 0 and 10; labels: TTCCAA, Pribnow Box (TATACT), Transcription start site, Ribosomal binding site (GGAGG), ATG, STOP]
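Because promoter motifs are only approximately conserved, a motif scan usually tolerates mismatches. The sketch below looks for windows within a given Hamming distance of a consensus; the consensus TATAAT is the commonly cited Pribnow box consensus, while the `upstream` sequence and the mismatch limit are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def find_motif(dna, motif, max_mismatches=1):
    """Return [(position, matched_text)] for windows within max_mismatches of motif."""
    k = len(motif)
    return [
        (i, dna[i:i + k])
        for i in range(len(dna) - k + 1)
        if hamming(dna[i:i + k], motif) <= max_mismatches
    ]

upstream = "GGCTTGACAATTCGTACGGATCGTATACTGCAACGT"  # hypothetical upstream region
print(find_motif(upstream, "TATAAT"))
```

Practical tools replace the fixed consensus with a position weight matrix, but the windowed scan is the same idea.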
11.
Ribosomal Binding Site
12.
Splicing Signals
• Try to recognize the location of splicing signals at
exon-intron junctions
– This has yielded a weakly conserved donor
splice site and acceptor splice site
• Profiles for these sites are still weak, which lends the
problem to Hidden Markov Model (HMM)
approaches that capture the statistical
dependencies between sites
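To give a feel for the HMM idea, here is a toy two-state model decoded with the Viterbi algorithm. It is only a sketch: the states, and every probability in `START`, `TRANS` and `EMIT`, are made-up illustrative values rather than trained parameters, and real gene finders use far richer state structures. The input sequence is assumed to be non-empty.

```python
import math

STATES = ("coding", "noncoding")
# All probabilities below are made-up illustrative values, not trained parameters.
START = {"coding": 0.5, "noncoding": 0.5}
TRANS = {
    "coding": {"coding": 0.9, "noncoding": 0.1},
    "noncoding": {"coding": 0.1, "noncoding": 0.9},
}
EMIT = {
    "coding": {"A": 0.2, "C": 0.3, "G": 0.3, "T": 0.2},     # GC-rich
    "noncoding": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3},  # AT-rich
}

def viterbi(seq):
    """Most probable state path for a non-empty seq under the toy HMM (log-space)."""
    scores = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in state s.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            pointers[s] = prev
            new_scores[s] = (scores[prev] + math.log(TRANS[prev][s])
                             + math.log(EMIT[s][symbol]))
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=scores.get)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]

print(viterbi("ATATATGCGCGCGCGCGCGCATATAT"))
```

The transition probabilities are what let the model capture dependencies between neighbouring positions, which a per-site profile cannot do.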
Donor and Acceptor Sites:
GT and AG dinucleotides
• The beginning and end of exons are signaled by donor
and acceptor sites that usually have GT and AG
dinucleotides
• Detecting these sites is difficult, because GT and AG
appear very often
[Figure: exon 1, then an intron beginning with GT (donor site) and ending with AG (acceptor site), then exon 2]
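A quick count shows why GT and AG alone are weak signals: in a random sequence each dinucleotide is expected at roughly 1 in 16 positions, so pairing every GT with every downstream AG yields many candidate introns. The helper names and the `min_len` cutoff below are hypothetical; the test sequence is the one shown on the six-frames slide.

```python
def count_dinucleotide(dna, dinucleotide):
    """Count (possibly overlapping) occurrences of a 2-letter word."""
    return sum(dna[i:i + 2] == dinucleotide for i in range(len(dna) - 1))

def candidate_introns(dna, min_len=4):
    """All (start, end) spans that begin with GT and end with AG (end-exclusive)."""
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return [(d, a) for d in donors for a in acceptors if a - d >= min_len]

seq = "GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG"
print(count_dinucleotide(seq, "GT"), count_dinucleotide(seq, "AG"))
print(len(candidate_introns(seq)))
```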
Popular Gene Prediction Algorithms
• GENSCAN: uses Hidden Markov Models
(HMMs)
• TWINSCAN
–Uses both HMM and similarity (e.g.,
between human and mouse genomes)
Books and Web References
• Books Name :
1. Introduction To Bioinformatics by T. K. Attwood
2. BioInformatics by Sangita
3. Basic Bioinformatics by S.Ignacimuthu, s.j.
• http://en.wikipedia.org/wiki/Gene_prediction
• http://www.humgen.nl/programs.html
• http://rulai.cshl.edu/reprints/nrg890_fs.pdf
Image References
• 1. https://encrypted-tbn1.gstatic.com/images?q=tbn:ANd9GcQNwz43SompELD7S4Gadm73w8jNgs44zWxSX2wytTzJP0W6AuWBQw
• 2. http://svmcompbio.tuebingen.mpg.de/img/splice.png
• 3. http://bja.oxfordjournals.org/content/early/2009/05/29/bja.aep130/F2.large.jpg
• 4. http://www.web-books.com/MoBio/Free/images/Ch7E3.gif
• 5. https://encrypted-tbn2.gstatic.com/images?q=tbn:ANd9GcQSRar31enTpHlRv9-lff0C3cFOr8lB73z5h1aF4TnI3EP2Ye96HQ
• 6. http://edoc.hu-berlin.de/dissertationen/kuehn-kristina-2006-02-23/HTML/image005.jpg
• 7. http://www.nature.com/nrn/journal/v4/n10/images/nrn1234-i1.jpg
• 8. https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTXSXyL-0Nz7wVnCEEYXi5hiriOOB63rmRr9-XjHVt_2RRg_P3C
• 9. http://cloud.gmod.org/gbrowse2/tutorial/figures/dna2.png
• 10. https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTXSXyL-0Nz7wVnCEEYXi5hiriOOB63rmRr9-XjHVt_2RRg_P3C
• 11. https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQeaTfC5lrWR-otzQsi7ONsKcYgjzoea--vsN1vMxFwxE3by4qElw
• 12. https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcSCziU2aI59p-Svlmwo0WbWzwnBff-6jrXshMhXemdMvYqRiK1G

Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choom
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptxBarangay Council for the Protection of Children (BCPC) Orientation.pptx
Barangay Council for the Protection of Children (BCPC) Orientation.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
FILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipinoFILIPINO PSYCHology sikolohiyang pilipino
FILIPINO PSYCHology sikolohiyang pilipino
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
ACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdfACC 2024 Chronicles. Cardiology. Exam.pdf
ACC 2024 Chronicles. Cardiology. Exam.pdf
 
Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)Global Lehigh Strategic Initiatives (without descriptions)
Global Lehigh Strategic Initiatives (without descriptions)
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
Visit to a blind student's school🧑‍🦯🧑‍🦯(community medicine)
 
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdfLike-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
Like-prefer-love -hate+verb+ing & silent letters & citizenship text.pdf
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...
 

B.sc biochem i bobi u 4 gene prediction

  • 1. Gene and protein prediction Course: B.Sc Biochemistry Subject: Basic of Bioinformatics Unit: IV
  • 2. • Gene: A sequence of nucleotides coding for protein • Gene Prediction Problem: Determine the beginning and end positions of genes in a genome Gene Prediction: Computational Challenge
  • 3. Gene Prediction: Computational Challengeaatgcatgcggctatgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctatgc taatgcatgcggctatgcaagctgggatccgatgactatgctaagctgggatccgatgacaatgcatgcg gctatgctaatgaatggtcttgggatttaccttggaatgctaagctgggatccgatgacaatgcatgcggct atgctaatgaatggtcttgggatttaccttggaatatgctaatgcatgcggctatgctaagctgggatccga tgacaatgcatgcggctatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcg gctatgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctatgctaatgcatgc ggctatgcaagctgggatcctgcggctatgctaatgaatggtcttgggatttaccttggaatgctaagctg ggatccgatgacaatgcatgcggctatgctaatgaatggtcttgggatttaccttggaatatgctaatgcat gcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctat gctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgctaatgcatgcgg ctatgctaagctcatgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgaca atgcatgcggctatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctat gctaatgcatgcggctatgctaagctcggctatgctaatgaatggtcttgggatttaccttggaatgctaag ctgggatccgatgacaatgcatgcggctatgctaatgaatggtcttgggatttaccttggaatatgctaatg catgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggc tatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgctaatgcatg cggctatgctaagctcatgcgg
  • 5. Gene Prediction: Computational Challengeaatgcatgcggctatgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctatgc taatgcatgcggctatgcaagctgggatccgatgactatgctaagctgggatccgatgacaatgcatgcg gctatgctaatgaatggtcttgggatttaccttggaatgctaagctgggatccgatgacaatgcatgcggct atgctaatgaatggtcttgggatttaccttggaatatgctaatgcatgcggctatgctaagctgggatccga tgacaatgcatgcggctatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcg gctatgctaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctatgctaatgcatgc ggctatgcaagctgggatcctgcggctatgctaatgaatggtcttgggatttaccttggaatgctaagctg ggatccgatgacaatgcatgcggctatgctaatgaatggtcttgggatttaccttggaatatgctaatgcat gcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggctat gctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgctaatgcatgcgg ctatgctaagctcatgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgaca atgcatgcggctatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctat gctaatgcatgcggctatgctaagctcggctatgctaatgaatggtcttgggatttaccttggaatgctaag ctgggatccgatgacaatgcatgcggctatgctaatgaatggtcttgggatttaccttggaatatgctaatg catgcggctatgctaagctgggaatgcatgcggctatgctaagctgggatccgatgacaatgcatgcggc tatgctaatgcatgcggctatgcaagctgggatccgatgactatgctaagctgcggctatgctaatgcatg cggctatgctaagctcatgcgg Gene!
  • 7. • Central Dogma was proposed in 1958 by Francis Crick • Crick had very little supporting evidence in the late 1950s • Before Crick’s seminal paper, all possible information transfers were considered viable • Crick postulated that some of them are not viable (missing arrows) • In 1970 Crick published a paper defending the Central Dogma. Central Dogma: Doubts
  • 8. • In 1961 Sydney Brenner and Francis Crick discovered frameshift mutations • Systematically deleted nucleotides from DNA – Single and double deletions dramatically altered protein product – Effects of triple deletions were minor – Conclusion: every triplet of nucleotides, each codon, codes for exactly one amino acid in a protein Codons
  • 9. • Codon: 3 consecutive nucleotides • 4^3 = 64 possible codons • Genetic code is degenerate (redundant) – Includes start and stop codons – An amino acid may be coded by more than one codon Translating Nucleotides into Amino Acids
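A minimal Python sketch of the codon arithmetic above, assuming the standard genetic code (NCBI translation table 1). The names `CODON_TABLE` and `translate` are illustrative and do not come from the slides.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each of the 4^3 = 64 codons to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")  # 'X' for codons with unknown bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)
```

Because 64 codons map onto 20 amino acids plus stop, most amino acids have several codons, which is the redundancy the slide refers to.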
  • 10. • In 1977, Phillip Sharp and Richard Roberts experimented with mRNA of hexon, a viral protein. – Map hexon mRNA in viral genome by hybridization to adenovirus DNA and electron microscopy – mRNA-DNA hybrids formed three curious loop structures instead of contiguous duplex segments Discovery of Split Genes
  • 11. Exons and Introns • In eukaryotes, the gene is a combination of coding segments (exons) that are interrupted by non-coding segments (introns) • This makes computational gene prediction in eukaryotes even more difficult • Prokaryotic genes generally lack introns: genes in prokaryotes are continuous
  • 12. Central Dogma and Splicing [Figure: a gene with exon1, intron1, exon2, intron2, exon3; labels: transcription, splicing, translation; exon = coding, intron = non-coding. Source: Batzoglou] 2.
  • 14. Splicing Signals Exons are interspersed with introns; each intron typically begins with GT (donor site) and ends with AG (acceptor site) 4.
  • 15. Consensus splice sites Donor: 7.9 bits Acceptor: 9.4 bits 5.
  • 16. Promoters • Promoters are DNA segments upstream of transcripts that initiate transcription • Promoter attracts RNA Polymerase to the transcription start site 5’Promoter 3’ 6.
  • 18. Splicing mechanism • Adenine recognition site marks intron • snRNPs bind around adenine recognition site • The spliceosome thus forms • Spliceosome excises introns from the pre-mRNA
  • 19. • Statistical: coding segments (exons) have typical sequences on either end and use different subwords than non-coding segments (introns). • Similarity-based: many human genes are similar to genes in mice, chicken, or even bacteria. Therefore, already known mouse, chicken, and bacterial genes may help to find human genes. Two Approaches to Gene Prediction
  • 20. UAA, UAG and UGA are the 3 Stop codons that (together with the Start codon AUG, written ATG in DNA) delineate Open Reading Frames Genetic Code and Stop Codons 8.
  • 21. Six Frames in a DNA Sequence • stop codons – TAA, TAG, TGA • start codons - ATG GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC 9.
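The six reading frames can be enumerated with a short Python sketch: three codon offsets on the given strand and three on its reverse complement. The function names and the `+1` to `-3` frame labels are assumptions made for illustration.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the opposite strand, read 5' to 3'."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def codons(seq, offset):
    """Split seq into complete codons starting at the given offset (0, 1, or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]


def six_frames(dna):
    """Return the codon lists for all six reading frames, keyed '+1'..'+3' and '-1'..'-3'."""
    forward = dna.upper()
    reverse = reverse_complement(forward)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = codons(forward, offset)
        frames[f"-{offset + 1}"] = codons(reverse, offset)
    return frames
```

For example, `six_frames("ATGAAATGA")["+1"]` gives `['ATG', 'AAA', 'TGA']`.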
  • 22. • Detect potential coding regions by looking at ORFs – A genome of length n comprises about n/3 codons in each reading frame – Stop codons break the genome into segments between consecutive Stop codons – The subsegments of these that start from the Start codon (ATG) are ORFs • ORFs in different frames may overlap [Figure: genomic sequence read in three frames of n/3 codons each, with an open reading frame running from ATG to TGA] Open Reading Frames (ORFs)
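A minimal sketch of the ORF scan described above, for the three frames of one strand. It assumes an ORF runs from the first ATG after a stop codon to the next in-frame stop codon; `find_orfs` and its `min_codons` parameter are hypothetical names.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(dna, min_codons=0):
    """Return (frame, start, end) tuples for ORFs on the given strand.

    start is the index of the A in ATG; end is one past the stop codon,
    so dna[start:end] is the full ORF including the stop codon.
    min_codons counts codons from the start codon up to, but excluding, the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs
```

To cover all six frames, the same function can be run on the reverse complement of the sequence.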
  • 23. • A long open reading frame may be a gene – At random, we should expect one stop codon every (64/3) ~= 21 codons – However, genes are usually much longer than this • A basic approach is to scan for ORFs whose length exceeds a certain threshold – This is naïve because some genes (e.g. some neural and immune system genes) are relatively short Long vs. Short ORFs
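The 64/3 figure, and the reason a length threshold works, can be checked with a small calculation. This sketch assumes a simplified model in which all 64 codons are equally likely and independent, which real genomes do not satisfy exactly; the function names are illustrative.

```python
# Under a uniform random model, 3 of the 64 codons are stop codons.
P_STOP = 3 / 64


def expected_codons_between_stops():
    """Mean spacing between stop codons in one frame (geometric distribution)."""
    return 1 / P_STOP


def prob_no_stop(k):
    """Probability that k consecutive random codons contain no stop codon."""
    return (1 - P_STOP) ** k
```

Under this model a stop-free run of 100 codons arises by chance with probability below 1%, which is why long ORFs stand out, while short genes fall below any such threshold.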
  • 24. Testing ORFs: Codon Usage • Create a 64-element hash table and count the frequencies of codons in an ORF • Amino acids typically have more than one codon, but in nature certain codons are used more often than others • Uneven use of the codons may characterize a real gene
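The codon-counting step can be sketched in Python with `collections.Counter` standing in for the 64-element hash table. The function name `codon_usage` is an assumption; how the resulting frequencies are compared against a reference table is left open, as it is in the slides.

```python
from collections import Counter


def codon_usage(orf):
    """Return the relative frequency of each codon observed in an ORF.

    The ORF is read in frame from its first base; a trailing partial
    codon, if any, is ignored.
    """
    orf = orf.upper()
    codon_list = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    counts = Counter(codon_list)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}
```

For example, `codon_usage("ATGGCCGCCTAA")` reports GCC at 0.5 and ATG and TAA at 0.25 each.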
  • 25. Codon Usage in Human Genome 10.
  • 26. Gene Prediction and Motifs • Upstream regions of genes often contain motifs that can be used for gene prediction [Figure: positions -35 (TTCCAA), -10 (TATACT, Pribnow Box), 0 (Transcription start site), +10 (GGAGG, Ribosomal binding site), followed by ATG and STOP]
  • 27. Promoter Structure in Prokaryotes (E. coli) Transcription starts at offset 0. • Pribnow Box (-10) • Gilbert Box (-30) • Ribosomal Binding Site (+10) 11.
  • 29. Splicing Signals • Try to recognize location of splicing signals at exon-intron junctions – This has yielded a weakly conserved donor splice site and acceptor splice site • Profiles for the sites are still weak, so the problem lends itself to Hidden Markov Model (HMM) approaches, which capture the statistical dependencies between sites
  • 30. Donor and Acceptor Sites: GT and AG dinucleotides • The beginning and end of exons are signaled by donor and acceptor sites that usually have GT and AG dinucleotides • Detecting these sites is difficult, because GT and AG appear very often [Figure: exon 1, GT (Donor Site), intron, AG (Acceptor Site), exon 2]
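A small Python sketch shows why the dinucleotides alone are weak signals: it lists every GT and every AG position in a sequence as candidate donor and acceptor sites. The function name is hypothetical, and a real predictor would score the surrounding bases rather than accept every match.

```python
def candidate_splice_sites(dna):
    """Return (donors, acceptors): start indices of every GT and every AG dinucleotide."""
    dna = dna.upper()
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return donors, acceptors
```

In a uniformly random sequence each dinucleotide is expected about once every 16 positions, so most candidates found this way are not real splice sites.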
  • 31. Popular Gene Prediction Algorithms • GENSCAN: uses Hidden Markov Models (HMMs) • TWINSCAN: uses both HMM and similarity (e.g., between human and mouse genomes)
  • 32. Books and Web References • Books: 1. Introduction To Bioinformatics by T. K. Attwood 2. BioInformatics by Sangita 3. Basic Bioinformatics by S. Ignacimuthu, s.j. • http://en.wikipedia.org/wiki/Gene_prediction • http://www.humgen.nl/programs.html • http://rulai.cshl.edu/reprints/nrg890_fs.pdf
  • 33. Image References • 1. https://encrypted- tbn1.gstatic.com/images?q=tbn:ANd9GcQNwz43SompELD7S4Gadm73w8j Ngs44zWxSX2wytTzJP0W6AuWBQw • 2. http://svmcompbio.tuebingen.mpg.de/img/splice.png • 3.http://bja.oxfordjournals.org/content/early/2009/05/29/bja.aep130/F2. large.jpg • 4. http://www.web-books.com/MoBio/Free/images/Ch7E3.gif • 5. https://encrypted- tbn2.gstatic.com/images?q=tbn:ANd9GcQSRar31enTpHlRv9- lff0C3cFOr8lB73z5h1aF4TnI3EP2Ye96HQ • 6. http://edoc.hu-berlin.de/dissertationen/kuehn-kristina-2006-02- 23/HTML/image005.jpg • 7. http://www.nature.com/nrn/journal/v4/n10/images/nrn1234-i1.jpg
  • 34. • 8. https://encrypted- tbn0.gstatic.com/images?q=tbn:ANd9GcTXSXyL- 0Nz7wVnCEEYXi5hiriOOB63rmRr9-XjHVt_2RRg_P3C • 9. http://cloud.gmod.org/gbrowse2/tutorial/figures/dna2.png • 10.https://encrypted- tbn0.gstatic.com/images?q=tbn:ANd9GcTXSXyL- 0Nz7wVnCEEYXi5hiriOOB63rmRr9-XjHVt_2RRg_P3C • 11. https://encrypted- tbn0.gstatic.com/images?q=tbn:ANd9GcQeaTfC5lrWR- otzQsi7ONsKcYgjzoea--vsN1vMxFwxE3by4qElw • 12.https://encrypted- tbn0.gstatic.com/images?q=tbn:ANd9GcSCziU2aI59p- Svlmwo0WbWzwnBff-6jrXshMhXemdMvYqRiK1G