Submitted by-
Ishi tandon
CT-IV
Gene:
• A sequence of nucleotides coding for a protein.
Central Dogma:
• Proposed in 1958 by Francis Crick.
• He postulated that not all possible transfers of information between DNA, RNA and protein are viable.
• He published a paper in 1970.
CODONS:
• Discovered by Sydney Brenner and Francis Crick in 1961.
• Each codon is a triplet of nucleotides and codes for one amino acid in a protein.
[Diagram: DNA → RNA → PROTEIN → PHENOTYPE, with a cDNA branch and arrows numbered 1 to 4]
1. TRANSCRIPTION
2. TRANSLATION
3. GENE EXPRESSION
4. REVERSE TRANSCRIPTION
DEFINITION
• Gene prediction is a prerequisite for detailed functional annotation of genes and genomes.
• It can detect the location of ORFs (Open Reading Frames) and the structures of introns and exons.
• Its ultimate goal is to describe all the genes computationally with near 100% accuracy.
• It can reduce the amount of experimental verification work required.
TYPES
• Ab initio-based: relies on gene signals (intron splice sites, transcription factor binding sites, ribosomal binding sites, polyadenylation sites, triplet codon structure) and gene content.
• Homology-based: relies on significant matches of the query sequence with sequences of known genes.
• Probabilistic models such as Markov models or Hidden Markov Models (HMMs).
SEQUENCE SIGNALS
[Figure: Genomic DNA with exons and introns, start codon, stop codon and splice sites (donor site GT, acceptor site AG) → Transcription → pre-mRNA (Cap, Poly(A)) → Splicing → mRNA (Cap, Poly(A)) → Translation → Protein]
Exons are usually shorter than introns.
Prokaryotic gene prediction
• Gene prediction is easier in microbial genomes.
• Smaller genomes, high gene density, very few repetitive sequences, more sequenced genomes.
• Start codon is usually ATG (occasionally GTG or TTG).
• Ribosomal binding site / Shine-Dalgarno sequence upstream of the start codon.
Open reading frames
• A sequence delimited by an in-frame start codon and stop codon, which in turn defines a putative amino acid sequence.
• A genome of length n comprises about n/3 codons in each reading frame.
• Stop codons break the genome into segments between consecutive stop codons.
• The sub-segments of these that start from a start codon (ATG) are ORFs.
• DNA is translated in all six possible frames: three forward and three reverse.
[Figure: an open reading frame running from ATG to TGA on a genomic sequence, and an example sequence shown with its complementary strand for reading in six frames]
CTGCAGACGAAACCTCTTGATGTAGTTGGCCTGACACCGACAATAATGAAGACTACCGTCTTACTAACAC
GACGTCTGCTTTGGAGAACTACATCAACCGGACTGTGGCTGTTATTACTTCTGATGGCAGAATGATTGTG
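The six-frame ORF scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular gene finder: the function names and the `min_codons` threshold are assumptions made for the example, and only ATG is treated as a start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame):
    """Yield (start, end) for each ATG...stop stretch in one forward frame.

    `end` is exclusive and includes the stop codon.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3
            start = None


def six_frame_orfs(seq, min_codons=1):
    """Scan three forward and three reverse frames (hypothetical helper).

    Coordinates of "-" strand hits refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame):
                # Count codons without the stop codon.
                if (end - start) // 3 - 1 >= min_codons:
                    found.append((strand, frame, start, end, s[start:end]))
    return found
```

For example, `six_frame_orfs("CCATGAAATAGCC")` returns a single forward-strand hit, `("+", 2, 2, 11, "ATGAAATAG")`. Real gene finders usually also apply a much larger minimum length to reduce chance ORFs.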
Probabilistic models
• Statistical description of a gene.
• Markov Models & Hidden Markov Models.
• Used to distinguish oligonucleotide distributions in the coding regions from those in non-coding regions.
• In a Markov model of order k, the probability of a nucleotide at a position in the DNA sequence depends on the k preceding nucleotides.
• Types of order: zero, first and second.
• The higher the order, the more accurately a gene can be predicted.
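As a rough sketch of how a Markov model can separate coding from non-coding sequence, the Python below trains one k-th order model on each class and scores a query by log-odds. The function names, the pseudocount and the tiny training sets in the usage note are assumptions for illustration only; real programs train on large sets of annotated genes.

```python
import math
from collections import defaultdict


def train_markov(seqs, k, pseudocount=1.0):
    """Return prob(context, base) estimated from k-th order transition counts."""
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1

    def prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + 4 * pseudocount  # 4 possible nucleotides
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def log_odds(seq, k, coding, noncoding):
    """Sum of log(P_coding / P_noncoding) over the sequence.

    A positive score means the coding model explains the sequence better.
    """
    return sum(
        math.log(coding(seq[i - k:i], seq[i]) / noncoding(seq[i - k:i], seq[i]))
        for i in range(k, len(seq))
    )
```

With `coding = train_markov(["ATGATGATGATG"], 1)` and `noncoding = train_markov(["AAAAAAAAAAAA"], 1)`, the score of `"ATGATG"` is positive and the score of `"AAAAAA"` is negative. Raising `k` captures longer-range patterns but needs more training data to estimate every context.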
Gene content and length distribution of prokaryotic genes
TYPICAL: Ranges from 100 to 500 amino acids, with a nucleotide distribution typical of the organism.
ATYPICAL: Shorter or longer, with different nucleotide statistics. These genes tend to escape detection when a typical gene model is used.
Gene finding programs in prokaryotes
• The programs are based on HMMs or IMMs (interpolated Markov models).
▪ GeneMark.hmm (microbial genomes)
▪ Glimmer (UNIX program from TIGR). Computation involves two steps, viz. model building & gene prediction.
▪ FGENESB (bacterial sequences). It uses the Viterbi algorithm & linear discriminant analysis (LDA).
▪ RBSfinder: searches for the ribosomal binding site or Shine-Dalgarno sequence to predict the translation initiation site.
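To show what the Viterbi algorithm mentioned above does, here is a toy two-state HMM in Python. It is not the model used by FGENESB or any other program listed: the states ("C" for coding, "N" for non-coding) and every probability in the `TOY_*` tables are invented for the example.

```python
import math

# Hypothetical toy parameters: coding regions are GC-rich, non-coding AT-rich.
TOY_STATES = ("N", "C")
TOY_START = {"N": 0.5, "C": 0.5}
TOY_TRANS = {"N": {"N": 0.9, "C": 0.1}, "C": {"N": 0.1, "C": 0.9}}
TOY_EMIT = {
    "N": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "C": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def viterbi(obs, states=TOY_STATES, start_p=TOY_START,
            trans_p=TOY_TRANS, emit_p=TOY_EMIT):
    """Return the most probable state path for `obs`, computed in log space."""
    scores = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
              for s in states}
    backpointers = []
    for symbol in obs[1:]:
        new_scores, pointers = {}, {}
        for s in states:
            # Best previous state for arriving in state s.
            prev = max(states, key=lambda p: scores[p] + math.log(trans_p[p][s]))
            new_scores[s] = (scores[prev] + math.log(trans_p[prev][s])
                             + math.log(emit_p[s][symbol]))
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))
```

With these toy numbers, `viterbi("ATATGCGCGC")` labels the AT-rich prefix as non-coding and the GC-rich suffix as coding: `"NNNNCCCCCC"`.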
Sensitivity: Ability to include correct predictions. It is the fraction of known genes correctly predicted.
Specificity: Ability to exclude incorrect predictions. It is the fraction of predicted genes that correspond to true genes.
▪ Both are proportions of true signals.
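The two definitions above translate directly into counts of true positives (TP), false negatives (FN) and false positives (FP): sensitivity = TP / (TP + FN) and specificity = TP / (TP + FP). Note that specificity as defined here is what other fields call precision. The sketch below assumes genes are compared by identifier; the function name and the gene labels in the usage note are made up for the example.

```python
def prediction_accuracy(known, predicted):
    """Return (sensitivity, specificity) as defined for gene prediction."""
    known, predicted = set(known), set(predicted)
    tp = len(known & predicted)   # known genes that were predicted
    fn = len(known - predicted)   # known genes that were missed
    fp = len(predicted - known)   # predictions that match no known gene
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity
```

For known genes `{"g1", "g2", "g3", "g4"}` and predictions `{"g1", "g2", "g5"}`, sensitivity is 2/4 = 0.5 and specificity is 2/3.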
Eukaryotic gene prediction
• Genomes are much larger than those of prokaryotes (10 Mbp to 670 Gbp).
• Low gene density.
• Space between genes is very large and rich in repetitive sequences & transposable elements.
• Genes are split by intervening noncoding sequences (introns), which are removed so that the coding sequences (exons) can be joined.
• Splice junctions follow the GT-AG rule.
• An intron has the consensus motif GTAAGT at the 5’ splice junction and NCAG at the 3’ end.
• Genes have a high density of CG dinucleotides near the transcription start site. This region is called a CpG island. It helps to identify the transcription initiation site of a eukaryotic gene.
• Some post-transcriptional modifications occur for the transcript to become mature mRNA, viz. capping, splicing and polyadenylation.
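One common way to look for CpG islands is to slide a window along the sequence and compute its GC content and the ratio of observed to expected CG dinucleotides. The sketch below assumes the often-quoted thresholds of GC content above 50% and an observed/expected ratio above 0.6 in a 200 bp window; these values and the function names are not from the slides and should be treated as adjustable assumptions.

```python
def cpg_stats(window):
    """Return (GC fraction, observed/expected CpG ratio) for one window."""
    w = window.upper()
    n = len(w)
    c, g = w.count("C"), w.count("G")
    cpg = w.count("CG")
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count under independence is c * g / n.
    obs_exp = cpg * n / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def cpg_windows(seq, size=200, step=1, min_gc=0.5, min_obs_exp=0.6):
    """Yield start positions of windows that pass both thresholds."""
    for i in range(0, len(seq) - size + 1, step):
        gc_fraction, obs_exp = cpg_stats(seq[i:i + size])
        if gc_fraction > min_gc and obs_exp > min_obs_exp:
            yield i
```

For example, `cpg_stats("CGCGCGCG")` gives `(1.0, 2.0)`, while an AT-only window gives `(0.0, 0.0)`.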
[Figure: exon 1 and exon 2 separated by an intron that begins with GT (donor site) and ends with AG (acceptor site)]
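The GT-AG rule can be used to enumerate candidate introns: every GT is a possible donor site and every downstream AG a possible acceptor site. The sketch below only lists candidates; the function name and the length limits are assumptions for illustration, and a real gene finder would score each candidate with the fuller consensus motifs and content statistics.

```python
def candidate_introns(seq, min_len=4, max_len=None):
    """Return (start, end) pairs where seq[start:end] begins with GT and ends with AG."""
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    candidates = []
    for d in donors:
        for a in acceptors:
            end = a + 2          # exclusive end, includes the AG
            length = end - d
            if length >= min_len and (max_len is None or length <= max_len):
                candidates.append((d, end))
    return candidates
```

For the toy sequence `"AAGTCCCAGTT"` the only candidate is `(2, 9)`, i.e. the stretch `"GTCCCAG"`. Because GT and AG are so frequent, the number of candidates grows quickly with sequence length, which is why the rule alone is not enough to predict introns.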
o CAPPING: Occurs at the 5’ end of the transcript. It involves methylation at the initial residue of the RNA.
o SPLICING: Process of removal of introns and joining of exons. It involves a large RNA-protein complex called the spliceosome.
o POLYADENYLATION: Addition of a stretch of As (~250) at the 3’ end of the RNA. The process is accomplished by poly-A polymerase.
Gene finding programs in eukaryotes
• Three categories of algorithms
▪ Ab initio-based:
It joins the exons in the correct order. It relies on two kinds of features:
a) Gene signals: small patterns within the genomic DNA, including putative splice sites, start and stop sites of transcription or translation, branch points, transcription factor binding sites and recognizable consensus sequences.
b) Gene content: statistical properties of a region of genomic DNA, including nucleotide and amino acid distribution, synonymous codon usage and hexamer frequencies.
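Gene content measures such as codon usage and hexamer frequencies are simple counts over a sequence. The following sketch shows one way to compute them in Python; the function names are assumptions, and real programs compare such frequencies between known coding and non-coding training sets rather than looking at a single sequence.

```python
from collections import Counter


def kmer_frequencies(seq, k=6, step=1):
    """Return the relative frequency of each k-mer (hexamers by default)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(0, len(seq) - k + 1, step))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def codon_usage(cds):
    """Relative codon frequencies of an in-frame coding sequence."""
    return kmer_frequencies(cds, k=3, step=3)
```

For example, `codon_usage("ATGAAAAAATAA")` gives `{"ATG": 0.25, "AAA": 0.5, "TAA": 0.25}`.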
▪ Neural network-based algorithms
- Composed of a network of mathematical variables.
- Multiple layers: input, output and hidden layers.
- GRAIL (splice junctions, start and stop codons, poly-A sites, promoters and CpG islands). It scans the query sequence with windows of variable lengths & scores them.
▪ Discriminant analysis
- Linear Discriminant Analysis (LDA) plots a 2D graph of coding signals vs. all possible 3’ splice site positions and separates them with a diagonal line.
- Quadratic Discriminant Analysis (QDA) uses a quadratic function, i.e. a curved line.
- FGENES (LDA)
- FGENESH [Find Genes] (HMMs)
- FGENESH_C (similarity-based)
- FGENESH+ (combination of ab initio & similarity-based)
- MZEF [Michael Zhang’s Exon Finder] (QDA)
▪ HMMs
- GENSCAN (fifth-order HMMs); combination of hexamer frequencies with coding signals; probability score P > 0.5
- HMMgene (conditional maximum likelihood); combination of ab initio & homology-based algorithms
▪ Homology-based:
Exon structures and sequences of related species are highly conserved.
Comparison with homologous sequences derived from cDNA or Expressed Sequence Tags (ESTs).
- GenomeScan (combination of GENSCAN prediction results with BLASTX similarity searches)
- EST2Genome (intron-exon boundaries); comparison of an EST sequence with a genomic DNA sequence
- SGP-1 [Syntenic Gene Prediction] (similar to EST2Genome)
- TwinScan (gene-finding server; similar to GenomeScan)
▪ Consensus-based:
Combination of the results of multiple programs based on consensus.
Improves specificity by correcting false positives & the problem of overprediction.
The cost is lowered sensitivity & missed predictions.
- GeneComber (combination of HMMgene & GenScan prediction results)
- DIGIT (combination of FGENESH, GENSCAN & HMMgene)
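The trade-off behind consensus-based prediction can be illustrated with a simple per-base vote. This is not how GeneComber or DIGIT combine their inputs; it is a hypothetical sketch in which each program's output is a string with "E" for an exon base and "-" for anything else.

```python
def consensus_calls(predictions, min_votes):
    """Keep a base as exon ("E") only if at least `min_votes` programs agree."""
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must cover the same sequence")
    return "".join(
        "E" if sum(p[i] == "E" for p in predictions) >= min_votes else "-"
        for i in range(length)
    )
```

With the three toy predictions `"EEE---EEE"`, `"EE----EEE"` and `"-EE---E--"`, requiring two votes gives `"EEE---EEE"`, while requiring all three gives `"-E----E--"`. A stricter threshold removes more false positives but also drops bases that only some programs found, which is the lowered sensitivity noted above.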
GENE EXPRESSION
Two steps are required
1. Transcription
The synthesis of mRNA using the gene on the DNA molecule as a template.
This happens in the nucleus of eukaryotes.
2. Translation
The synthesis of a polypeptide chain using the genetic code on the mRNA molecule as its guide.
Types of RNA
Messenger RNA (mRNA): <5%
Ribosomal RNA (rRNA): up to 80%
Transfer RNA (tRNA): about 15%
In eukaryotes: small nuclear RNA (snRNA), found in the small nuclear ribonucleoproteins (snRNPs) that make up spliceosomes
Structural characteristics of RNA molecules
Single polynucleotide strand which may be looped or coiled (not a double helix)
Sugar: ribose (not deoxyribose)
Bases used: adenine, guanine, cytosine and uracil (not thymine)
Transcription: The synthesis of a strand of mRNA (and other RNAs)
Uses the enzyme RNA polymerase
Proceeds in the same direction as replication (5’ to 3’)
Forms a complementary strand of mRNA
It begins at a promoter site, which signals that the beginning of the gene is near (about 20 to 30 nucleotides away)
After the end of the gene is reached, there is a terminator sequence that tells RNA polymerase to stop transcribing
NB: Terminator sequence ≠ terminator codon
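The base-pairing step of transcription can be written as a one-line mapping. The sketch below assumes the template strand is given in the 3’ to 5’ direction, so the returned mRNA reads 5’ to 3’; the function name is made up for the example, and promoters and terminators are ignored.

```python
# Template base -> complementary RNA base (A pairs with U in RNA).
DNA_TO_RNA = str.maketrans("ACGT", "UGCA")


def transcribe(template_3_to_5):
    """Return the mRNA (5' to 3') complementary to a template strand read 3' to 5'."""
    return template_3_to_5.upper().translate(DNA_TO_RNA)
```

For example, `transcribe("TACGGT")` returns `"AUGCCA"`.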
[Figure: RNA polymerase]
Editing the mRNA
In prokaryotes, transcribed mRNA goes straight to the ribosomes in the cytoplasm.
In eukaryotes, freshly transcribed mRNA in the nucleus is about 5000 nucleotides long.
When the same mRNA is used for translation at the ribosome it is only 1000 nucleotides long.
The mRNA has been edited.
The parts which are kept for gene expression are called EXONS (exons = expressed).
The parts which are edited out (by spliceosomes) are called INTRONS.
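The editing step can be mimicked by cutting a transcript at known exon boundaries and joining the pieces. In this sketch the exon coordinates are supplied by hand, which is an assumption for illustration; finding those boundaries is exactly what the gene prediction programs above try to do.

```python
def splice(pre_mrna, exons):
    """Join the exon intervals (start, end), end exclusive, in the given order."""
    return "".join(pre_mrna[start:end] for start, end in exons)
```

For the toy transcript `"AUGGUAAGUCCAGCCAUAA"` with exons `[(0, 3), (13, 19)]`, the intron `"GUAAGUCCAG"` is removed and the mature mRNA is `"AUGCCAUAA"`.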
Translation
[Figure: TRANSLATION, with labels: start codon, stop codon, ribosomes, polypeptide chain, complete protein]
© 2016 Paul Billiet ODWS
Translation
▪ Location: the ribosomes in the cytoplasm, which provide the environment for translation
▪ The genetic code is brought by the mRNA molecule.
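Reading the genetic code from an mRNA can be sketched as a table lookup, three bases at a time, from the first AUG to the first in-frame stop codon. The code below builds the standard codon table from a compact 64-letter string; the function name is an assumption, and the sketch expects only the letters A, C, G and U.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

For example, `translate("GGAUGCCAUAAGG")` returns `"MP"`: AUG codes for methionine, CCA for proline, and UAA stops translation.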
An important discovery
▪ Retroviruses (e.g. HIV) carry RNA as their genetic information
▪ When they invade their host cell they convert their RNA into a DNA copy using reverse transcriptase
▪ Thus the central dogma is modified: DNA ↔ RNA → Protein
▪ This has helped to explain an important paradox in the evolution of life.
[Figure: reverse transcriptase]
The paradox of DNA
▪ DNA is a very stable molecule
▪ It is a good medium for storing genetic material, but…
▪ DNA can do nothing for itself
▪ It requires enzymes for replication
▪ It requires enzymes for gene expression
▪ The information in DNA is required to synthesise enzymes (proteins), but enzymes are required to make DNA function
▪ Which came first in the origin of life, DNA or enzymes?
RIBOZYMES: Both genetic and catalytic
▪ Certain forms of RNA have catalytic properties: RIBOZYMES
▪ Ribosomes and spliceosomes are ribozymes
▪ RNA could have been the first genetic information synthesizing proteins…
▪ …and at the same time a biocatalyst
▪ Reverse transcriptase provides the possibility of producing DNA copies from RNA.
The ribosome: a ribozyme
Gene prediction and expression

  • 2. Gene: • Asequence of nucleotides coding for protein. CentralDogma: • Proposed in 1958 by Francis Crick. • Hepostulated that all possibleinformation transferred, are not viable. • Hepublished apaper in 1970. CODONS: • Discovered by Sydney Brenner and Francis Crickin 1961. • In every triplet of nucleotides, each codoncodesfor one amino acid in aprotein.
  • 3. DNA RNA PROTEIN PHENOTYPE 2 4 cDNA 1 3 1. TRANSCRIPTION 2. TRANSLATION 3. GENE EXPRESSION 4. REVERSETRANSCRIPTION
  • 4. DEfiniTION • It is aprerequisite for detailed functionalannotation of genesand genomes. • It candetect location of ORFs(Open Reading Frames), structures of introns andexons. • It describes all the genescomputationally withnear 100% accuracy. • It canreduce the amount ofexperimental verification work required.
  • 5. TYPES • Abinitio- gene signals, intron splice, transcription factor binding site, ribosomal binding site, poly- adenylation site, triplet codon structure and gene content. • Homology- significant matches of query sequence with sequence of knowngenes. • Probabilistic models like Markov model or Hidden Markov Models (HMMs). Abinitio-based Homology- based
  • 6. Translation Protein Splicing mRNA Cap- -Poly(A) Transcription pre-mRNA Cap- -Poly(A) Genomic DNA Stop codon GT AG exon intron Splice sites Donor site Acceptor site SEQUENCE SIGNALS Start codon Exonsare usually shorter thanintrons.
  • 7. Prokaryoticgene prediction • Geneprediction is easier in microbialgenomes. • Smaller genomes, high gene density, very few repetitive sequence, more sequenced genomes. • Start codon is ATG. • Ribosomal binding site/Shine Dalgarno sequence.
  • 8. Openreadingframes • A sequence defined by in-frame start and stop codon, which in turn defines aputative amino acid sequence. • Agenome of length n is comprised of (n/3)codons. • Stop codons break genome into segments between consecutive stop codons. • Thesub-segments of these that start from the Start codon (ATG)areORFs. • DNA is translated in all six possible frames, three frames forward and three reverse. ATG TGA Genomic Sequence Open reading frame
  • 10. Probabilisticmodels • Statistical description of agene. • Markov Models &Hidden Markov Models. • Usedto distinguish oligonucleotide distributions in the coding regions from those for non-coding regions. • Probability of distribution of nucleotides inDNA sequence depends on the order k. • Typesof order- zero,first and second. • Order , gene canpredicted more accurately.
  • 11. Genecontent and length distribution of prokaryotic genes TYPICAL ATYPICAL Ranges from100 to 500amino acids with a nucleotide distribution typical ofthe organism. Shorter or longer with different nucleotidestatistics. Genes tend toescape detection when typical gene modelis used.
  • 12. Genefindingprogramsin prokaryotes • Theprograms are based on HMM/IMM.  GeneMark.hmm (microbial genomes)  Glimmer (UNIX program from TIGR). Computation involves two steps viz. model building & gene prediction.  FGENESB (bacterial sequences). It uses Vertibi algorithm & linear discriminant analysis(LDA).  RBSfinder- Searches from ribosomal binding site or shine dalgarno sequence for prediction of translation initiation site.
  • 13. Sensitivity Ability to include correct predictions. It is the fraction of known genescorrectlypredicted. Specificity Ability to exclude incorrect predictions. It is the fraction of predicted genes that correspond to true genes.  Both are the proportion of true signals.
  • 14. Eukaryotic gene prediction • Genomes are much larger than those of prokaryotes (10 Mbp to 670 Gbp). • Low gene density. • Space between genes is very large and rich in repetitive sequences & transposable elements. • Splitting of genes by intervening noncoding sequences (introns) and joining of coding sequences (exons).
  • 15. • Splice junctions follow the GT-AG rule. • An intron has the consensus motif GTAAGT at its 5’ splice junction and NCAG at its 3’ end. • Genes have a high density of CG dinucleotides near the transcription start site. This region is called a CpG island. It helps to identify the transcription initiation site of a eukaryotic gene. • Some post-transcriptional modifications occur before the transcript becomes mature mRNA, viz. capping, splicing and polyadenylation. [Figure: exon 1 and exon 2 flanking an intron, with the donor site (GT) and acceptor site (AG) marked]
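The two sequence signals on this slide can be illustrated with a short sketch. The first function lists every span that starts with GT and ends with AG, which shows why the GT-AG rule alone over-predicts introns. The second computes the widely used observed/expected CpG ratio, count(CG) × N / (count(C) × count(G)); that formula is a common heuristic brought in here for illustration and is not stated on the slide. Function names and the `min_len` cutoff are assumptions.

```python
import re

def candidate_introns(seq, min_len=6):
    """Return (start, end) spans that begin with GT and end with AG (0-based, end exclusive)."""
    seq = seq.upper()
    donors = [m.start() for m in re.finditer("(?=GT)", seq)]
    acceptors = [m.start() + 2 for m in re.finditer("(?=AG)", seq)]
    return [(d, a) for d in donors for a in acceptors if a - d >= min_len]

def cpg_obs_exp(seq):
    """Observed/expected CpG ratio over the whole sequence."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)

print(candidate_introns("AAGTAAGTCCCAGTT"))
print(cpg_obs_exp("CGCGCGCG"))
```

Because GT and AG dinucleotides are so frequent, real programs score the longer consensus motifs and combine them with gene content statistics rather than relying on the dinucleotides alone.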
  • 16. o CAPPING: Occurs at the 5’ end of the transcript. It involves methylation at the initial residue of the RNA. o SPLICING: Process of removal of introns and joining of exons. It involves a large RNA-protein complex called the spliceosome. o POLYADENYLATION: Addition of a stretch of As (~250) at the 3’ end of the RNA. The process is accomplished by poly-A polymerase.
  • 17. Gene finding programs in eukaryotes • Three categories of algorithms ➢ Ab initio based: joins the exons in the correct order. Two signals -> a) Gene signals: a small pattern within the genomic DNA, including putative splice sites, start and stop sites of transcription or translation, branch points, transcription factor binding sites, recognizable consensus sequences. b) Gene content: a region of genomic DNA, including nucleotide and amino acid distribution, synonymous codon usage and hexamer frequencies.
  • 18. ✓ Neural network based algorithm - Composed of a network of mathematical variables. - Multiple layers: input, output and hidden layers. - GRAIL (splice junctions, start and stop codons, poly-A sites, promoters and CpG islands). It scans the query sequence with windows of variable lengths and scores them for coding potential. ✓ Discriminant analysis - Linear Discriminant Analysis (LDA) represents a 2D graph of coding signals vs. all possible 3’ splice site positions; a diagonal line. - Quadratic Discriminant Analysis (QDA) represents a quadratic function; a curved line. - FGENES (LDA)
  • 19. - FGENESH [Find Genes] (HMMs) - FGENESH_C (similarity based) - FGENESH+ (combination of ab initio & similarity based) - MZEF [Michael Zhang’s Exon Finder] (QDA) ✓ HMMs - GENSCAN (fifth-order HMMs); combination of hexamer frequencies with coding signals; probability score P > 0.5 - HMMgene (Conditional Maximum Likelihood); combination of ab initio & homology-based algorithms
  • 20. ➢ Homology-based: Exon structures and sequences of related species are highly conserved. Comparison of homologous sequences derived from cDNA or Expressed Sequence Tags (ESTs). - GenomeScan (combination of GENSCAN prediction results with BLASTX similarity searches) - EST2Genome (intron-exon boundaries); comparison of an EST sequence with a genomic DNA sequence - SGP-1 [Syntenic Gene Prediction] (similar to EST2Genome) - TwinScan (gene-finding server; similar to GenomeScan)
  • 21. ➢ Consensus-based: Combination of results of multiple programs based on consensus. Improves specificity by correcting false positives & the problem of overprediction, but can lower sensitivity & lead to missed predictions. - GeneComber (combination of HMMgene & GenScan prediction results) - DIGIT (combination of FGENESH, GENSCAN & HMMgene)
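The consensus idea can be illustrated with a simple vote over exon predictions. This is only a sketch under strong assumptions: exons are (start, end) pairs, programs must agree on exact coordinates, and the program names and data are hypothetical. It does not reproduce how GeneComber or DIGIT actually combine results.

```python
from collections import Counter

def consensus_exons(predictions, min_votes=2):
    """Keep exons reported by at least `min_votes` programs (exact-coordinate voting)."""
    votes = Counter(exon for exons in predictions.values() for exon in set(exons))
    return sorted(exon for exon, n in votes.items() if n >= min_votes)

# Hypothetical outputs from three gene finders.
predictions = {
    "program_a": [(10, 50), (80, 120)],
    "program_b": [(10, 50), (200, 260)],
    "program_c": [(10, 50), (80, 120), (300, 340)],
}
print(consensus_exons(predictions))
```

The exons at (200, 260) and (300, 340) are dropped because only one program reports each. That is the trade-off named on the slide: fewer false positives, but a genuinely novel exon found by a single program is also lost.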
  • 22. GENE EXPRESSION Two steps are required 1. Transcription: The synthesis of mRNA using the gene on the DNA molecule as a template. This happens in the nucleus of eukaryotes. 2. Translation: The synthesis of a polypeptide chain using the genetic code on the mRNA molecule as its guide.
  • 23. Types of RNA • Messenger RNA (mRNA): <5% • Ribosomal RNA (rRNA): up to 80% • Transfer RNA (tRNA): about 15% • In eukaryotes, small nuclear ribonucleoproteins (snRNPs, components of spliceosomes) Structural characteristics of RNA molecules • Single polynucleotide strand, which may be looped or coiled (not a double helix) • Sugar: ribose (not deoxyribose) • Bases used: adenine, guanine, cytosine and uracil (not thymine)
  • 24. Transcription: The synthesis of a strand of mRNA (and other RNAs) • Uses the enzyme RNA polymerase • Proceeds in the same direction as replication (5’ to 3’) • Forms a complementary strand of mRNA • It begins at a promoter site, which signals that the beginning of the gene is near (about 20 to 30 nucleotides away) • After the end of the gene is reached, there is a terminator sequence that tells RNA polymerase to stop transcribing • NB: terminator sequence ≠ terminator codon [Figure: RNA polymerase]
  • 25. Editing the mRNA • In prokaryotes, transcribed mRNA goes straight to the ribosomes in the cytoplasm • In eukaryotes, freshly transcribed mRNA in the nucleus may be about 5000 nucleotides long • When the same mRNA is used for translation at the ribosome, it may be only about 1000 nucleotides long • The mRNA has been edited • The parts which are kept for gene expression are called EXONS (exons = expressed) • The parts which are edited out (by spliceosomes) are called INTRONS.
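The editing step can be pictured with a small sketch: transcribe a toy gene, then cut out one intron to leave the joined exons. The sequence, the intron coordinates and the function names are all made up for illustration, and the intron position is supplied by hand rather than predicted.

```python
def transcribe(coding_strand):
    """The mRNA carries the coding-strand sequence with U in place of T."""
    return coding_strand.upper().replace("T", "U")

def splice(pre_mrna, introns):
    """Remove introns given as (start, end) spans (0-based, end exclusive) and join the exons."""
    exons, pos = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])
        pos = end
    exons.append(pre_mrna[pos:])
    return "".join(exons)

# Toy gene: exon "ATGGCC", intron "GTAAGTTTCAG" (GT...AG), exon "GAATAA".
pre_mrna = transcribe("ATGGCCGTAAGTTTCAGGAATAA")
mature = splice(pre_mrna, [(6, 17)])
print(pre_mrna, len(pre_mrna))
print(mature, len(mature))
```

The mature transcript is shorter than the primary transcript by exactly the intron length, which is the same effect as the 5000 to 1000 nucleotide example on the slide, at a much smaller scale.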
  • 27. Translation ▪ Location: the ribosomes in the cytoplasm, which provide the environment for translation ▪ The genetic code is brought by the mRNA molecule. © 2016 Paul Billiet ODWS
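As a rough illustration of how the genetic code on the mRNA guides translation, the sketch below builds the standard codon table and reads codons from the first AUG until a stop codon. The function name is an assumption, and the sketch ignores everything a real ribosome needs (ribosome binding, tRNAs, initiation context).

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order at each position.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon ('*')."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE.get(mrna[i:i + 3], "X")  # 'X' marks a codon with unknown bases
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(translate("AUGGCCGAAUAA"))
```

For the toy transcript AUG GCC GAA UAA this gives Met-Ala-Glu, written in one-letter code.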
  • 28. An important discovery ▪ Retroviruses (e.g. HIV) carry RNA as their genetic information ▪ When they invade their host cell they convert their RNA into a DNA copy using reverse transcriptase ▪ Thus the central dogma is modified: DNA ↔ RNA → Protein ▪ This has helped to explain an important paradox in the evolution of life. [Figure: reverse transcriptase]
  • 29. The paradox of DNA ▪ DNA is a very stable molecule ▪ It is a good medium for storing genetic material but… ▪ DNA can do nothing for itself ▪ It requires enzymes for replication ▪ It requires enzymes for gene expression ▪ The information in DNA is required to synthesise enzymes (proteins), but enzymes are required to make DNA function ▪ Which came first in the origin of life: DNA or enzymes?
  • 30. RIBOZYMES: Both genetic and catalytic ▪ Certain forms of RNA have catalytic properties ▪ RIBOZYMES ▪ Ribosomes and spliceosomes are ribozymes ▪ RNA could have been the first genetic information synthesizing proteins… ▪ …and at the same time a biocatalyst ▪ Reverse transcriptase provides the possibility of producing DNA copies from RNA.
  • 31. The ribosome: a ribozyme