Molecular Data Analysis
BIT-757, 3(3-0)
Dr. Farrukh Azeem
MS Biotechnology 1st semester 2015-17
Gene and Regulatory Element Prediction
Levels of Gene Regulation
Strategies used in Gene Prediction Programs
Homology based
• Based on comparison with other sequences
Ab-initio based
• Based on the given sequence alone
• Uses two major features
a) Gene signals
- Start and stop codons
- Transcription factor binding sites
- Ribosomal binding sites
- Polyadenylation (poly-A) sites
b) Gene content
- Nucleotide composition
- Sequence patterns in coding and non-coding regions
- GC content, etc.
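
As a small illustration of a gene-content feature, the sketch below computes GC content for a whole sequence and in sliding windows. The function names, window size and step are illustrative assumptions and are not taken from the slides or from any gene prediction program.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def sliding_gc(seq: str, window: int = 6, step: int = 3):
    """List of (start, GC fraction) for each full window along seq.

    window and step are hypothetical defaults chosen for illustration.
    """
    return [
        (i, gc_content(seq[i:i + window]))
        for i in range(0, len(seq) - window + 1, step)
    ]
```

Plotting the second element of each pair against the first gives a simple compositional profile along the sequence.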
Gene Prediction in Prokaryotes
Characteristics of Genome
- Small size (0.5 to 10 Mbp)
- High gene density: about 90% of the genome is coding sequence
- Few repetitive sequences
- No introns
- Start codon: usually ATG, occasionally GTG or TTG
- Shine-Dalgarno sequence: a purine-rich sequence (consensus AGGAGGT) complementary to the 3' end of the 16S rRNA in the ribosome; located downstream of the TSS and upstream of the translation initiation codon
- Three possible stop codons (TAA, TAG, TGA)
- Operons: genes transcribed as one unit, followed by a transcription terminator (rho-independent terminator) that ends in a stretch of Ts
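
The Shine-Dalgarno signal can be checked with a simple motif scan. The sketch below looks for the best ungapped match to the core AGGAGG of the AGGAGGT consensus in a window upstream of a candidate start codon. The 20-nucleotide window, the use of the six-base core, and the function name are assumptions made for illustration, not values from the slides.

```python
SD_CORE = "AGGAGG"  # core of the AGGAGGT consensus; using only the core is an assumption


def best_sd_match(seq: str, start: int, window: int = 20, motif: str = SD_CORE):
    """Best ungapped motif match in the `window` bases upstream of `start`.

    Returns (position, number_of_matching_bases), or None when the upstream
    region is shorter than the motif. `start` is the 0-based index of the
    first base of the candidate start codon.
    """
    seq = seq.upper()
    lo = max(0, start - window)
    best = None
    for i in range(lo, start - len(motif) + 1):
        score = sum(a == b for a, b in zip(seq[i:i + len(motif)], motif))
        if best is None or score > best[1]:
            best = (i, score)
    return best
```

A candidate start codon with a high-scoring match a few bases upstream is better supported than one without; any score cutoff would have to be chosen for the organism at hand.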
Gene Prediction in Prokaryotes
Conventional Method
a) Identification of ORFs and major signals related to prokaryotic genes
- Conceptual translation in all six reading frames
- Identification of a frame longer than 30 codons without a stop codon; by chance a stop codon is expected about once every 20 codons (3 of the 64 codons are stops)
- Confirmation of signals such as the Shine-Dalgarno sequence
- Sequence similarity searching with BLAST
b) Codon bias: the third codon position is preferentially G or C, so coding regions have a higher GC content
c) TESTCODE: the nucleotide at the third codon position tends to repeat, so plotting this repetition pattern differentiates coding from non-coding regions
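
The six-frame scan in step a) can be sketched as follows. This is a minimal illustration that reports stop-free stretches of at least 30 codons on both strands; it does not look for start codons or other signals, and the function names and the returned tuple layout are assumptions.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def open_frames(seq: str, min_codons: int = 30):
    """Yield (strand, frame, start, end, n_codons) for stop-free stretches.

    Coordinates are 0-based, end-exclusive, and relative to the strand that
    was scanned (the reverse complement for strand "-").
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            run_start = frame
            for i in range(frame, len(s) - 2, 3):
                if s[i:i + 3] in STOPS:
                    n = (i - run_start) // 3
                    if n >= min_codons:
                        yield (strand, frame, run_start, i, n)
                    run_start = i + 3
            # Stretch that runs to the end of the sequence without a stop.
            last = frame + ((len(s) - frame) // 3) * 3
            n = (last - run_start) // 3
            if n >= min_codons:
                yield (strand, frame, run_start, last, n)
```

Each reported stretch is only a candidate; the slide's remaining steps (signal confirmation and similarity search) would still be needed before calling it a gene.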
Gene Prediction in Prokaryotes
Non-conventional Method
A Markov model describes the probability of the distribution of nucleotides in a DNA sequence, in which the conditional probability of the nucleotide at a particular position depends on the k previous positions (a kth-order model).
Based on Markov models and hidden Markov models (HMMs)
- Zero order: every position is independent
- First order: a position depends on the previous position
- Second order: a position depends on the preceding two positions
Based upon gene content and gene length
A prokaryotic gene can be typical (100 to 500 amino acids) or atypical (shorter or longer), so HMMs were developed to describe both types.
Tools based on Markov models and HMMs: GeneMark, Glimmer, FGENESB
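
To make the order-k idea concrete, the sketch below trains a fixed-order Markov chain from example sequences and scores a query by the log-odds between a "coding" and a "non-coding" model. It is a toy illustration under stated assumptions (add-one pseudocounts, A/C/G/T alphabet only, hypothetical function names) and is not how GeneMark, Glimmer or FGENESB are implemented.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov(seqs, k: int, pseudocount: float = 1.0):
    """Return prob(context, base) estimated from seqs for an order-k chain.

    The pseudocount is an assumption that keeps unseen transitions from
    getting probability zero.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        s = s.upper()
        for i in range(k, len(s)):
            counts[s[i - k:i]][s[i]] += 1

    def prob(context: str, base: str) -> float:
        row = counts.get(context, {})
        total = sum(row.values()) + pseudocount * len(ALPHABET)
        return (row.get(base, 0.0) + pseudocount) / total

    return prob


def log_likelihood(seq: str, prob, k: int) -> float:
    """Sum of log P(base | k previous bases) over positions k..end."""
    seq = seq.upper()
    return sum(math.log(prob(seq[i - k:i], seq[i])) for i in range(k, len(seq)))


def log_odds(seq: str, coding, noncoding, k: int) -> float:
    """Positive values favour the coding model, negative the non-coding one."""
    return log_likelihood(seq, coding, k) - log_likelihood(seq, noncoding, k)
```

With k = 0 the context is the empty string and the model reduces to plain base frequencies; raising k captures longer-range dependence but needs more training data per context.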
Gene Prediction in Eukaryotes
Characteristics of Genome
- Large genomes (10 Mbp to 670 Gbp)
- Low gene density: only about 3% of the human genome is coding sequence
- Abundant repetitive sequences between coding regions
- Introns and exons
- Start codon: ATG (GTG and TTG are rare alternatives)
- RNA processing: 5' capping, splicing, polyadenylation
- GT-AG rule for intron-exon prediction: introns start with GT and end with AG
- Kozak sequence flanking the ATG start codon
- High density of CG dinucleotides near the TSS, called a CpG island
- Three possible stop codons
- Poly-A signal
- High frequency of certain hexamers in coding regions
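
One of these signals, the CpG island, is often screened for with the observed/expected CpG ratio. The sketch below computes GC fraction and that ratio for a sequence. The thresholds in the helper (length at least 200 bp, GC above 50%, ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria, which are an assumption here and do not appear in the slides; the function names are also illustrative.

```python
def cpg_stats(seq: str):
    """Return (GC fraction, observed/expected CpG ratio) for seq.

    Observed/expected = (count of CG * length) / (count of C * count of G).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    ratio = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, ratio


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC and ratio thresholds to one sequence."""
    gc_fraction, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and ratio > min_ratio
```

In practice the test would be applied to sliding windows along a genomic region rather than to one fixed sequence.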
Ab Initio based
- First objective is to discriminate exons from introns
- Uses two major features
a) Gene signals
- Start and stop codons
- Intron splice signals
- Transcription factor binding sites
- Polyadenylation (poly-A) sites
b) Gene content
- Nucleotide composition
- Sequence patterns in coding and non-coding regions
- GC content, etc.
- Hexamer frequencies
Tools: GRAIL, MZEF, FGENES, HMMgene
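
The hexamer-frequency feature can be sketched as a log-odds score between hexamer frequencies learned from known coding and known non-coding sequences. This is a simplified illustration with assumed names and add-one pseudocounts; it is not the scoring scheme of GRAIL, MZEF, FGENES or HMMgene.

```python
import math
from collections import Counter


def hexamer_freqs(seqs, pseudocount: float = 1.0):
    """Return freq(hexamer) estimated from overlapping hexamers in seqs.

    The pseudocount (an assumption) is spread over all 4**6 possible hexamers.
    """
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 5):
            counts[s[i:i + 6]] += 1
    total = sum(counts.values()) + pseudocount * 4 ** 6
    return lambda h: (counts.get(h, 0) + pseudocount) / total


def hexamer_score(seq: str, coding, noncoding) -> float:
    """Mean log-odds per hexamer; positive values look more coding-like."""
    seq = seq.upper()
    n = len(seq) - 5
    if n <= 0:
        return 0.0
    return sum(
        math.log(coding(seq[i:i + 6]) / noncoding(seq[i:i + 6]))
        for i in range(n)
    ) / n
```

Averaging over the number of hexamers keeps scores comparable between candidate exons of different lengths.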
Strategies used in Gene Prediction Programs
Homology based
- Based on homology of exon structure, exon sequence and exon-intron patterns
- Tools: GenomeScan, EST2Genome
Consensus based
- Tools: GeneComber, DIGIT

Gene Prediction

Editor's Notes

  1. Prokaryotic promoters contain two short DNA sequences located at the -10 (10 bp 5', or upstream) and -35 positions relative to the transcription start site (TSS). The Pribnow box (TATAAT), the prokaryotic equivalent of the eukaryotic TATA box, is located at the -10 position and is essential for transcription initiation. The -35 element typically consists of the sequence TTGACA and controls the rate of transcription. Prokaryotic cells contain sigma factors, which help RNA polymerase bind to the promoter region. Each sigma factor recognizes different core promoter sequences.
  2. About 1 gene per 100 kb. Nearly 250 As are added at the 3' end (poly-A tail).