An Introduction to Bioinformatics
Finding genes in prokaryotes
Usually the primary challenge that follows the sequencing of
anything from a small segment of DNA to a complete genome
is to establish the location of functional elements such as:
genes (intron/exon boundaries)
promoters,
terminators etc
DNA sequences that may potentially encode proteins are called
Open Reading Frames (ORFs)
The situation in prokaryotes is relatively straightforward since
scarcely any eubacterial and archaeal genes contain introns
FINDING ORFs
The simplest method in prokaryotes is to scan the DNA for
start and stop codons
The DNA is double stranded and each strand has three
potential reading frames (codons are groups of 3 bases)
THE CAT ATE THE RAT Frame 1
T HEC ATA TET HER AT Frame 2
TH ECA TAT ETH ERA T Frame 3
The scan must look at all 6 reading frames
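The six frames can be enumerated with a short Python sketch. The helper names (`reverse_complement`, `codons`, `six_frames`) are illustrative only and assume an uppercase DNA string containing just A, C, G and T.

```python
# Illustrative helpers, not taken from any library.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def codons(dna: str, offset: int) -> list[str]:
    """Split dna into complete codons, starting at offset 0, 1 or 2."""
    return [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]


def six_frames(dna: str) -> dict[str, list[str]]:
    """Map frame labels (+1..+3 and -1..-3) to their codon lists."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = codons(seq, offset)
    return frames


for label, frame in six_frames("ATGAAATGA").items():
    print(label, " ".join(frame))
```

Frames +1 to +3 read the given strand; frames -1 to -3 read its reverse complement, so both strands are covered.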
Any region of DNA between a start codon and a stop codon in
the same reading frame could potentially code for a polypeptide
and is therefore an ORF
Start: AUG (methionine)   Stop: UAA, UAG, UGA
Short potential coding sequences like this will occur frequently
by chance, so the longer an ORF is, the more likely
it is to represent a real coding region (a gene)
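A rough calculation shows why length matters. The sketch below assumes random sequence with all four bases equally likely and independent, which is a simplification; under that assumption 3 of the 64 codons are stops. The function name `prob_no_stop` is illustrative.

```python
def prob_no_stop(n_codons: int, p_stop: float = 3 / 64) -> float:
    """Chance that n_codons consecutive random codons contain no stop."""
    return (1 - p_stop) ** n_codons


# How quickly long open stretches become unlikely by chance alone.
for n in (20, 50, 100, 300):
    print(n, prob_no_stop(n))
```

Under these assumptions the chance of an open stretch of 100 codons is already under 1%, which is why a minimum ORF length is a useful, if crude, filter.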
Problems
Small genes may be missed
The actual start codon may be internal to the ORF
There may be overlapping genes
The simplest tool for finding ORFs is ORF Finder at NCBI
It simply scans all 6 reading frames and shows the position of
the ORFs which are greater than a user defined minimum size
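The idea of a six-frame scan with a minimum size can be sketched as follows. This is a minimal illustration, not the implementation of ORF Finder: the function name `find_orfs`, the `min_codons` parameter and the choice to accept only ATG as a start are assumptions made for the example.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(dna: str, min_codons: int = 30):
    """Yield (strand, frame, start, end, sequence) for each ORF found.

    start and end are 0-based, end-exclusive offsets on the strand being
    read (for '-' they index the reverse complement). The length counts
    the start codon but not the stop codon.
    """
    dna = dna.upper()
    strands = (("+", dna), ("-", dna.translate(COMPLEMENT)[::-1]))
    for strand, seq in strands:
        for frame in range(3):
            start = None  # first ATG seen since the last stop codon
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, frame + 1, start, i + 3, seq[start:i + 3]
                    start = None


example = "CC" + "ATG" + "AAA" * 4 + "TAA" + "G"
for orf in find_orfs(example, min_codons=3):
    print(orf)
```

An ORF that runs off the end of the sequence without a stop codon is not reported by this sketch.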
The genetic code used for the analysis can be altered by the
user
This would be important if e.g. mitochondrial or ciliate nuclear
DNA were being analysed
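The effect of the genetic code on ORF boundaries can be seen from the stop codons alone. The sketch below lists the stop codons of three NCBI translation tables (in the DNA alphabet) and finds the first in-frame stop under each; the sequence and the function name `first_stop` are made up for illustration.

```python
from typing import Optional

# Stop codons under three NCBI translation tables.
STOP_CODONS = {
    "standard (table 1)": {"TAA", "TAG", "TGA"},
    "vertebrate mitochondrial (table 2)": {"TAA", "TAG", "AGA", "AGG"},
    "ciliate nuclear (table 6)": {"TGA"},
}


def first_stop(dna: str, stops: set) -> Optional[int]:
    """Return the 0-based position of the first stop codon in frame 1."""
    for i in range(0, len(dna) - 2, 3):
        if dna[i:i + 3] in stops:
            return i
    return None


toy = "ATGTGGAGATAATGA"  # codons: ATG TGG AGA TAA TGA
for name, stops in STOP_CODONS.items():
    print(name, first_stop(toy, stops))
```

The same sequence ends at a different codon under each table, so an ORF scan run with the wrong code would report the wrong ORF boundaries.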
To overcome the limitations of ORF Finder, more sophisticated
programmes detect compositional biases to increase the
reliability of gene detection
These compositional biases are regular, though very diffuse,
and arise for a variety of reasons:
in many organisms there is a detectable preference for G or C
over A and T in the third ("wobble") position in a codon
organisms do not use synonymous codons with equal
frequency - consequently there is a codon bias
there is an unequal usage of amino acids in proteins sufficient to
cause a bias in all three positions of codons and increase the
overall codon bias
the %GC content of the first two codon positions of the
universal genetic code is approximately 50%, therefore,
organisms which have a low or high %GC content will exhibit
a marked bias at the third position of codons to achieve their
overall %GC content
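The positional bias described above can be measured directly. A minimal sketch, assuming the input is an in-frame coding sequence in uppercase or lowercase DNA; the function name `gc_by_codon_position` is illustrative.

```python
def gc_by_codon_position(cds: str) -> list[float]:
    """Return the G+C fraction at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]  # every third base from this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return fractions


print(gc_by_codon_position("ATGGCCAAGTAA"))
```

In a real coding region of a GC-rich or AT-rich genome, the third value would be expected to deviate most strongly from the other two.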
The most recent approaches to using compositional features
to distinguish coding from non-coding regions employ ‘Markov
models’
such approaches include the popular GENEMARK and
GLIMMER programs
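To give a feel for the Markov model idea, the sketch below trains two first-order Markov chains, one on coding and one on non-coding examples, and scores a sequence by its log-odds. It is a deliberately simplified illustration and is not the algorithm used by GENEMARK or GLIMMER, which rely on more elaborate models. The training sequences are toy data and the function names are hypothetical.

```python
import math

BASES = "ACGT"


def train(seqs, pseudo=1.0):
    """Estimate first-order transition probabilities P(next | previous)."""
    counts = {a: {b: pseudo for b in BASES} for a in BASES}
    for seq in seqs:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {a: {b: counts[a][b] / sum(counts[a].values()) for b in BASES}
            for a in BASES}


def log_odds(seq, coding, noncoding):
    """Sum log2(P_coding / P_noncoding) over adjacent base pairs."""
    return sum(math.log2(coding[p][n] / noncoding[p][n])
               for p, n in zip(seq, seq[1:]))


# Toy training sets (invented): GC-rich "coding", AT-rich "non-coding".
coding_model = train(["GCGCGCGCGC", "GCCGCGGCGC"])
noncoding_model = train(["ATATATATAT", "AATTAATATA"])

print(log_odds("GCGCGC", coding_model, noncoding_model))
print(log_odds("ATATAT", coding_model, noncoding_model))
```

A positive score means the sequence resembles the coding training set more than the non-coding one; a negative score means the opposite.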
Finding Genes in Eukaryotes
An Introduction to Bioinformatics
AIMS To establish the concept of ORFs and their relationship to genes
To describe the features used by software to find ORFs/genes
To become familiar with Web-based programmes used to find
ORFs/genes
OBJECTIVES
To be able to distinguish between the concepts of ORF and gene
Use ORF Finder to find ORFs in prokaryotic nucleotide sequences
To describe the complications of the eukaryote “signals”
To be aware of the Web-based programmes
To be able to use the eukaryote programmes for a number of
organisms
Organisms whose cells have a membrane-bound
nucleus and many specialised structures located within
their cell boundary.
In these organisms, genetic material is organized into
chromosomes that reside in the nucleus.
Principles
• Content - codon usage
– often species or class specific
• Signals - PWMs
– principle is the same, signals are different
– Complication of introns/exons
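A position weight matrix (PWM) search can be sketched in a few lines. The example sites below are invented for illustration (they are not real promoter data), the background is assumed to be uniform, and the function names `build_pwm` and `best_hit` are hypothetical.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudo=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned example sites."""
    pwm = []
    for col in range(len(sites[0])):
        counts = {b: pseudo for b in BASES}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background)
                    for b in BASES})
    return pwm


def best_hit(seq, pwm):
    """Return (score, position) of the best-scoring window in seq."""
    width = len(pwm)
    scored = [(sum(pwm[j][seq[i + j]] for j in range(width)), i)
              for i in range(len(seq) - width + 1)]
    return max(scored)


# Invented TATA-like example sites, for illustration only.
pwm = build_pwm(["TATAAA", "TATATA", "TATAAG", "TTTAAA"])
print(best_hit("GGCGCTATAAAGGCGC", pwm))
```

The same scoring approach applies to any signal; only the matrix changes, which is the sense in which the principle is the same while the signals differ.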
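Eukaryotic promoter
[Figure: promoter region drawn 5’ to 3’ with the GC box, CAAT box and TATA box upstream of the mRNA start (+1); positions marked at -110, -40, -25 and +1]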
In addition - transcription factor binding sites
Genes can be enormous!
Controlled by “distant” enhancers
Signals on the mRNA
Kozak sequence at the translational start (AUG)
STOP codon
Polyadenylation sequence AAUAAA, ~ 12 bp before the polyA tail (AAAAA...)
Introns and Exons
Chicken 1α2 collagen gene is 38 kb and has > 50 introns
Muscular Dystrophy gene is 2.5 Mb and has
? Exons!
Splicing signals
5’ exon ... (C/A)AG | GT(A/G)AGT ... intron ... (C/T)>11 N (C/T)AG | G ... 3’ exon
(X/Y) = either base, N = any base, | = splice site
GT-AG rule
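The GT-AG rule alone gives a very permissive first pass at intron candidates. The sketch below simply pairs every GT with every downstream AG within a length range; the function name and the `min_len` and `max_len` defaults are arbitrary assumptions for the example, and real programmes score the surrounding consensus rather than the dinucleotides alone.

```python
def candidate_introns(dna, min_len=20, max_len=10000):
    """Return (start, end) pairs for stretches that begin GT and end AG.

    Coordinates are 0-based and end-exclusive on the given strand.
    """
    dna = dna.upper()
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    acceptors = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    return [(d, a) for d in donors for a in acceptors
            if min_len <= a - d <= max_len]


# Toy sequence: exon, 24-base intron obeying the GT-AG rule, exon.
toy_gene = "ATGCCC" + "GT" + "C" * 20 + "AG" + "CCCTAA"
print(candidate_introns(toy_gene))
```

On real genomic DNA this produces a very large number of false candidates, which is why signal searches are combined with content analysis.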
Exon finding
• Initial exons, from the initiation codon to the first
splice site;
• Internal exons from splice site to splice site;
• Terminal exons from splice site to stop codon;
• Single exons corresponding to uninterrupted,
intronless genes, i.e., running from initiation codon to
stop codon.
Integrated Gene Parsing
• Search for signals
• Perform a content analysis
• Define the intron/exon boundaries
Gene finding web sites
http://www.tigr.org/~salzberg/appendixa.html
>25 listed sites
GENSCAN
FGENES
Finding genes
Finding genes

  • 1. An Introduction to Bioinformatics Finding genes in prokaryotes
  • 2. Usually the primary challenge that follows the sequencing of anything from a small segment of DNA to a complete genome is to establish where the location functional elements such as: genes (intron/exon boundaries) promoters, terminators etc DNA sequences that may potentially encode proteins are called Open Reading Frames (ORFs) The situation in prokaryotes is relatively straightforward since scarcely any eubacterial and archaeal genes contain introns
  • 3. FINDING ORFs The simplest method in prokaryotes is to scan the DNA for start and stop codons The DNA is double stranded and each strand has three potential reading frames (codons are groups of 3 bases) THE CAT ATE THE RAT Frame 1 T HEC ATA TET HER AT Frame 2 TH ECA TAT ETH ERA T Frame 3 The scan must look at all 6 reading frames
  • 4. Any region of DNA between a start codon and a stop codon in the same reading frame could potentially code for a polypeptide and is therefore an ORF. Start: AUG (methionine). Stop: UAA, UAG, UGA. Small potential coding sequences like this will occur frequently by chance; the longer an ORF is, the more likely it is to represent a real coding region (a gene). Problems: small genes may be missed; the actual start codon may be internal to the ORF; there may be overlapping genes.
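A rough calculation shows why length matters. It assumes a random sequence with all four bases equally frequent (an assumption made here, not a claim from the slides), so that 3 of the 64 codons are stops and the chance of reading n codons without meeting a stop is:

```latex
P(\text{no stop in } n \text{ codons}) = \left(\tfrac{61}{64}\right)^{n},
\qquad
\left(\tfrac{61}{64}\right)^{100} \approx 8 \times 10^{-3},
\qquad
\left(\tfrac{61}{64}\right)^{300} \approx 6 \times 10^{-7}
```

Under the same assumption a stop is expected about once every 64/3 ≈ 21 codons, so short stop-free stretches are common, while one of 300 codons is very unlikely to arise by chance. Real genomes with skewed base composition will deviate from these figures.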
  • 5. The simplest tool for finding ORFs is ORF Finder at NCBI. It scans all 6 reading frames and shows the positions of the ORFs that are longer than a user-defined minimum size. The genetic code used for the analysis can be changed by the user, which is important if, e.g., mitochondrial or ciliate nuclear DNA is being analysed.
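The scan described on this slide can be approximated as follows. This is a minimal sketch and not how ORF Finder itself is implemented: `find_orfs`, its `min_codons` parameter and the fixed standard start/stop codons are assumptions made for illustration. A tool that supports alternative genetic codes would make the start and stop sets configurable.

```python
# Illustrative sketch only; not the ORF Finder implementation.
START = "ATG"                    # assumed single start codon
STOPS = {"TAA", "TAG", "TGA"}    # standard-code stop codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def find_orfs(dna, min_codons=2):
    """Return (frame, start, end, n_codons) for each ORF of at least min_codons.

    start and end are 0-based, half-open coordinates on the strand that was
    scanned; end includes the stop codon, n_codons does not count it.
    ORFs that run off the end of the sequence without a stop are not reported.
    """
    dna = dna.upper()
    orfs = []
    strands = (("+", dna), ("-", dna.translate(COMPLEMENT)[::-1]))
    for sign, seq in strands:
        for offset in range(3):
            start = None
            for i in range(offset, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if start is None and codon == START:
                    start = i            # first ATG opens the ORF
                elif start is not None and codon in STOPS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((f"{sign}{offset + 1}", start, i + 3, n_codons))
                    start = None         # look for the next ORF in this frame
    return orfs

print(find_orfs("CCATGAAATTTTAGCC"))
```

Because the sketch always opens an ORF at the first ATG in frame, it can overextend the 5' end when the real start codon is internal, which is one of the problems noted on the previous slide.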
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15. To overcome the limitations of ORF Finder, more sophisticated programmes detect compositional biases and so increase the reliability of gene detection. These compositional biases are regular, though very diffuse, and arise for a variety of reasons: in many organisms there is a detectable preference for G or C over A and T in the third ("wobble") position of a codon; organisms do not use synonymous codons with the same frequency, so there is a codon bias; amino acids are used unequally in proteins, enough to cause a bias in all three codon positions and to increase the overall codon bias.
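One of these biases, G+C content by codon position, is easy to measure directly. The sketch below assumes the input is already an in-frame coding sequence; the function name `gc_by_codon_position` is hypothetical.

```python
# Illustrative sketch; assumes cds starts at codon position 1.
def gc_by_codon_position(cds):
    """Return the G+C fraction at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # ignore a trailing partial codon
    fractions = []
    for pos in range(3):
        bases = cds[pos:usable:3]      # every third base, starting at pos
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)

print(gc_by_codon_position("ATGGCCGCGTAA"))
```

On a real gene set, a third-position value well above or below the first two positions is the kind of signal that composition-based gene finders exploit.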
  • 16. The %GC content of the first two codon positions of the universal genetic code is approximately 50%; organisms with a low or high overall %GC content will therefore show a marked bias at the third codon position to achieve that overall %GC content. The most recent approaches to using compositional features to distinguish coding from non-coding regions employ ‘Markov models’; such approaches include the popular GENEMARK and GLIMMER programs.
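To give a feel for the Markov-model idea, here is a deliberately simplified sketch: a first-order chain trained on coding and on non-coding examples, with a candidate region scored by its log-odds. It is not the GENEMARK or GLIMMER algorithm (those use more elaborate, higher-order models); the function names and the toy training strings are assumptions made purely for illustration.

```python
# Simplified illustration of Markov-model scoring; not GENEMARK or GLIMMER.
import math

BASES = "ACGT"

def train_markov(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities P(next base | previous base)."""
    counts = {a: {b: pseudocount for b in BASES} for a in BASES}
    for seq in seqs:
        seq = seq.upper()
        for prev, nxt in zip(seq, seq[1:]):
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    return {a: {b: counts[a][b] / sum(counts[a].values()) for b in BASES}
            for a in BASES}

def log_odds(seq, coding, noncoding):
    """Sum of log(P_coding / P_noncoding) over adjacent base pairs.

    Positive scores favour the coding model, negative the non-coding model.
    Pairs containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        if prev in coding and nxt in coding[prev]:
            score += math.log(coding[prev][nxt] / noncoding[prev][nxt])
    return score

# Toy training data, invented for the example.
coding_model = train_markov(["GCGCGCGCGC"])
noncoding_model = train_markov(["ATATATATAT"])
print(log_odds("GCGCGC", coding_model, noncoding_model) > 0)
```

The pseudocount keeps every transition probability non-zero, so the logarithm is always defined even for transitions never seen in training.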
  • 17.
  • 18.
  • 19.
  • 20.
  • 21. Finding Genes in Eukaryotes An Introduction to Bioinformatics
  • 22. AIMS: to establish the concept of ORFs and their relationship to genes; to describe the features used by software to find ORFs/genes; to become familiar with Web-based programmes used to find ORFs/genes. OBJECTIVES: to be able to distinguish between the concepts of ORF and gene; to use ORF Finder to find ORFs in prokaryotic nucleotide sequences; to describe the complications of the eukaryote “signals”; to be aware of the Web-based programmes; to be able to use the eukaryote programmes for a number of organisms.
  • 23. Eukaryotes: organisms whose cells have a membrane-bound nucleus and many specialised structures located within their cell boundary. In these organisms, genetic material is organized into chromosomes that reside in the nucleus.
  • 24. Principles • Content - codon usage – often species or class specific • Signals - PWMs (position weight matrices) – principle is the same, signals are different – Complication of introns/exons
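A position weight matrix (PWM) can be sketched as below: each column holds log-odds scores for the four bases, built from a set of aligned example sites, and a sequence is scanned for the best-scoring window. The uniform background of 0.25, the pseudocount, the function names and the example sites are all assumptions made for illustration.

```python
# Illustrative PWM sketch; background, pseudocount and sites are assumptions.
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM from equal-length aligned sites."""
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        total = len(column) + pseudocount * len(BASES)
        pwm.append({b: math.log2((column.count(b) + pseudocount) / total / background)
                    for b in BASES})
    return pwm

def best_hit(seq, pwm):
    """Return (score, position) of the best-scoring window, or None if seq is too short."""
    seq = seq.upper()
    width = len(pwm)
    scored = [(sum(pwm[k].get(seq[i + k], 0.0) for k in range(width)), i)
              for i in range(len(seq) - width + 1)]
    return max(scored) if scored else None

# Invented TATA-like sites, used only to exercise the code.
pwm = build_pwm(["TATAAA", "TATAAT", "TATATA", "TATAAA"])
print(best_hit("GGCGCTATAAAGGCG", pwm))
```

The same construction applies to any signal (promoter elements, splice sites, translation starts); only the training sites change, which is the sense in which the principle is the same while the signals differ.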
  • 25. Eukaryotic promoter (figure labels, 5’ to 3’: TATA box, GC box, CAAT box; positions -110, -40, -25; +1 at the start of the mRNA). In addition: transcription factor binding sites. Genes can be enormous! Controlled by “distant” enhancers.
  • 26. Signals on the mRNA: Kozak sequence at the translational start (AUG); STOP codon; polyadenylation sequence AAUAAA, followed after ~12 bp by the polyA tail (AAAAA…).
  • 27. Introns and Exons: the chicken 1α2 collagen gene is 38 kb and has > 50 introns. The Muscular Dystrophy gene is 2.5 Mb and has ? exons!
  • 28. Splicing signals (figure): 5’ exon … (C/A)AG | GT(A/G)AGT … intron … (T/C)>11 N (C/T)AG | G … 3’ exon. GT-AG rule: introns begin with GT and end with AG.
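The GT-AG rule alone gives a very crude intron detector, sketched below. It simply pairs every GT with every downstream AG within a length window; the function name and the `min_len`/`max_len` defaults are assumptions chosen so that the toy example works, not biologically tuned values.

```python
# Crude illustration of the GT-AG rule; length limits are arbitrary assumptions.
import re

def candidate_introns(dna, min_len=4, max_len=10000):
    """Return (start, end) half-open spans that begin with GT and end with AG."""
    dna = dna.upper()
    # Lookahead matches so that overlapping dinucleotides are all found.
    donors = [m.start() for m in re.finditer("(?=GT)", dna)]
    acceptors = [m.start() + 2 for m in re.finditer("(?=AG)", dna)]
    return [(d, a) for d in donors for a in acceptors
            if min_len <= a - d <= max_len]

seq = "AAGGTACCCAGTT"
for start, end in candidate_introns(seq):
    print(start, end, seq[start:end])
```

GT and AG dinucleotides are so common that this produces far too many candidates on real sequence, which is why gene finders score the surrounding consensus as well and combine it with content statistics.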
  • 29. Exon finding • Initial exons, from the initiation codon to the first splice site; • Internal exons, from splice site to splice site; • Terminal exons, from splice site to stop codon; • Single exons, corresponding to uninterrupted, intronless genes, i.e., running from initiation codon to stop codon.
  • 30. Integrated Gene Parsing • Search for signals • Perform a content analysis • Define the intron/exon boundaries
  • 31. Gene finding web sites: http://www.tigr.org/~salzberg/appendixa.html (>25 listed sites); GENSCAN; FGENES