Gene Discovery
Supachai Topanurak
Email: Supachai.top@mahidol.ac.th
19 January 2018
Eukaryotic cell
Understanding the words
• Gene
• Genome
• Genotype and Phenotype
• Bioinformatics
• Gene Annotation, Genome annotation
• Proteome
Genotype and Phenotype
• Genotype and phenotype are similar-sounding, related words that
mean different things. The genotype is the set of genes in our DNA
that is responsible for a particular trait. The phenotype is the
physical expression, or characteristics, of that trait.
The DNA molecule
• Composed of 2 polymers of nucleotides
• The two polymers run antiparallel to each other
• The molecule resembles a spiral staircase of complementary base pairs
Nucleotide structure of DNA
• Each nucleotide of
DNA contains:
– Deoxyribose
– Phosphate
– Nitrogen base (either
A, G, C, T)
Nucleotide structure of RNA
• Each nucleotide of
RNA contains:
– Ribose
– Phosphate
– Nitrogen base (either
A, G, C, U*)
*contains Uracil instead
of Thymine
DNA structure
• “Double helix” proposed by Watson and Crick (1953)
• Antiparallel backbones
• Complementary base
pairing:
– Adenine to Thymine
– Cytosine to Guanine
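Taken together, the pairing rules and the antiparallel backbones mean that one strand determines the other. A minimal Python sketch of that idea follows; the function name `reverse_complement` is an illustrative choice and does not come from the slides.

```python
# Watson-Crick pairing rules from the slide: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3'.

    Because the two backbones are antiparallel, the partner strand is the
    complement of the input read in the opposite direction.
    """
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original strand, which is a quick way to check the pairing table.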
DNA structure
Chromosomes vs Genes
• A chromosome consists of one entire DNA molecule plus protein
– Protein = histones
– DNA is wound around histones to form nucleosomes, then supercoiled
– Humans contain 46 such molecules (23 pairs)
• 44 autosomes
• 2 sex chromosomes (XX or XY)
Chromosomes vs Genes
• Genes constitute distinct regions on the chromosome
• Most genes code for a protein product (some produce only RNA)
• DNA -> RNA -> protein
• Differences in proteins bring about differences between individuals
and species
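The DNA -> RNA -> protein flow can be sketched in a few lines of Python. This is a simplified illustration, not a full model: it assumes the input is the coding strand of a gene without introns, so splicing is skipped, and the helper names `transcribe` and `translate` are hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
# "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product("TCAG", repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA -> RNA: the mRNA matches the coding strand with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA -> protein: read codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3].replace("U", "T")]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


mrna = transcribe("ATGGCCTAA")
print(mrna)             # AUGGCCUAA
print(translate(mrna))  # MA
```

Changing a single base in the input (for example GCC to GAC) changes the amino acid, which is the sense in which differences in DNA become differences in proteins.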
Gene structure
• Genes must have:
– Exons
– Start site
– Control region
From genes to proteins
[Figure: DNA, with its promoter elements, is transcribed into RNA; splicing at the splice sites yields mRNA; translation from the start codon to the stop codon yields protein.]
Comparative Sequence Sizes (in bases)
• Yeast chromosome 3: 350,000
• Escherichia coli (bacterium) genome: 4,600,000
• Largest yeast chromosome now mapped: 5,800,000
• Entire yeast genome: 15,000,000
• Entire human genome: 3,000,000,000
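To make the scale differences concrete, the sizes listed above can be compared directly. The short Python sketch below uses only the numbers from the slide; the dictionary keys are shortened labels chosen for illustration.

```python
# Sequence sizes taken from the list above.
sizes = {
    "Yeast chromosome 3": 350_000,
    "Escherichia coli genome": 4_600_000,
    "Largest yeast chromosome now mapped": 5_800_000,
    "Entire yeast genome": 15_000_000,
    "Entire human genome": 3_000_000_000,
}

human = sizes["Entire human genome"]

# How many times longer the human genome is than each listed sequence.
fold = {name: human / size for name, size in sizes.items()}

for name, ratio in fold.items():
    print(f"Human genome is {ratio:,.0f}x the size of: {name}")
```

On these figures the human genome is about 650 times the size of the E. coli genome and 200 times the size of the yeast genome.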
DNA, genes & chromosomes
The objectives of this presentation are to:
• Understand the role and structure of DNA, genes and
chromosomes.
• Understand that proteins are encoded by genes
• Be aware that alterations in genetic material can cause disease
Gene Annotation
• Finding RNA-only genes
• Gene prediction
– Prokaryotes vs. eukaryotes
– Introns and exons
– Transcription signals
– ESTs (expressed sequence tags)
• Functional annotation
• Biochemical pathways and subsystems
• Metabolic reconstruction of whole organisms
Types of exons
[Figure: a gene drawn 5’ to 3’, showing the promoter, transcription start, 5’ untranslated region, translation start, protein coding region (open reading frame), translation stop, 3’ untranslated region and polyA. Each intron begins with GT and ends with AG. The exons of the mRNA are labelled initial exon, internal exon, internal coding exon and terminal exon.]
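The GT ... AG labels mark the boundaries of each intron, so removing introns can be sketched as cutting out every GT...AG interval and joining what remains. The Python example below is a toy illustration under stated assumptions: the intron coordinates are supplied by hand, the function name `splice` is hypothetical, and real splice-site recognition depends on more than these two dinucleotides.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as 0-based, half-open (start, end) intervals.

    Each intron is checked for the canonical GT...AG boundaries, written
    in DNA letters as in the figure.
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GT") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GT...AG boundaries")
        exons.append(pre_mrna[position:start])  # exon before this intron
        position = end
    exons.append(pre_mrna[position:])  # terminal exon
    return "".join(exons)


# Toy gene: initial exon, one intron, terminal exon.
gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATAA"
print(splice(gene, [(6, 18)]))  # ATGGCCGAATAA
```

The joined exons form the open reading frame, here running from ATG to the stop codon TAA.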
Eukaryote genome annotation
[Figure: from genome to function. Genome (genes A and B, ATG to STOP) -> transcription -> primary transcript -> RNA processing -> processed mRNA (m7G cap, AAAn tail) -> translation -> polypeptide -> protein folding -> folded protein -> enzyme activity -> functional activity. Annotation steps: find locus; find exons using transcripts; find exons using peptides; find function.]
Prokaryote genome annotation
[Figure: from genome to function. Genome (genes A and B, each START to STOP) -> transcription -> primary transcript -> RNA processing -> processed RNA -> translation -> polypeptide -> protein folding -> folded protein -> enzyme activity -> functional activity. Annotation steps: find locus; find CDS; find function.]
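In prokaryotes a coding sequence (CDS) runs uninterrupted from START to STOP, so a first pass at "find CDS" can be sketched as an open reading frame scan. The Python example below is a deliberately simplified sketch: it assumes ATG as the only start codon, scans the forward strand only, and uses a hypothetical function name `find_orfs`. Real gene finders also scan the reverse strand and use additional signals.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return 0-based, half-open (start, end) intervals of ATG...STOP runs.

    Scans the three forward reading frames. min_codons counts the codons
    before the stop codon and filters out very short hits.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first START in this frame opens a candidate CDS
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return sorted(orfs)


genome = "CCATGAAATTTTAGGG"
for start, end in find_orfs(genome):
    print(start, end, genome[start:end])  # 2 14 ATGAAATTTTAG
```

Each reported interval is only a candidate; the "find function" step would still be needed to decide whether it is a real gene.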
Therefore, prior to dividing, any cell must first replicate its DNA
• Each single-stranded (SS) chromosome, meaning one with a single
chromatid, duplicates to become a double-stranded (DS) chromosome
with two sister chromatids
• Example:
– A human cell is formed with 46 SS chromosomes
– Each chromosome replicates to produce 46 DS chromosomes
DNA replication
