1
Shilpa V. Malaghan. 
II Ph.D (GPB) 
UAS, DHARWAD
Contents
• Introduction
• Origin (formation)
• Identification
• Expression
• Utilities
3 
Introduction 
The term ‘pseudogene’ was first used by Jacq et al. (1977).
Pseudogenes are dysfunctional relatives of genes
that have lost their protein-coding ability or are
otherwise no longer expressed in the cell.
Characteristic features:
• High homology to the parental functional gene
• Presence of disablements that prevent its expression
4
Central dogma of molecular biology
Formation of pseudogene 
5
Origin (formation) of Pseudogenes 
1) Pseudogene formation after DNA-mediated
gene duplication
Ondrage et al., 2011 
6
2) Pseudogene formation after RNA-mediated
gene duplication (processed ψn)
Ondrage et al., 2011 
7
Types of Pseudogenes (ψn) 
Processed ψn 
Unprocessed ψn 
Unitary / disabled ψn 
8
Processed ψn:
• Lack a promoter, so they are “dead on arrival”
• Contain a poly-A tail
• Lack intron sequences
• Randomly located relative to the parent gene
9
Unprocessed ψn:
• Highest structural similarity to the parent gene
• Intact exons and introns
• Prevalence of ORF-disrupting mutations
• Located close to the parent gene
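The "ORF-disrupting mutations" point can be illustrated with a minimal Python sketch. It assumes the candidate sequence is already aligned to the parent's reading frame and starts at the first codon; the function name find_disablements is hypothetical and does not come from any of the tools cited in these slides. A length that is not a multiple of three is used only as a crude frameshift hint.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_disablements(cds: str) -> dict:
    """Report in-frame premature stops and a length-based frameshift hint.

    Assumes `cds` starts at the first codon of the parental reading frame.
    """
    seq = cds.upper()
    # Start index of the last complete codon; a stop there is the normal stop.
    last_codon_start = len(seq) - len(seq) % 3 - 3
    premature = [
        i
        for i in range(0, last_codon_start, 3)
        if seq[i:i + 3] in STOP_CODONS
    ]
    return {
        "premature_stops": premature,  # 0-based codon start positions
        "length_not_multiple_of_3": len(seq) % 3 != 0,
    }
```

A real pipeline would detect frameshifts from an alignment against the parent protein rather than from sequence length alone.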
10
Unitary ψn:
• No gene duplication before pseudogenization
• Rare in occurrence
• Fixed disabling mutations
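The three types above can be summarised as a small decision rule. The Python sketch below is only an illustration of the features listed on these slides: the PseudogeneCandidate fields and the classify function are hypothetical names, and real annotation pipelines weigh more evidence than three boolean flags.

```python
from dataclasses import dataclass


@dataclass
class PseudogeneCandidate:
    # Hypothetical feature flags, assumed to be precomputed elsewhere.
    has_parent_paralog: bool  # a functional parent gene exists in the genome
    has_introns: bool         # intron sequences retained
    has_poly_a_tail: bool     # poly-A tract at the 3' end


def classify(c: PseudogeneCandidate) -> str:
    """Assign a pseudogene type from the slide-level features."""
    if not c.has_parent_paralog:
        # No duplication preceded the loss of function.
        return "unitary"
    if not c.has_introns and c.has_poly_a_tail:
        # Retrotransposed copy of a mature mRNA.
        return "processed"
    if c.has_introns:
        # DNA-mediated duplicate that kept its exon-intron structure.
        return "unprocessed"
    return "ambiguous"
```

For example, classify(PseudogeneCandidate(True, False, True)) returns "processed". Intronless candidates without a poly-A tail are reported as "ambiguous" because the listed features cannot separate the types in that case.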
11
Evidence for TE-Driven Pseudogene 
Formation 
Wicker et al., 2011 
12
Detection of Pseudogenes 
Homology-Based Approaches:
• Harrison's Approach (pseudogene annotation pipeline, 2001)
• Sakai's Approach
• PPFINDER
• Pseudogene Finder (PSF)
• PseudoPipe
13
Pseudogene databases
• Hoppsigen
• PseudoGeneQuest
• Pseudogene.org
• University of Iowa Pseudogene Resource
14
Identification of pseudogenes in the rice gene complement
Nissen et al., 2009
1. Pseudogenes (with parent gene and at least one frameshift or premature stop codon)
2. GPFs not supported by cDNA or EST evidence
3. The UTRs of the GPFs are longer than mean + 2 standard deviations
4. The CDS of the GPFs are shorter than 50 amino acids
5. The GPFs contain a stretch of 18 adenines in a 20-base window, within -200 to 400 bases from the end of the annotated UTR, or within 600 bases of the stop codon if no UTR is annotated
6. The GPFs have a significantly smaller number of exons
7. The GPFs contain a single exon and are within a segmentally duplicated region but have no paralog in the duplicated region
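The poly-A criterion in the list above (18 adenines in a 20-base window) can be checked with a sliding window. The Python sketch below is a minimal illustration, not the code used by Nissen et al.: the function name has_poly_a_stretch is hypothetical, and the caller is assumed to have already extracted the search region (for example -200 to +400 bases around the end of the annotated UTR).

```python
def has_poly_a_stretch(region: str, window: int = 20, min_a: int = 18) -> bool:
    """Return True if any `window`-base stretch holds at least `min_a` adenines."""
    seq = region.upper()
    if len(seq) < window:
        return False
    # Count adenines in the first window, then slide one base at a time.
    count = seq[:window].count("A")
    if count >= min_a:
        return True
    for i in range(window, len(seq)):
        # Add the base entering the window, drop the base leaving it.
        count += (seq[i] == "A") - (seq[i - window] == "A")
        if count >= min_a:
            return True
    return False
```

The running count keeps the scan linear in the length of the region, which matters when the test is applied to every gene model in a genome.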
15
Characterization of pseudogenes in the rice gene
complement
Nissen et al., 2009 
16
General steps followed in pseudogene identification
Nissen et al., 2009 
17
Plastid trnF(GAA) pseudogenes in four species of 
Solanum (Solanaceae) 
Peter et al., 2011 
18
Manual alignment of Solanum ψn copies 
Peter et al., 2011 
19
Shared syntenic presentation of pseudogene
copies in four different Solanum species
Peter et al., 2011
20
Evolution of trnF(GAA) pseudogenes in cruciferous plants
Roswitha et al., 2008 
21
22
Expression mechanism 
23
Analysis of miRNA targeting the 3'-UTR of TUSC2 and pseudogene TUSC2P 
on chromosome Y and chromosome X. 
Zina et al., 2014
24
Expression analysis of TUSC2P in cancer and non-cancer 
cell lines by real-time PCR and reverse transcription–PCR 
Zina et al., 2014
25
TUSC2P and TUSC2 3'-UTR cloning in expression vectors;
real-time PCR results.
Zina et al., 2014
26
TUSC2P and TUSC2 3'-UTR can function as competing
endogenous RNAs (ceRNAs).
Zina et al., 2014
27
TUSC2P/TUSC2-UTR attracts endogenous miRNAs, thus freeing TUSC2,
TIMP2 and TIMP3 mRNAs to be translated
Zina et al., 2014
28
Identifying small RNAs derived from gene-pseudogene pairs or 
adjacent pseudogene-pseudogene pairs. 
Guo et al., 2009 
29
The length distribution of small RNAs from four
distinct sources.
Guo et al., 2009
30
Functional classification of rice pseudogenes following gene
ontology terms…
Guo et al., 2009 
31
Model for the evolution of B-located 
pseudogenes. (Banae et al., 2013) 
32
Gene structure model for selected gene-like 
fragments. 
(Banae et al., 2013) 
33
Sequence comparison between rye B-located pseudogene-like
fragments (6 to 15) and their A-located parental
counterparts.
(Banae et al., 2013) 
34
Tissue- and Species-Specific Transcription of B-Located Pseudogene-Like
Fragments in Rye and Wheat.
• B-specific expression
• Rye-specific expression
• Constitutive expression
• Regulation of A-located gene expression
(Banae et al., 2013)
35
Accession-specific expression of rye
pseudogenes
(Banae et al., 2013) 
36
Genome-Wide Distribution of Small RNA-Generating Loci in 
Arabidopsis. 
Kristin et al., 2007 
37
Small RNA Loci in Protein-Coding Genes and 
Pseudogenes 
Kristin et al., 2007 
38
Relationship between the numbers of ψn and annotated 
functional genes in Pfam domain families. 
Zou et al., 2009
39
Other utilities
• Evolution-related studies
• Estimating the neutral substitution rate
• Information on splice diversity of RNA
• As a fossil record of past gene expression
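The "neutral substitution rate" use can be illustrated with a worked calculation. The Python sketch below assumes a pre-aligned pseudogene/parent pair and applies the Jukes-Cantor (JC69) correction, d = -(3/4) ln(1 - 4p/3), where p is the observed proportion of differing sites. JC69 is chosen here only because it is the simplest model; the slides do not state which model the cited studies used, and both function names are hypothetical.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [
        (x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor_distance(p: float) -> float:
    """JC69-corrected substitutions per site for an observed proportion p."""
    if not 0 <= p < 0.75:
        raise ValueError("JC69 is defined only for 0 <= p < 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)
```

For one difference in ten aligned sites, p = 0.1 and the corrected distance is about 0.107 substitutions per site. Dividing such a distance by twice the divergence time would give a rate estimate, provided the pseudogene has evolved without selective constraint.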
40
41
Formation and expression of pseudogenes
