EUKARYOTIC GENE REGULATION I
Jill Howlin PhD
Canceromics Branch
Department of Oncology, Lund University
Medicon Village
http://www.med.lu.se/english/klinvetlund/canceromics/research
  
Lectures
•  Eukaryotic Gene Regulation I
•  Eukaryotic Gene Regulation II
The Gene, the Genome & Gene Expression
    Some revision, some new concepts
Control of gene expression:
    Transcription factors
    Epigenetic regulation: imprinting & methylation
Analysis of Gene Expression & Regulation:
    transcriptomics
    RNA-seq
    DNA-protein interaction
  
	
  	
  
	
  
Source material:
•  Molecular Biology of the Cell. 5th edition. Alberts et al.
•  The Cell: A Molecular Approach. 2nd edition. Cooper GM.
•  Human Molecular Genetics. 4th edition. Strachan and Read
•  Virtual Cell Animation Collection http://vcell.ndsu.nodak.edu/animations/ (also available as a free iPhone/iPad app from iTunes)
•  Wikipedia http://en.wikipedia.org/
•  ENCODE explorer http://www.nature.com/encode/#/threads
•  PubMed http://www.ncbi.nlm.nih.gov/pubmed
  
	
  
	
  
BIM12	
  
What is a GENE?
Discrete segments of DNA (or RNA) that comprise a functional unit of hereditary material
•  transcribed into a complementary RNA molecule that serves as a template for protein
•  Genome: 3 Gbp DNA, < 20,000 protein-coding genes
  
Genetics
Gregor Mendel published his ‘Experiments on Plant Hybridization’ in 1866
Charles Darwin published ‘On the Origin of Species’ in 1859
    The two lines of thought merged in the early 20th century
Around 1910 Thomas Hunt Morgan demonstrated that genes are on chromosomes and are the mechanical basis of heredity
The Hershey–Chase experiments (1952) by Alfred Hershey and Martha Chase confirmed that DNA was the genetic material
  
DNA: deoxyribonucleic acid
Sugar-phosphate polymer/backbone
Base pairs: A - T, C - G
  
The Double Helix
•  allowed scientists to understand how DNA was replicated
•  1962 Nobel Prize: James Watson, Francis Crick & Maurice Wilkins
•  Rosalind Franklin generated the X-ray diffraction images (Photo 51)
  
DNA Packaging
DNA double helix: about 2 nm wide, about 3.4 nm per turn
Nucleosome: 147 bp of DNA wrapped around a complex of 8 core histones (two each of H2A, H2B, H3, H4)
Histone H1
The nucleosomes fold to a 30 nm fiber...
that forms loops 300 nm in length...
...the 300 nm fiber is compressed and folded further to produce a 250 nm wide fiber
Tight coiling of the 250 nm fiber produces the chromatid of a chromosome (700 nm)
Chromosome (1400 nm)
  
Chromosomes
23 pairs
 - 22 pairs of autosomes and
 - 1 pair of sex chromosomes
  
Cell Division & Genetic Inheritance
NB: Somatic mutations, Germ-line mutations
  
Chromosome disorders
Abnormal number: aneuploidy
•  Down syndrome: 3 x chromosome 21 (trisomy 21)
•  Klinefelter syndrome: 47,XXY (or 46,XY/47,XXY mosaic), also called XXY syndrome
Abnormal structure: deletions, duplications, translocations
•  Jacobsen syndrome: 11q deletion disorder
 - Germ line -
  
transcription
•  DNA to RNA
basic mechanism:
[Figure labels: DNA helix unwinds; direction of transcription; RNA polymerase; RNA strand; template DNA strand]
cis-acting factors: on the same molecule, e.g. promoter
RNA Polymerase II synthesizes RNA from genes encoding proteins
promoter sequences: TATA box, GC box, CAAT box
RNA Pol II requires the basal transcription apparatus
General transcription factors: TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH
trans-acting factors: produced elsewhere
  
transcription animation
  
mRNA processing
  
Capping & Poly-A tail
5’ methyl cap: first modification of the mRNA, added as soon as the 5’ end emerges
•  In the nucleus, it binds a protein complex, the CBC (cap-binding complex)
•  Binding proteins and processing enzymes recognise the 3’ polyadenylation signal: the Poly-A tail is added following cleavage of the transcript
•  Poly-A-binding proteins
  
splicing
•  pre-mRNA undergoes splicing before translation
•  removal of the intronic regions
•  joining of the exons
•  Mature mRNA has 5’ and 3’ UTRs (untranslated regions)
  
Splice junctions & the Spliceosome
Spliceosome: protein–RNA complex (snRNA – snRNP)
-  small nuclear RNAs
-  >50 proteins
3 consensus sites:
•  intron start: GT (GU in RNA)
•  intron end: AG (always the same)
•  an A forms the branch point (position can vary)
lariat
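The three consensus sites can be checked on a sequence with a small sketch. This is an illustration under simplifying assumptions: the function names and the example pre-mRNA are hypothetical, the intron coordinates are supplied by hand, and a real spliceosome recognises longer, degenerate motifs than the bare GU...A...AG pattern tested here.

```python
def is_candidate_intron(seq: str) -> bool:
    """True if seq starts with GU, ends with AG and has an internal A
    that could serve as the branch point."""
    seq = seq.upper().replace("T", "U")  # accept DNA (GT) or RNA (GU) spelling
    return (
        len(seq) >= 5
        and seq.startswith("GU")
        and seq.endswith("AG")
        and "A" in seq[2:-2]
    )


def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove the given half-open (start, end) intervals and join the exons."""
    exons, cursor = [], 0
    for start, end in sorted(introns):
        if not is_candidate_intron(pre_mrna[start:end]):
            raise ValueError(f"interval {start}-{end} does not follow the GU...AG rule")
        exons.append(pre_mrna[cursor:start])
        cursor = end
    exons.append(pre_mrna[cursor:])
    return "".join(exons)


if __name__ == "__main__":
    # Hypothetical pre-mRNA: exon 1, one intron, exon 2.
    pre_mrna = "AUGGCC" + "GUAAGUCUAACUUUCAG" + "GAAUAA"
    print(splice(pre_mrna, [(6, 23)]))  # AUGGCCGAAUAA
```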
  
splicing animation
  
translation
•  mRNA to protein
•  DECODED by the ribosome to produce specific polypeptides
•  initiation, elongation and termination
•  tRNAs, transfer RNAs: function as adaptors
tRNA:
    Anticodon binds to the codon in the mRNA sequence
    Always CCA at the 3’ end
    70 to 80 nucleotides long
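Decoding can be sketched as a table lookup, three bases at a time. The sketch below assumes the standard genetic code, starts at the first AUG it finds and stops at the first in-frame stop codon; the function names and the example mRNA are hypothetical, and initiation factors, tRNA charging and everything else the ribosome needs are ignored.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon(codon: str) -> str:
    """Anticodon (written 5'->3') that pairs with a codon by standard pairing."""
    return "".join(COMPLEMENT[b] for b in reversed(codon))


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGCCGAAUAAGG"))  # MAE
    print(anticodon("AUG"))               # CAU
```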
  
translation animation
  
Wobble hypothesis
4^3 = 64 possible codons
49 different tRNAs
1966, Francis Crick
Less spatially confined (third codon position) = non-standard base pairing
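The codon arithmetic is easy to verify by enumeration. The sketch below assumes the standard genetic code and counts how many "boxes" (codons sharing their first two bases) encode the same amino acid whatever the third base is; this third-position redundancy is what makes relaxed pairing at that position tolerable. It does not model the actual wobble pairing rules.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def fourfold_degenerate_boxes() -> list[str]:
    """First-two-base prefixes whose four codons all encode one amino acid."""
    boxes = []
    for first, second in product(BASES, repeat=2):
        encoded = {CODON_TABLE[first + second + third] for third in BASES}
        if len(encoded) == 1 and "*" not in encoded:
            boxes.append(first + second)
    return boxes


if __name__ == "__main__":
    stops = [codon for codon, aa in CODON_TABLE.items() if aa == "*"]
    print(len(CODON_TABLE))                  # 64 codons (4 ** 3)
    print(len(CODON_TABLE) - len(stops))     # 61 sense codons
    print(len(fourfold_degenerate_boxes()))  # 8 boxes where the third base is free
```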
  
	
  
	
  
Mutations
Permanent changes in the DNA sequence
spontaneously occurring (mistakes in replication, failure of DNA repair)
or induced (radiation, UV, chemical)
•  point mutations
•  insertions
•  deletions
•  translocations
  
•  Frameshift mutation: a disruption in the reading frame
The sun was hot but the old man did not get his hat
T hes unw ash otb utt heo ldm and idn otg eth ish at
Or
Th esu nwa sho tbu tth eol dma ndi dno tge thi sha t
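The sentence analogy can be reproduced directly: reading the same letters in threes from a different starting offset, or after deleting a single letter, scrambles every downstream "word". The helper name `read_triplets` is made up for this sketch.

```python
def read_triplets(letters: str, frame: int = 0) -> list[str]:
    """Split letters into complete triplets, starting at offset `frame`."""
    return [letters[i:i + 3] for i in range(frame, len(letters) - 2, 3)]


if __name__ == "__main__":
    sentence = "the sun was hot but the old man did not get his hat"
    letters = sentence.replace(" ", "")

    print(" ".join(read_triplets(letters, frame=0)))  # original sentence
    print(" ".join(read_triplets(letters, frame=1)))  # hes unw ash otb ...

    # Deleting one letter (the "u" of "sun") shifts everything after it.
    mutated = letters[:4] + letters[5:]
    print(" ".join(read_triplets(mutated)))           # the snw ash otb ...
```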
  
	
  
•  Nonsense mutation
premature STOP / truncated protein
•  Missense mutation
Single point mutation = different amino acid
•  Neutral mutation
e.g.: AAA to AGA, lysine to arginine
•  Silent mutation
No effect (same amino acid)
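These categories can be told apart by translating the codon before and after the change. The sketch below assumes the standard genetic code and a single-base substitution in a DNA codon; the function name `classify` is hypothetical. It does not separate "neutral" from other missense changes, since that depends on how chemically similar the two amino acids are.

```python
from itertools import product

# Standard genetic code for DNA codons, bases ordered T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base substitution as silent, nonsense or missense."""
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("expected exactly one changed base")
    before, after = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"  # includes conservative ("neutral") changes


if __name__ == "__main__":
    print(classify("AAA", "AGA"))  # missense (lysine to arginine)
    print(classify("GAA", "GAG"))  # silent (glutamate either way)
    print(classify("TAC", "TAA"))  # nonsense (tyrosine to stop)
```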
  
Mutations & SNPs
Heritable disorders, e.g. a single mutation in a germ cell (rare)
SCD, sickle cell anaemia: haemoglobin (β-globin) GAG to GTG point mutation (malaria resistance)
Cystic fibrosis: ΔF508 deletion (phenylalanine) in the CFTR gene
SNPs... natural variation, ≥ 1% of population (common)
~ 90% of variation
Coding/noncoding, about one every 100 to 300 bases
Mostly no effect on cell function
  
	
  
	
  
GWAS: genome-wide association studies
•  SNP arrays, next-generation sequencing
•  SNPs & risk of disease
  
	
  
Types of SNPs... associated with the disease phenotype
non-coding regions > coding
Rather than cause disease, SNPs may confer differential susceptibility, e.g. Alzheimer's disease and ApoE SNPs
  
Common misconceptions:
•  The only purpose of a gene is to encode a protein
•  One gene encodes one mRNA and gives rise to one specific protein
•  DNA that is not a gene is considered ‘junk DNA’
"Biologists should not deceive themselves with the thought that some new class of biological molecules, of comparable importance to proteins, remains to be discovered. This seems highly unlikely." (Francis Crick, 1958)
The ‘gene’ today has an identity crisis
  
	
  
	
  
old!
Human Genome assembly & annotation
•  2003 (April): completion of the Human Genome Project (99% coverage / 99.99% accuracy)
•  2012 (September): the ENCODE project, based on GENCODE: regions of transcription, transcription factor association, chromatin structure and histone modification in the entire genome
•  2014 (August): GENCODE v20, built on GRCh38 (December 2013)
  
 
	
  
	
  
Human Genome conclusions
A lot fewer genes than originally thought!
 - no. of protein-coding genes continues to fall / lncRNA rises
<20,000 protein-coding (<1.5% of the genome)
 20,026 protein-coding (GENCODE v8, March 2011)
Non-protein-coding sequences make up the majority
¾ capable of being transcribed
80% of the components of the human genome now have at least one biochemical function associated with them!??
  
Gene Structure and Function
[Figure: reference genome (GRCh38 assembly) with annotation tracks for the 5’ – 3’ and 3’ – 5’ strands]
What’s a ‘gene’??
1.  Protein-coding genes
2.  Pseudogenes
3.  RNA genes (Sebastian): non-protein coding
(4 x)!
http://www.gencodegenes.org/data.html	
  
Pseudogenes
They are dysfunctional relatives of known genes in the genome that never become proteins
  	
  
	
  
	
  
Pseudogenes
Gene of origin
1. Retrotransposition
mRNA spontaneously reverse transcribed back into DNA and inserted into chromosomal DNA (looks like cDNA structure: exons, no promoter)
2. Gene duplication
with subsequent acquisition of deleterious mutations (has introns, exon structure and promoter)
3. Disabled/Unitary genes
A gene disabled by a mutation that does not have a deleterious effect on the organism and gets fixed in the population
e.g. L-gulono-γ-lactone oxidase (GULO), biosynthesis of vitamin C (GULOP pseudogene in humans)
	
  
Pseudogenes in Gene Regulation
•  Pseudogene-derived small interfering RNAs (siRNA)
Some pseudogenes actually encode siRNAs that regulate the expression of protein-coding mRNAs from their ‘parent’ genes
•  A miRNA decoy function, e.g. PTEN
•  PBL case
  
	
  
	
  
Control of transcription: promoters, enhancers, repressors...
•  Complex control
•  cis and trans factors including: regulatory & promoter sequences, RNA polymerase, the general transcription factors and gene-specific activators and repressors
•  RNA Pol II: protein-coding genes [Pol I (rRNA) and Pol III (tRNA and some small RNAs)]
  
	
  
•  Pol II requires assembly of the general TFs on the promoter in order to initiate transcription
•  specific gene transcription may also be regulated by activator and repressor proteins binding at near OR distant sites
RNA Pol II promoters vary but often include:
    TATA box
    GC box
    CAAT box
	
  
	
  
Transcriptional repressors compete with transcriptional activators
  	
  
Transcript variation: one gene = many mRNAs
Mainly due to the fact that 1 gene has > 1 promoter & > 1 splicing pattern
At least half of all genes have 2 or more promoters
Alternative splicing, as well as creating different protein isoforms, can also generate different 5’ and 3’ UTRs
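How quickly the number of transcripts grows can be seen by enumerating the combinations. The sketch below uses a hypothetical gene with two alternative first exons (one per promoter) and one cassette exon that may be skipped; the exon names and the function `transcript_variants` are invented for illustration.

```python
from itertools import product


def transcript_variants(first_exons: list[str], body: list[tuple[str, bool]]) -> list[str]:
    """Enumerate transcripts from alternative first exons and optional exons.

    `body` lists the downstream exons as (name, is_cassette); cassette exons
    may be included or skipped, all others are always included.
    """
    optional = [name for name, is_cassette in body if is_cassette]
    variants = []
    for first in first_exons:
        for choice in product([True, False], repeat=len(optional)):
            included = dict(zip(optional, choice))
            exons = [first] + [
                name for name, is_cassette in body
                if not is_cassette or included[name]
            ]
            variants.append("-".join(exons))
    return variants


if __name__ == "__main__":
    # Two promoters (exons 1a / 1b) and one skippable exon 3.
    print(transcript_variants(["1a", "1b"], [("2", False), ("3", True), ("4", False)]))
    # ['1a-2-3-4', '1a-2-4', '1b-2-3-4', '1b-2-4']
```

With p alternative first exons and k independent cassette exons this simple model gives p x 2^k transcripts.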
  
Transcript variation ...many protein isoforms
•  Equally, however, the same final protein sequence can come from slightly different transcripts, as long as the CDS is not affected
  
	
  
Role of mRNA processing in gene regulation?
•  seems wasteful
•  exon-intron arrangement facilitates new genetic recombination
•  evidence in protein domains
•  variation - several protein variants can be produced from one gene
•  The 5’ and 3’ modifications of mRNA also have a role in stability, transport and recognition
  
Influence of mRNA structure
Final mRNA checks
Translation initiation factors
Poly-A binding proteins
5’ cap & CBC
Assembly >> export >> translation
  
Quality control in cytosol
•  EJCs (exon junction complexes) serve to label the mRNA
•  Nonsense-mediated decay (NMD) rids cells of mRNAs with premature termination codons
•  The hUpf complex triggers NMD in the cytoplasm when an EJC is recognized downstream of a STOP codon
[Figure labels: exon junction complexes; correct stop codon; premature stop codon; Upf proteins; Upf triggers mRNA degradation]
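The decision rule described here reduces to a position comparison. The sketch below is a deliberate simplification: positions are hypothetical nucleotide coordinates on the spliced mRNA, an EJC is assumed to sit at every exon-exon junction, and the distance threshold that real cells apply between the stop codon and the last junction is ignored.

```python
def predicted_nmd(stop_position: int, exon_junctions: list[int]) -> bool:
    """True if at least one exon junction (EJC site) lies downstream of the stop codon."""
    return any(junction > stop_position for junction in exon_junctions)


if __name__ == "__main__":
    junctions = [120, 300, 450]           # hypothetical exon-exon junctions
    print(predicted_nmd(520, junctions))  # False: normal stop in the last exon
    print(predicted_nmd(200, junctions))  # True: premature stop, EJCs remain downstream
```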
  
mRNA stability
•  varies (mins-hrs)
•  Exo/endonucleases, other decay mechanisms
•  A deadenylase shortens the Poly-A tail in the cytoplasm
•  Decapping enzymes - uncapped mRNA is rapidly degraded by exonucleases
•  miRNA binding to the 3’ UTR mediates degradation
•  RNAi-mediated degradation
AREs can stimulate Poly-A tail removal
AU-rich elements: 50–150 nt (rich in adenosine and uridine)
Found in the 3’ UTRs of mRNAs with a short half-life, <10% of genes, e.g. growth factors
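What a half-life of minutes versus hours means for transcript levels can be illustrated with a short calculation. The sketch assumes simple first-order (exponential) decay with no new transcription, which is an idealisation; the half-life values in the example are invented, not measurements.

```python
def fraction_remaining(minutes: float, half_life_minutes: float) -> float:
    """Fraction of an mRNA pool left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_minutes)


if __name__ == "__main__":
    # Hypothetical transcripts: an unstable one (30 min) and a stable one (8 h).
    for name, half_life in [("unstable mRNA", 30), ("stable mRNA", 480)]:
        print(name, round(fraction_remaining(120, half_life), 3))
```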
  
	
  
RNA editing
Final aa differs from the genomic DNA
A-I deamination, adenosine > inosine: >1000 genes
By adenosine deaminases acting on RNA (ADARs)
C-U deamination, cytosine > uracil
By cytidine deaminase
Apolipoprotein B: lipid metabolism
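The effect of both editing types on a codon can be sketched with the standard genetic code. Inosine is treated here as G, which is how it is read during translation; the function name `edit` is hypothetical. The C-to-U example uses the CAA to UAA change known from apolipoprotein B mRNA, which creates a stop codon, and the A-to-I example uses an illustrative CAG codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G ("*" = stop).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Edit type -> (base that must be present, base the ribosome effectively reads).
EDITS = {"A-to-I": ("A", "G"), "C-to-U": ("C", "U")}


def edit(rna: str, position: int, kind: str) -> str:
    """Apply one deamination edit at a 0-based position and return the RNA as read."""
    expected, read_as = EDITS[kind]
    if rna[position] != expected:
        raise ValueError(f"{kind} editing needs {expected} at position {position}")
    return rna[:position] + read_as + rna[position + 1:]


if __name__ == "__main__":
    edited = edit("CAA", 0, "C-to-U")
    print("CAA", CODON_TABLE["CAA"], "->", edited, CODON_TABLE[edited])  # Q -> *
    edited = edit("CAG", 1, "A-to-I")
    print("CAG", CODON_TABLE["CAG"], "->", edited, CODON_TABLE[edited])  # Q -> R
```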
  
Control at the protein level ..lost after translation
•  Every step required for the process of gene expression can be regulated
•  Translation is no different
•  Protein folding is initiated as synthesis proceeds (Hsp70)
•  Ubiquitin-proteasome system
•  Abnormally folded proteins can form disease-causing protein aggregates
  
END OF PART I
  

Eukaryotic Gene Regulation I 2014 slides
