Deciphering the regulatory code in the genome
There are messages hidden within our genome, regulating when and how long a gene is switched on. The presentation describes a method, STREAM, targeted at deciphering this regulatory code.

Deciphering the regulatory code in the genome Presentation Transcript

  • 1. Deciphering the regulatory code in the genome. PhD completion seminar. Denis C. Bauer, Institute for Molecular Bioscience, The University of Queensland, Australia. (Image credits: yankodesign; linh.ngân)
  • 2. Research Aim •  Develop a method (a thermodynamic model) that translates the regulatory message in the DNA that determines when and how strongly a gene is expressed. [Figure: a DNA sequence (AAGAAGGTTTTAGTTTAGCC CACCGTAGGTACCTGAAGAA GAAGGTTTTAGTTTAGCCCA CCGTAGGTACCTGAAG) is translated by the thermodynamic model into the message "Express gene with 70% capacity when it is hot, thanks!"]
  • 3. Why is understanding transcriptional regulation important? •  Insight into the biology of gene pathways. •  Search for regulatory regions with a specific function. •  “Re-programming” of genes has therapeutic potential. [Figure: DNA with promoter, gene, and transcription; a broken regulatory element is repaired by designing and inserting a new regulatory element.]
  • 4. What do we need to know to build a model able to translate the regulatory message?
  • 5. Background: Enhancer •  Genes can have independent “switches” (enhancers) beyond the core promoter, which can start the transcription of the target gene under different conditions. [Figure labels: transcription, gene, promoter, enhancer regions.]
  • 6. Background: Enhancer •  Transcription is regulated by the binding of activator and repressor TFs to an enhancer region. [Figure labels: enhancer, binding site map, active TF concentration, 8 activators, 2 repressors, transcription.]
  • 7. Background: Repression •  Transcriptional regulation also depends on the interplay between activators and repressors, i.e. where they bind relative to each other. [Figure labels: repressor range, binding site map, enhancer.]
  • 8. On which system would we test the model’s abilities?
  • 9. Background: Even-skipped gene (eve) •  Drosophila melanogaster [1] •  Embryo stained for eve [2] •  Function representation [3]. Sources: [1] http://insects.eugenes.org/ [2] Small et al. [3] http://bioinform.genetika.ru
  • 10. Background: Regulation of eve [Figure labels: eve and its MSEs: late1, 3+7, 2, P, late2, 4+6, 1, 5; lacZ.] Janssens, H. et al. Quantitative and predictive model of transcriptional control of the Drosophila melanogaster even skipped gene. Nat Genet, 2006, 38, 1159-1165
  • 11. Hypothesis •  TF concentrations, the binding site map, and further inputs (genome architecture, RNA, methylation, …) predict gene activation.
  • 12. Research Goals •  Optimize thermodynamic models efficiently. •  Analyze robustness of these models. •  Explore the regulation of a particular gene. •  Examine how the regulatory program evolves. •  Extend the current thermodynamic model. (Image credit: Cooperphoto/CORBIS)
  • 13. Model definition
    Site occupancy (Hill function):
    $$p(s,t) = \frac{K_t \cdot K(s,t) \cdot [t]}{1 + K_t \cdot K(s,t) \cdot [t]}$$
    Total activation (activator contribution, reduced by the quenching of the activator):
    $$W(S,T) = \sum_{s \in S_A} E_{t_s}\, p(s,t_s) \prod_{s' \in S_R} \left(1 - E_{t_{s'}} \cdot p(s',t_{s'}) \cdot d(s,s')\right)$$
    Transcription rate (Arrhenius function):
    $$R(S,T) = \begin{cases} R_0 \exp\left(W(S,T) - G_0\right) & \text{if } W(S,T) < G_0 \\ R_0 & \text{otherwise} \end{cases}$$
    Free parameters. TF params: K (binding affinity), E (effectiveness). General params: R_0 (max. transcription rate), G_0 (energy barrier).
    (Image credit: Buena Vista Pictures)
    Janssens, H. et al. Quantitative and predictive model of transcriptional control of the Drosophila melanogaster even skipped gene. Nat Genet, 2006, 38, 1159-1165
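The three formulas on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the STREAM implementation: the function names, the `quench` callable standing in for d(s, s'), and all toy numbers are hypothetical.

```python
import math

def occupancy(k_t, k_site, conc):
    """Hill-type occupancy p(s,t) = x / (1 + x), with x = K_t * K(s,t) * [t]."""
    x = k_t * k_site * conc
    return x / (1.0 + x)

def total_activation(activators, repressors, quench):
    """W(S,T): sum over activator sites of E * p, damped by every repressor site.

    activators, repressors: lists of (effectiveness E, occupancy p) pairs.
    quench(i, j): hypothetical stand-in for d(s, s') between activator i
    and repressor j.
    """
    w = 0.0
    for i, (e_a, p_a) in enumerate(activators):
        damp = 1.0
        for j, (e_r, p_r) in enumerate(repressors):
            damp *= 1.0 - e_r * p_r * quench(i, j)
        w += e_a * p_a * damp
    return w

def transcription_rate(w, r0, g0):
    """Arrhenius-type rate: R0 * exp(W - G0) below the barrier G0, else R0."""
    return r0 * math.exp(w - g0) if w < g0 else r0

# Toy numbers, chosen only for illustration.
p_act = occupancy(k_t=2.0, k_site=0.5, conc=1.0)  # x = 1, so p = 0.5
p_rep = occupancy(k_t=1.0, k_site=1.0, conc=3.0)  # x = 3, so p = 0.75
w = total_activation([(1.0, p_act)], [(1.0, p_rep)], quench=lambda i, j: 1.0)
rate = transcription_rate(w, r0=100.0, g0=2.0)
```

With full quenching (d = 1) the single repressor removes 75% of the activator's contribution, so W falls from 0.5 to 0.125 and the rate stays well below R_0.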
  • 14. Training the model [Figure: TF binding and TF concentrations < [TF1], [TF2], [TF3], [TF4] > are fed into the thermodynamic model; the predicted expression is compared to the target, and the model parameters are adjusted to improve the fit.]
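The comparison of predicted and target expression can be made concrete with the two measures reported on later slides, RMS error and CC. The sketch below assumes that CC denotes the Pearson correlation coefficient, which the slides do not spell out; the function names are illustrative.

```python
import math

def rms_error(pred, target):
    """Root-mean-square difference between predicted and target expression."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def pearson_cc(pred, target):
    """Pearson correlation coefficient between prediction and target."""
    n = len(target)
    mean_p, mean_t = sum(pred) / n, sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / math.sqrt(var_p * var_t)
```

The two measures complement each other: RMS error penalizes differences in absolute level, while the correlation only rewards getting the shape of the expression profile right.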
  • 15. Optimization methods •  Two optimization paradigms –  Simulated Annealing (SA) •  LAM schedule (Reinitz et al. 2003) •  Geometric cooling –  Gradient descent (GD) •  Three GD variants approximating the objective function, which is not continuously differentiable. •  Judged on the accuracy achieved in the given time –  Drosophila MSE2 data with 400 data points and 7 TFs (16 free parameters).
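To illustrate the simpler of the two SA variants, here is a minimal sketch of simulated annealing with geometric cooling. It is not the LAM schedule and not the code used in the study; the step size, starting temperature, cooling factor `alpha`, and the toy objective are all assumptions made for the example.

```python
import math
import random

def simulated_annealing(objective, start, step=0.1, t_start=1.0,
                        alpha=0.995, n_steps=2000, seed=0):
    """Minimize objective(params) with Metropolis moves and geometric cooling."""
    rng = random.Random(seed)
    current = list(start)
    cur_err = objective(current)
    best, best_err = list(current), cur_err
    temp = t_start
    for _ in range(n_steps):
        # Propose a move: perturb one randomly chosen parameter.
        cand = list(current)
        i = rng.randrange(len(cand))
        cand[i] += rng.gauss(0.0, step)
        err = objective(cand)
        # Metropolis rule: always accept improvements, and accept a worse
        # candidate with probability exp(-(increase in error) / temperature).
        if err <= cur_err or rng.random() < math.exp((cur_err - err) / temp):
            current, cur_err = cand, err
            if err < best_err:
                best, best_err = list(cand), err
        temp *= alpha  # geometric cooling: T_{k+1} = alpha * T_k
    return best, best_err

def toy_error(params):
    """Hypothetical stand-in for the model's fitting error."""
    return (params[0] - 1.0) ** 2 + (params[1] + 2.0) ** 2

best, best_err = simulated_annealing(toy_error, [0.0, 0.0])
```

Accepting occasional uphill moves at high temperature is what lets SA escape local minima, which pure gradient descent cannot do.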
  • 16. Optimization [Figure: RMS error and CC against time [minutes] for Simulated Annealing (SA LAM, SA geom) and Gradient Descent (GD_softmax, GD_nomax, GD_max).] Suggests: many local minima. Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
  • 17. If gradient descent gets stuck in local minima all the time, what does the optimization landscape look like?
  • 18. Landscape analysis •  Synthetic data based on real MSE2 data –  global minimum and solution (parameter values) are known. –  Measuring distance of the optimization solution to the starting position and the known solution. –  Measuring error reduction at the solution compared to the starting position.
  • 19. Landscape analysis
    Experiment   | Initial distance to solution (mean) | Final distance to solution (mean) | Error red. (mean)
    1% perturbed | 3.4·10^−4                           | 2.8·10^−4                         | 88%
    random       | 0.1                                 | 0.11                              | 97%
    Conclusion: many local minima.
    Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
  • 20. Does the model over-fit? •  Cross-validation (5-fold)
    Experiment | Mean RMS error (SE) | Mean CC (SE)
    training   | 13.39 (0.004)       | 0.92 (4.8·10^−5)
    testing    | 14.04 (0.005)       | 0.91 (5.7·10^−5)
    •  Redundancy reduction –  Not enough data to begin with
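A 5-fold cross-validation splits the data points into five disjoint parts and uses each part once for testing while training on the rest. The sketch below shows only the index bookkeeping; how the folds were actually drawn in the study is not stated on the slide, and the use of 400 points simply borrows the MSE2 data set size from slide 15 as an example.

```python
import random

def k_fold_indices(n_points, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_points))
    random.Random(seed).shuffle(idx)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

splits = list(k_fold_indices(400, k=5))
```

A model is then fitted on each `train` list and scored on the matching `test` list; a testing error clearly above the training error indicates over-fitting.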
  • 21. Summary: Optimization & Analysis •  The objective function is ill-posed. –  It has a plethora of local minima. –  It might have many global minima. •  Hence SA is the method of choice. •  There might be a tendency to over-fit the data. (Image sources: http://www2.cmp.uea.ac.uk/~aih/code/SVM/KernelTrickDemo.html, http://images.nciku.com/)
  • 22. Research Goals •  Optimize thermodynamic models efficiently •  Analyze robustness of these models •  Explore the regulation of a particular gene •  Examine how the regulatory program evolves •  Extend the current thermodynamic model
  • 23. Regulation and Evolution of eve •  The mechanism for regulating eve is conserved: –  Stripe 2 elements from other Drosophila species activate eve in D. mel. correctly. –  This holds despite the substantial difference in the regulatory DNA sequence. (Image source: http://www.bio.ilstu.edu/Edwards/) Hare, E. E. et al. Sepsid even-skipped enhancers are functionally conserved in Drosophila despite lack of sequence conservation. PLoS Genet, 2008, 4, e1000106
  • 24. Evaluate Evolution of MSE2 •  Test if the model can identify the MSE2 in these other species. •  Test if the model correctly predicts the transcriptional output of the homologous MSE2s.
  • 25. Searching for MSE2 •  Apply a model trained on D. mel. MSE2 to the TFBS-map from sequential windows to find the MSE2 in other species. [Figure: windows along the eve locus (MSE2, promoter, eve) of another species; for each window the predicted expression is compared to the target, giving a vector of RMS errors < 23 27 43 … 13 … >.] Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
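The window scan described on this slide can be sketched as follows: slide a fixed-width window along the sequence, hand the binding sites inside it to the trained model, and keep the window with the lowest RMS error against the target expression. The `predict` callable stands in for the trained model, and the site representation, window width, and toy data are assumptions made for illustration.

```python
import math

def rms_error(pred, target):
    """Root-mean-square difference between predicted and target expression."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

def scan_for_module(sites, seq_len, width, step, predict, target):
    """Score every window; return the best (start, error) and all scores.

    sites: list of (position, tf_name) tuples, a toy binding site map.
    predict: hypothetical stand-in for the trained model; it maps the sites
             inside one window to a predicted expression profile.
    """
    results = []
    for start in range(0, seq_len - width + 1, step):
        in_window = [s for s in sites if start <= s[0] < start + width]
        results.append((start, rms_error(predict(in_window), target)))
    best = min(results, key=lambda r: r[1])
    return best, results

# Toy example: the "model" predicts a flat profile equal to the site count.
toy_sites = [(5, "bcd"), (12, "hb"), (14, "kr"), (30, "gt")]
toy_predict = lambda in_window: [float(len(in_window))] * 2
best, results = scan_for_module(toy_sites, seq_len=40, width=10, step=5,
                                predict=toy_predict, target=[2.0, 2.0])
```

In the toy data only the window starting at position 10 contains exactly two sites, so it reproduces the target and is reported as the best candidate.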
  • 26. Searching for MSE2: Result •  Correctly identified the MSE2 in 6/8 species. [Figure: RMS error against genomic location, one panel per species; panels include D. melanogaster and D. pseudoobscura, and a third panel label is truncated to "rimshawi".] Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
  • 27. Predicting the output in other species •  Apply a model trained on D. mel. MSE2 to the MSE2s in other species. [Figure: log odds score (bits) against rel. genomic position for bicoid, kruppel, giant, hunchback, knirps, caudal, and tailless in D. melanogaster and D. mojavensis; relative RNA concentration against A-P position (%) for the target, D. melanogaster, D. pseudoobscura, D. ananassae, and D. mojavensis.] Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
  • 28. Summary: Application •  The model fits the data qualitatively. •  Predictions are biologically meaningful. •  However, there is room for improvement.
  • 29. Research Goals •  Optimize thermodynamic models efficiently •  Analyze robustness of these models •  Explore the regulation of a particular gene •  Examine how the regulatory program evolves •  Extend the current thermodynamic model
  • 30. One role fits them all? •  A dual function has been proposed for some of the regulatory TFs. –  E.g. the TF Hunchback (Hb) might be an activator when regulating stripe 2 and a repressor for stripe 3. [Figure labels: late1, 3+7, 2, P, late2, 4+6, 1, 5.] Papatsenko, D. & Levine, M. S. Dual regulation by the Hunchback gradient in the Drosophila embryo. Proc Natl Acad Sci U S A, 2008, 105, 2901-2906. Schroeder, M. D. et al. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol, 2004, 2, E271
  • 31. Determine the regulatory role of TFs •  Different data set: 44 CRMs important for D. mel. development, but the same set of TFs. •  Determine the best role for each TF in each of the CRMs –  Brute force: train a model for all TF role-combinations on each of the 44 CRMs. –  Record the correlation achieved. –  Identify TFs that have a dual function. Segal, E. et al. Predicting expression patterns from regulatory sequence in Drosophila segmentation. Nature, 2008, 451, 535-540. Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
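The brute-force step can be sketched as an enumeration over all activator/repressor assignments, which gives 2^n combinations for n TFs per CRM. In the sketch below, `score` is a hypothetical stand-in for training a model with the roles fixed and returning the correlation achieved; the CRM names and toy scores are invented for the example.

```python
from itertools import product

def best_role_assignment(tfs, score):
    """Try every activator/repressor combination and keep the best-scoring one."""
    best = None
    for roles in product(("activator", "repressor"), repeat=len(tfs)):
        assignment = dict(zip(tfs, roles))
        cc = score(assignment)  # hypothetical: train with these roles, return CC
        if best is None or cc > best[1]:
            best = (assignment, cc)
    return best

def dual_function_tfs(best_per_crm):
    """TFs whose best role is activator in one CRM and repressor in another."""
    seen = {}
    for assignment in best_per_crm.values():
        for tf, role in assignment.items():
            seen.setdefault(tf, set()).add(role)
    return sorted(tf for tf, roles in seen.items() if len(roles) == 2)

def make_toy_score(preferred):
    """Toy score: number of TFs whose role matches a made-up preferred role."""
    return lambda assignment: sum(assignment[tf] == r for tf, r in preferred.items())

tfs = ["Bcd", "Hb", "Kr"]
toy_scores = {
    "crm_a": make_toy_score({"Bcd": "activator", "Hb": "activator", "Kr": "repressor"}),
    "crm_b": make_toy_score({"Bcd": "activator", "Hb": "repressor", "Kr": "repressor"}),
}
best_per_crm = {crm: best_role_assignment(tfs, s)[0] for crm, s in toy_scores.items()}
dual = dual_function_tfs(best_per_crm)
```

In this toy setup only Hb switches role between the two CRMs, so it is the only TF flagged as dual-functioning.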
  • 32. TFs with dual role
    TF                     | Bcd | Cad | Hb | Tll | Gt  | Kr | Kni | TorRE
    Det. roles             | s   | +   | s  | -   | s   | s  | -   | s
    Literature (consensus) | +   | +   | s  | -   | (s) | s  | -   | NA
    “s”: dual-functioning, “+”: activator, “-”: repressor.
    •  E.g. Hb –  Activator for 17 CRMs –  Repressor for 27 CRMs
    Perkins, T. J. et al. Reverse engineering the gap gene network of Drosophila melanogaster. PLoS Comput Biol, 2006, 2, e51. Schroeder, M. D. et al. Transcriptional control in the segmentation gene network of Drosophila. PLoS Biol, 2004, 2, E271
  • 33. Improvement with dual function
    Experiment     | Number of free parameters | Mean CC (SE)
    Previous roles | 18                        | 0.27 (0.008)
    HbDual         | 19                        | 0.35 (0.009)
    KrDual         | 19                        | 0.37 (0.007)
    HbKrDual       | 20                        | 0.38 (0.007)
    [Figure: mRNA against AP position for the CRMs kr_CD1_ru, hb_anterior_actv, run_stripe5, and eve_37ext_ru; curves for target, previous roles, HbDual, KrDual, HbKrDual, and best.]
    Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
  • 34. Marker motifs for dual function •  Running MEME on the protein sequences of dual-functioning TFs to find short motifs (<6 aa) present in all of them. [Figure: two MEME sequence logos (bits against positions 1 to 4), one labelled as the SUMOylation motif.]
  • 35. SUMOylation •  Small Ubiquitin-related Modifier (SUMO): a small protein covalently attached to target proteins. •  Involved in many pathways/mechanisms –  Compartmentalisation –  Transcriptional regulation •  Can reverse the function of a TF, e.g. Ikaros (the human homologue of Kr). •  SUMO (Smt3) is present in D. mel. during development. [Figure: SUMO pathway with SUMO protease, ATP, E1 activating enzyme, E2 conjugating enzyme, E3 ligase, and target protein.] Bauer, D. C.; Buske, F. A.; Bailey, T. L. & Bodén, M. Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster. Neurocomputing, 2009, in submission. del Arco, P. G. et al. Ikaros SUMOylation: switching out of repression. Mol Cell Biol, 2005, 25, 2688-2697
  • 36. Conclusion
    •  Thermodynamic models are best optimized using SA, but over-fitting is an issue to keep in mind. Bauer, D. C. & Bailey, T. L. Optimizing static thermodynamic models of transcriptional regulation. Bioinformatics, 2009, 25, 1640-1646
    •  Nonetheless, they are applicable for –  examining the mechanisms of transcriptional regulation, –  exploring the evolution of a particular regulatory mechanism. Bauer, D. C. & Bailey, T. L. Studying the functional conservation of cis-regulatory modules and their transcriptional output. BMC Bioinformatics, 2008, 9, 220
    •  Model prediction improves when dual function is allowed. Bauer, D. C.; Buske, F. A. & Bailey, T. L. Dual functioning transcription factors regulated by SUMOylation in the developmental gene network of Drosophila melanogaster. Submitted for publication, 2009
    –  SUMOylation seems to be a good candidate for the biological mechanism of role-change. Bauer, D. C.; Buske, F. A.; Bailey, T. L. & Bodén, M. Predicting SUMOylation sites in developmental transcription factors of Drosophila melanogaster. Neurocomputing, 2009, in submission
  • 37. Acknowledgments
    •  IMB –  Timothy Bailey (supervisor) –  Mikael Bodén (supervisor) –  Sean Grimmond (thesis committee) –  Nick Hamilton (thesis committee) –  Fabian Buske –  Stefan Maetschke
    •  Stony Brook University –  John Reinitz
    •  Funding –  Institute for Molecular Bioscience, The University of Queensland –  Australian Research Council Centre of Excellence in Bioinformatics –  National Institutes of Health –  UQ International Research Tuition Award
  • 38. STREAM: www.bioinformatics.org.au/stream •  Framework for modeling, visualizing, and predicting the regulation of the transcription rate of a target gene. •  Publicly available •  Modular: new functions can be plugged in •  Many functions •  Command line. Bauer, D.C. and Bailey, T.L, STREAM - Static Thermodynamic REgulAtory Model for transcriptional. Bioinformatics, 2008, 24, 2544-2545.