Towards an understanding of diversity in biological and biomedical systems

Igor Zwir
Massive sequencing data analysis workshop
Granada 2011

Published in: Technology

  1. Data analysis workshop for massive sequencing data. Towards an understanding of diversity in biological and biomedical systems. Igor Zwir. Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain; Howard Hughes Medical Institute, Yale School of Medicine, New Haven, CT, US; Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, US. E-mail: zwiri@psychiatry.wustl.edu
  2. “Some people enjoy reading papers, juggling possibilities and formulating ideas, even if they can’t work a pipette” (“Reasoning for results”, Nature, Bray, D., 2001). “Some people enjoy reading papers, juggling possibilities and formulating ideas, even if they can’t write a line of a computer program” (“Reasoning for results”, Groisman Lab, 2007).
  3. “…organisms of the most different sorts are constructed from the very same battery of genes. The diversity of life forms results from small changes in the regulatory systems that govern expression of these genes.” François Jacob, in Of Flies, Mice and Men
  4. Salmonella: a Gram-negative pathogen with a varied lifestyle
  5. Signal transduction cascade by two-component regulatory systems. Signal: low Mg2+. Sensor: PhoQ. Regulator: PhoP-PO3. Effectors: mgtA, mgtB. Response: Mg2+ transport (both effectors).
  6. Two-component systems regulate physiological and virulence functions
     System | Signal | Function
     ArcA/ArcB | Quinones | Anaerobic respiration
     OmpR/EnvZ | Osmolarity changes | Osmoadaptation
     NtrB/NtrC | Low nitrogen levels | Nitrogen metabolism
     PhoP/PhoQ | Low Mg2+ | Virulence, growth in low Mg2+
     PmrA/PmrB | Fe3+ and Al3+ | Resistance to polymyxin B
     SsrA/SpiR | Unknown | Virulence
     TtrR/TtrS | Tetrathionate | Anaerobic respiration
  7. The Salmonella PmrA/PmrB system responds to Fe3+ and low Mg2+. Diagram labels: low Mg2+, high Fe3+; PhoQ, PmrB; PhoP-PO3, PmrA-PO3; pmrD, PmrD; pbgP; LPS modification.
  8. The E. coli PmrA/PmrB system responds to Fe3+ but not to low Mg2+. Diagram labels: low Mg2+, high Fe3+; PhoQ, PmrB; PhoP-PO3, PmrA-PO3; pmrD, PmrD; pbgP; 85.4%, 93.3%; LPS modification.
  9. The Salmonella but not the E. coli ugd gene is regulated by the PhoP protein. Diagram labels (one set per species): PhoQ, PhoP-PO3, ugd; 85.4%, 93.3%, 85.5% (the median amino acid identity between Salmonella and E. coli proteins is 90%).
  10. The PhoP-PhoQ two-component system regulates 5% of Salmonella genes. Consensus motif: Salmonella LT2 & E. coli K12.
  11. Single motif vs. a family of PhoP submotifs. Labels: +Sensitivity, +Specificity, +Specificity. (Harari et al., PLoS Computational Biology, 2010)
  12. PhoP submotifs improve binding site (BS) detection: 26 BS
  13. Genome-wide analysis: custom tiling arrays and ChIP assays
  14. Evolution of submotifs throughout the Gamma/Enterobacteria. Labels: S01, S05; Information content; PhoP (Halpern-Bruno); Background (HKY85 model). (Perez et al., PLoS Genetics, 2009; Harari et al., PLoS Computational Biology, 2010)
  15. The submotifs and the PhoP protein evolve at correlated rates
  16. In vitro affinities correlate well with the top three families of submotifs
  17. Labels: +, -. (Zwir et al., PNAS, 2005; Zwir et al., Bioinformatics, 2005; Harari et al., BMC Bioinformatics, 2009)
  18. Submotifs & distances from the RNAP binding site. Labels: Close, Medium, Remote; 45%, 21%. (Harari et al., PLoS Computational Biology, 2010)
  19. Two closely related species show distinct promoter preferences. Labels: Close, Medium, Remote. Submotifs & distances can distinguish Salmonella & E. coli.
  20. Two distantly related species show distinct promoter architectures
  21. PhoP-activated genes are bound and transcribed at different times and levels
  22. Predicting gene binding and transcription of PhoP-regulated targets: ancestral, horizontally acquired
  23. 23. SummaryTF Affinity for its binding sites determine promotertime and levels in naked DNABinding and Transcription in vivo depends on wherethe binding sites sit (promoter architectures)Cis-acting features in the PhoP-activated promotersdetermine non-arbitrary organized architecturesThe differences of the regulon througout distinctspecies depends on the evolution of the binding sitesand promoter architectures
  24. Two paradigms: multiple genes with small effect, or few genes with large effect. Labels: London Metro, Boston Metro. (de Vries, Nature Medicine, 2009)
  25. Phenotypic-genotypic relations describe a risk surface of Schizophrenia. R19: 6 affected, 1 relative. R10: 11 affected, 6 relatives. (Gottesman II, Gould TD. Am J Psychiatry, 2003) 0.1% of the population affected; multigenic disease; non-genetic contributions; risk: monozygotic twins 50%, dizygotic twins 15%.
  26. Uncovering genotype-phenotype relations by independently clustering both domains. Trios (affected, relatives and controls). Phenotype clusters: subjects; 70 clinical attributes (cognitive, motor, behavioral, structural). Genotype clusters: subjects; SNP chips.
  27. Identifying significant genotype-phenotype relations among inter-domain clusters. Labels: 0.01, 1E-10. (Romero-Zaliz et al., Nucleic Acids Research, 2008; Romero-Zaliz et al., IEEE Trans. on Evol. Computation, 2008; de Erausquin et al., Mol. Psych., in press)
  28. Phenotype relations
  29. Genotype relations ~=
  30. Optimal (multiobjective/multimodal) relations are hierarchically organized
  31. Relations reflect the risk of Schizophrenia. First-degree relatives have a genetic predisposition.
  32. Validation using an independent set of subjects
     Relation | Risk (%) | Subject IDs (columns on the slide: Affected, Relative, Control)
     R22 | 91 | 10164 10170
     R19 | 88 | 10155 10192
     R05 | 61 | 10184
     R06 | 57 | 10156
     R11 | 32 | 10181
     R30 | 28 | 20148 10127
     R29 | 17 | 10198 10158 10165
     R24 | 9 | 10193 10151 10166
     R25 | 1 | 10157
  33. Qualitative significance of learned SNPs. Pathway analysis. Process for Neurological Disease.
  34. Neuronal cell adhesion pathway derived from the genotype domain of the relations
  35. Novel pathways: oxidative stress and epigenetic control of gene expression
  36. 36. SummaryWe proposed the first data-driven definition of the Schizophrenia riskfunctionConcurrent CGWAS provides a panoramic vision of phenotype-genotype associations, each of which can be used by traditionalGWAS analysisFour signaling pathways associated with risk of schizophrenia wereidentifiedPhenotype-genotype relations were sufficient to reliably predictsubject statusThis finding opens the door for early detection and preventativeintervention prior to the onset of psychotic symptoms inhigh/intermediate risk populations
  37. Acknowledgements
     • Eduardo Groisman Lab, Howard Hughes Medical Institute: Dongwoo Shin, Christian Perez
     • Henry Huang Lab, Dept. of Molecular Microbiology, Washington U. School of Medicine, USA
     • Gabriel de Erausquin Lab, Departments of Psychiatry and Neurology, Harvard Med. School
     • Dept. of Computer Science and Artificial Intelligence, University of Granada, Spain: Coral del Val, Pat Anders, Javier Arnedo, Luis Miguel Merino
     • Rocio Romero-Zaliz (U. de Granada), Cristina Rubio-Escudero (U. Seville), Christopher Previti (U. Bergen), Oscar Harari (Washington U.)
  38. Acknowledgments
     • Mining for Modeling Lab, DECSAI, University of Granada
     • Francisco Herrera, DECSAI, University of Granada
     • Coral del Val, DECSAI, University of Granada
     • Gabriel de Eraúsquin, Department of Psychiatry, Washington University in St. Louis
     • Igor Zwir, DECSAI, University of Granada
     • Eduardo Groisman, HHMI, Department of Molecular Biology, Washington University in St. Louis
     • Kathleen Marchal, Department of Microbial and Molecular Systems, Katholieke Universiteit Leuven
     • Henry Huang, Department of Molecular Biology, Washington University in St. Louis
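
The slide on the evolution of submotifs plots "information content" for submotifs S01 and S05 without saying how it is computed. The sketch below shows one common definition, the per-position Shannon information content of a set of aligned binding sites measured against a uniform background. It is an illustration under stated assumptions, not the authors' pipeline: the function name `information_content` and the toy sites are hypothetical, and the Halpern-Bruno and HKY85 models named on the slide are not implemented here.

```python
import math
from collections import Counter


def information_content(sites, alphabet="ACGT"):
    """Return the information content (in bits) of each alignment column.

    Assumes equal-length, uppercase, gap-free sites and a uniform
    background, so a fully conserved DNA column scores log2(4) = 2 bits.
    """
    if not sites:
        raise ValueError("at least one site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")

    max_bits = math.log2(len(alphabet))
    scores = []
    for col in range(length):
        counts = Counter(site[col] for site in sites)
        entropy = 0.0
        for base in alphabet:
            p = counts.get(base, 0) / len(sites)
            if p > 0:
                entropy -= p * math.log2(p)
        scores.append(max_bits - entropy)
    return scores


# Toy aligned sites, made up for illustration (not real PhoP boxes).
toy_sites = ["ACGT", "ACGA", "ACCT", "ACGT"]
for position, bits in enumerate(information_content(toy_sites), start=1):
    print(f"position {position}: {bits:.3f} bits")
```

Columns where every site agrees score the maximum of 2 bits, while columns with mixed bases score lower, which is why a family of submotifs can be sharper than a single consensus built from all sites pooled together.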

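The slides on genotype-phenotype relations describe clustering the two domains independently and then scoring how significant the overlap between a phenotype cluster and a genotype cluster is, with values between 0.01 and 1E-10 shown on the slide. The deck does not name the statistic, so the sketch below assumes a one-sided hypergeometric test on shared subjects, which is a common choice for this kind of overlap. The function name `overlap_pvalue` and the subject identifiers are hypothetical.

```python
from math import comb


def overlap_pvalue(n_subjects, phenotype_cluster, genotype_cluster):
    """P(overlap >= observed) under a hypergeometric null.

    Assumption: both clusters are subsets of the same n_subjects, and
    membership in one clustering is independent of the other under the null.
    """
    pheno = set(phenotype_cluster)
    geno = set(genotype_cluster)
    if len(pheno) > n_subjects or len(geno) > n_subjects:
        raise ValueError("a cluster cannot be larger than the population")

    observed = len(pheno & geno)
    upper = min(len(pheno), len(geno))
    # Count the ways to draw len(geno) subjects with at least `observed`
    # of them falling inside the phenotype cluster.
    tail = sum(
        comb(len(pheno), i) * comb(n_subjects - len(pheno), len(geno) - i)
        for i in range(observed, upper + 1)
    )
    return tail / comb(n_subjects, len(geno))


# Hypothetical example: 20 subjects, two clusters of 5 sharing 4 subjects.
phenotype = ["s0", "s1", "s2", "s3", "s4"]
genotype = ["s0", "s1", "s2", "s3", "s10"]
print(f"p = {overlap_pvalue(20, phenotype, genotype):.4g}")
```

Because many cluster pairs are tested at once, a real analysis would also need a multiple-testing correction; that step is left out of this sketch.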