Published in: Technology, Education


  1. 1. L. ARAVIND National Center for Biotechnology Information Apprehending Life’s complexity: Making and communicating biological discoveries
  2. 2. We are becoming meme and teme machines! It is all about replicators: biological and otherwise The good old genes Memes Temes
  3. 3. Summary of issues <ul><li>Discovery in biology </li></ul><ul><ul><li>Different philosophies: Natural history versus “hypothesis driven science” </li></ul></ul><ul><ul><li>Evolutionary theory and computation as a bridge between the philosophical antipodes </li></ul></ul><ul><ul><ul><li>Example of the PAS domain </li></ul></ul></ul><ul><ul><li>A rich breeding ground for memes </li></ul></ul><ul><li>Levels of organization in the living world and its complexity </li></ul><ul><ul><li>Microscopic, mesoscopic and macroscopic world views need integration </li></ul></ul><ul><ul><li>Seeking gold at the end of the maze </li></ul></ul><ul><ul><li>Following natural order: hierarchies and networks </li></ul></ul><ul><ul><ul><li>Examples of classifications and hierarchies </li></ul></ul></ul><ul><li>The meme machine: transmission of discoveries </li></ul><ul><li>Databases and search tools </li></ul><ul><li>Scientific collaboration and competition </li></ul><ul><li>Journal systems </li></ul>
  4. 4. The two philosophies in biology Natural history: discovery of new forms, cataloguing and classification Hypothesis-> attempt at falsification->paradigms: Popper’s world view Largely a history of clash or neglect
  5. 5. Building the bridge: Evolutionary theory and computation + . . . . <ul><li>Sequence profile analysis </li></ul><ul><li>Structure similarity comparisons </li></ul><ul><li>Contextual analysis </li></ul>Understanding and predicting protein (biomolecule) function Systems biology: Ensembles of biomolecules in functional guilds The “omics” (regular and meta): From sequence to organismal biology and ecology
  6. 6. Early domain universe The protein universe shows enormous diversity but an underlying unity <ul><li>These relationships are powerful predictors of protein evolution, function and behavior </li></ul><ul><li>The largest assemblage of homologous domains that can be unified by sequence features is formally a superfamily </li></ul><ul><li>Several superfamilies may share a common folding pattern and arrangement of secondary structure elements: these are unified into a fold </li></ul>
  7. 7. ALL LIFE FORMS BACTERIA ARCHAEA EUKARYA <ul><li>The ribosome, and the associated enzymes like some RNases (including RNase HII), PseudoU synthases, RNA methylases, thioU synthases, Clamp loader ATPase, RecA, RNA polymerases, translation GTPases, AATRS, ABC, MinD ATPases, OSGP like chaperone/protease, PCNA, DNA ligases, rRNA and tRNAs </li></ul>DNA polymerases, Holliday junction resolvases, Primases, Replicative Helicases, Origin recognition complexes Ribozymes are well-known: so an RNA world of sorts must have existed There was a common ancestor of all life; the main functions of this life form revolved around RNA metabolism and translation; some cellular functions related to DNA had developed but modern DNA replication “crystallized” later So there was an RNA-centered ancestral form with a possible DNA intermediate in replication Unifying life and inferring the common ancestor
  8. 8. Getting behind biological clocks, photodetectors and oxygen sensors Regulation of circadian rhythms in animals Periodic growth and sporulation in fungi Light regulated expression of photosynthetic pigments Oxygen-seeking behavior in aerobic bacteria A master regulator of the clock: the period protein (per) WC-1 and WC-2: two light sensory regulators of gene expression in Neurospora BAT: a regulator of photosynthetic pigment expression The aerotaxis receptor of E. coli and other bacteria
  9. 9. The PAS domain <ul><li>A ligand binding domain which binds diverse ligands like heme, tetrahydropyrrole and flavin nucleotides </li></ul><ul><li>Thus, it can sense diverse stimuli like light, redox or both </li></ul><ul><li>Transmits this stimulus to a diverse range of other “effector” domains </li></ul>Ponting CP, Aravind L. PAS: a multifunctional domain family comes to light. Curr Biol. 1997 Nov 1;7(11):R674-7. [Figure: domain architectures of PAS-containing proteins; labels: PER, SIM, WC-1, Transcription; domains shown: PAS, bHLH, AAA+, HTH, GATA, C6]
  10. 10. PAS PAS S/T-Kinase GAF GAF Adenylyl cyclase PAS GAF PAS PAS ERG-channels: redox sensing in animal hearts Phytochrome: Light sensing in plants and bacteria Signaling intracellular redox states Small-molecule based regulation of signaling enzymes Birth of a meme… <ul><li>Detection of the PAS domain allows a definitive functional prediction </li></ul><ul><li>The mechanisms of critical molecules across the entire diversity of life could be predicted </li></ul><ul><li>It was a very successful meme indeed: 887 publications following up on the original characterization and function prediction of the PAS domain have emerged since – around 80-90 per year. </li></ul>The predictions: H-kinase
  11. 11. Overview of biological complexity Discovery and classification of domains Mesoscopic Characterization of biological functional systems Function prediction & classification Microscopic Computational analysis of whole biological systems or networks Reconstructing organismal biology and whole ecosystems Macroscopic Evolutionary trajectories: Genomes to Biology
  12. 12. Eukaryotic signaling proteins show non-linear scaling with proteome size… However, major superfamilies of signaling proteins show largely linear trends: invention of many lineage-specific systems independent of the large superfamilies Deviations point to important functional adaptations: convergent evolution of LRR+kinase architectures
  13. 13. (Prolyl hydroxylases) Rs Pbcv1 Dm Arabidopsis Drosophila Lineage specific expansion of a domain family Definition: The increase in numbers of a domain in particular lineage with respect to its number in sister reference lineage Homo
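The definition on this slide compares the number of copies of a domain in one lineage with the number in a sister reference lineage. A minimal sketch of that comparison follows. The function names, the pseudocount, the threshold and all counts are hypothetical choices made for illustration; none of them come from the slides.

```python
from typing import Dict, List


def expansion_ratio(count_in_lineage: int, count_in_sister: int,
                    pseudocount: float = 1.0) -> float:
    """Copies of a domain in a lineage relative to its sister reference lineage.

    The pseudocount is an assumption of this sketch: it avoids division by
    zero when the domain is absent from the sister lineage.
    """
    return (count_in_lineage + pseudocount) / (count_in_sister + pseudocount)


def expanded_families(lineage_counts: Dict[str, int],
                      sister_counts: Dict[str, int],
                      threshold: float = 3.0) -> List[str]:
    """Sorted names of families whose ratio meets the (arbitrary) threshold."""
    return sorted(
        family
        for family, count in lineage_counts.items()
        if expansion_ratio(count, sister_counts.get(family, 0)) >= threshold
    )


# Hypothetical domain counts, for illustration only.
lineage = {"family_A": 19, "family_B": 4, "family_C": 6}
sister = {"family_A": 3, "family_B": 4}
print(expanded_families(lineage, sister))
```

A real analysis would probably also normalize for proteome size and test whether the difference is statistically meaningful; this sketch only makes the ratio in the definition concrete.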
  14. 14. Section of the contextual network for the Ub pathway LF LF WLM UB WLM PUG WLM UB LF LF WLM WLM PNGase Thioredoxin PAW PNGase PAW PUG UBA PUG PPPDE PUL DOMAIN Thioredoxin PPPDE UB OTU-DUB UB C 2 H 2 OTU-DUB UB Asp Protease UBA Thioredoxin X UBX UBX Thioredoxin ZZ finger UBA PUL DOMAIN WD40 LF LF LF LF LF Calpain A 2 0 Z n F UB/ UBX UBCH LF C 2 H 2 - U An1 ZnF OTU-DUB PNGase RAB-GEF PUL DOMAIN Thioredoxin PPPDE Asp Protease PAW ZZ finger WLM (metallopeptidase) Yif1 TM TM TM TM TM // RAB WD40 Calpain E2 // UBA * * Predicted DUB * * * * *
  15. 15. Domain architectural “complexity” of eukaryotic signaling proteins <ul><li>Complexity can vary drastically even between sister lineages: parasitism causes a general fall in complexity </li></ul><ul><li>The complexity in free-living forms is high in the chromalveolate+crown group clade. </li></ul><ul><li>Multicellularity and cellular complexity resulted in increases in domain architectural complexity but clearly the increase was greatest in the animal lineage alone. </li></ul><ul><li>Fungi as a whole show a reduction of complexity concomitant with their gene loss with respect to the ancestor of the crown group lineage. </li></ul>
  16. 16. Biology of Networks Nodes Links Interaction A B Network Proteins Physical Interaction Protein-Protein A B Protein Interaction Metabolites Enzymatic conversion Protein-Metabolite A B Metabolic Transcription factor Target genes Transcriptional Interaction Protein-DNA A B Transcriptional
  17. 17. Datasets E. coli transcriptional regulatory network: 112 TFs, 711 TGs, 1295 interactions (small-scale biochemical experiments) Yeast transcriptional regulatory network: 157 TFs, 4410 TGs, 12873 interactions (large-scale ChIP-chip experiments and genetic deletion and over-expression data)
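As a rough illustration of what these dataset sizes imply, the sketch below derives the average number of target genes per transcription factor and the average number of regulators per target gene from the counts on the slide. The class and method names are hypothetical; only the six numbers come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegulatoryNetwork:
    name: str
    n_tfs: int           # transcription factors (TFs)
    n_targets: int       # target genes (TGs)
    n_interactions: int  # TF -> TG regulatory links

    def mean_targets_per_tf(self) -> float:
        """Average out-degree of a transcription factor."""
        return self.n_interactions / self.n_tfs

    def mean_regulators_per_target(self) -> float:
        """Average in-degree of a target gene."""
        return self.n_interactions / self.n_targets


# Counts as given on the slide.
ecoli = RegulatoryNetwork("E. coli", 112, 711, 1295)
yeast = RegulatoryNetwork("Yeast", 157, 4410, 12873)

for net in (ecoli, yeast):
    print(f"{net.name}: {net.mean_targets_per_tf():.1f} targets per TF, "
          f"{net.mean_regulators_per_target():.1f} regulators per target")
```

These are averages only. In a network with a few highly connected hubs, most transcription factors would regulate far fewer genes than the mean suggests.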
  18. 18. Scale-free structure Presence of few nodes with many links and many nodes with few links Transcriptional networks are scale-free Scale-free structure provides robustness to the system Albert & Barabasi, Rev Mod Phys (2002) $N(k) \propto 1/k^{\gamma}$
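The scale-free property on this slide says that the number of nodes N(k) with k links falls off as a power of k. One simple way to check this on an edge list is to tabulate N(k) and fit a straight line to log N(k) against log k. The sketch below does that with the standard library only. The function names and the toy data are hypothetical, and a least-squares fit on log-log axes is a crude estimator used here for brevity, not the method used in the talk.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Iterable, Tuple


def degree_counts(edges: Iterable[Tuple[Hashable, Hashable]]) -> Counter:
    """Map each degree k to N(k), the number of nodes with k links.

    Edges are treated as undirected: each edge adds one link to both ends.
    """
    degree: Counter = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())


def loglog_slope(nk: Dict[int, int]) -> float:
    """Least-squares slope of log N(k) versus log k.

    For N(k) proportional to k**(-gamma), the slope approximates -gamma.
    """
    xs = [math.log(k) for k in nk]
    ys = [math.log(n) for n in nk.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Toy star network: one hub linked to three peripheral nodes.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]
print(degree_counts(star))  # three nodes of degree 1, one node of degree 3

# Synthetic counts that follow N(k) = 64 / k**2 exactly.
exact = {1: 64, 2: 16, 4: 4, 8: 1}
print(loglog_slope(exact))  # close to -2, i.e. gamma of about 2
```

On real data the tail of the distribution is noisy, so maximum-likelihood fitting is generally preferred over a log-log regression.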
  19. 19. Crp NarL Crp NarL E. coli H. influenzae B. pertussis NarL Crp Regulatory hubs which are condition specific can be either lost or replaced The same protein in organisms with different lifestyles may confer different adaptive value. Hence it may emerge as a regulatory hub in the organism to which it confers high adaptive value and not in the others Different proteins should emerge as hubs in organisms with different lifestyles
  20. 20. Apprehending the diversity of eukaryotes “crown group” Most studied “microbial eukaryotes” Most diverse and prevalent animals fungi Slime molds plants Chlorophytes rhodophytes diatoms Heteroloboseans parbasalids Diplomonads Euglenozoa ciliates Apicomplexans
  21. 21. Some notable associations that might favor inter-eukaryotic gene flow Primary endosymbiosis with cyanobacterium Secondary endosymbiosis with different plant lineages Plant lineages Karyoklepty (e.g. ciliates) Endosymbiosis Engulfment Parasitic nucleus Nuclear invasion Karyoparasitism (e.g. Rhodophytes) Endoparasitism (e.g. apicomplexa)
  22. 22. Composite selves: bacterial origins for Vitamin B12 receptors <ul><li>We discovered a novel domain that forms the common denominator for Vitamin B12 binding and recognition in both bacteria and animals. This helped us understand how B12 is taken up by animal guts </li></ul><ul><li>Domain architectures and unusual phyletic distribution of this domain strongly suggested a bacterial origin for the primary animal Vitamin B12 receptor </li></ul>
  23. 23. The medium for biological discovery The Dali Database <ul><li>BLAST </li></ul><ul><li>PSI-BLAST </li></ul><ul><li>HMMER </li></ul><ul><li>HHPRED </li></ul><ul><li>DALI </li></ul><ul><li>MUSTANG </li></ul><ul><li>KALIGN </li></ul><ul><li>MUSCLE </li></ul><ul><li>… . </li></ul>Labs (including “Omics” centers) Primary archival databases Search methods and strategies Secondary databases Journals Lost in the black hole
  24. 24. Sociology of the process: Complexity, competition and currency Complexity <ul><li>Dispersion of efforts </li></ul><ul><li>Lack of integration </li></ul>Gold rush for the “hot” issues Publications seen as currency in scientific community <ul><li>Intense competition </li></ul><ul><li>Secrecy and strife </li></ul>Transmission of discoveries is hampered Can we / should we intercede? Increased Collaboration
  25. 25. Genes: Natural selection; scientific memes: peer review? Does the axe of peer review, as it stands, hamper effective scientific transmission? <ul><li>Great science was done without modern-style peer review </li></ul><ul><li>Long delays in publishing - damaging in a competitive scientific environment </li></ul><ul><li>Inane reviews with hardly any constructive value </li></ul><ul><li>Nitpicking – surely a primate instinct, but does it help in science? </li></ul><ul><li>Obstructionists: peer review as a tool against competitors </li></ul><ul><li>Closed one-sided process </li></ul><ul><li>Crackpot science: What do we do about it? </li></ul><ul><li>Enormous volume of scientific production: strain on referees and journal editors </li></ul><ul><li>Constructive criticism helps! </li></ul><ul><li>Open peer review system: A viable compromise? </li></ul><ul><li>A test case for the model: Biology Direct at BMC journals </li></ul>
  26. 26. Conclusions Given the “special” interests: 1) Journals and publishers 2) Evaluation of scientists by host institutions 3) Triaging scientific publications 4) Allocating funds for biological research 5) Need to bar crackpots Given the competition: 1) Blogs 2) Wikis 3) Open access, open peer-review etc. 4) The ubiquity of the internet 5) The drive from the memes and temes! Will out-of-the-box thinking help?