Scholarly Communication for Bioinformatics Students

Presentation given on February 28, 2011, to the incoming bioinformatics and systems biology students at UCSD on how they could get involved in changing scholarly communication.

  1. 1. The Changing Face of Scholarly Communication and the Opportunities it Affords the Bioinformatics/Systems Biology Student<br />Philip E. Bourne<br />University of California San Diego<br />pbourne@ucsd.edu<br />http://www.sdsc.edu/pb<br />Third UCSD Bioinformatics and Systems Biology Expo – 2/28/2011<br />
  2. 2. Observation 1: Everyone in this Room is Driven by One Thing Above All Else<br />
  3. 3. Observation 2: We Are a Field That Uses/Produces Public On-Line Data Like No Other <br />
  4. 4. Observation 3: We Have Shaped the Way Data Are Shared – We Have Had Very Little Impact on Publications<br />
  5. 5. Perhaps it is Time We Thought Less About a Publication as a Reward and More About How it Can be Presented to Maximize its Use<br />
  6. 6. So What Needs to Happen<br />We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives<br />We need to be more open with both<br />We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery<br />Reward systems need to change<br />We need scientist management tools<br />We need to be less fixated on the big data problems<br />We need to unleash the full power of the Internet<br />Hard<br />Easy<br />
  7. 7. One Personal Example of Why This Needs to Happen Now<br />
  8. 8. Josh Sommer – A Remarkable Young Man<br />Co-founder & Executive Director of the Chordoma Foundation<br />http://sagecongress.org/Presentations/Sommer.pdf<br />
  9. 9. Chordoma<br />A rare form of brain cancer<br />No known drugs<br />Treatment – surgical resection followed by intense radiation therapy<br />http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG<br />
  10. 10. http://sagecongress.org/Presentations/Sommer.pdf<br />
  11. 11. http://sagecongress.org/Presentations/Sommer.pdf<br />
  12. 12. http://sagecongress.org/Presentations/Sommer.pdf<br />
  13. 13. If I have seen further it is only by standing on the shoulders of giants<br />Isaac Newton<br />From Josh’s point of view the climb up just takes too long<br />> 15 years and > $850M to be more precise<br />Adapted: http://sagecongress.org/Presentations/Sommer.pdf<br />
  14. 14. http://sagecongress.org/Presentations/Sommer.pdf<br />
  15. 15. http://sagecongress.org/Presentations/Sommer.pdf<br />
  16. 16. http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation<br />
  17. 17. So We Have Seen What Needs to Change and Why. What About the How?<br />
  18. 18. We Need Data and Knowledge About That Data to Interoperate<br />The Knowledge and Data Cycle<br />0. Full text of PLoS papers stored in a database<br />1. A link brings up figures from the paper<br />2. Clicking the paper figure retrieves data from the PDB which is analyzed<br />3. A composite view of journal and database content results<br />4. The composite view has links to pertinent blocks of literature text and back to the PDB<br />User clicks on content<br />Metadata and webservices to data provide an interactive view that can be annotated<br />Selecting features provides a data/knowledge mashup<br />Analysis leads to new content I can share<br />PLoS Comp. Biol. 2005 1(3) e34<br />
  19. 19. We Need Data and Knowledge About That Data to Interoperate – What is Stopping Us?<br />Open Access<br />Governance – publishers vs. database providers<br />Reward<br />Metadata standards for provenance, privacy etc.<br />Exemplars<br /> ….<br />
  20. 20. A Small Example - The World Wide Protein Data Bank<br />The single worldwide repository for data on the structure of biological macromolecules<br />Vital for drug discovery and the life sciences<br />39 years old<br />Free to all<br />http://www.wwpdb.org<br />We need data and knowledge about that data to interoperate<br />PLoS Comp. Biol. 2005 1(3) e34<br />
  21. 21. The World Wide Protein Data Bank – The Best Case Scenario<br />Paper not published unless data are deposited – strong data to literature correspondence<br />Highly structured data conforming to an extensive ontology<br />DOIs assigned to every structure<br />http://www.wwpdb.org <br />We need data and knowledge about that data to interoperate<br />PLoS Comp. Biol. 2005 1(3) e34<br />
  22. 22. Example Interoperability: The Database View<br />www.rcsb.org/pdb/explore/literature.do?structureId=1TIM<br />We need data and knowledge about that data to interoperate<br />BMC Bioinformatics 2010 11:220<br />
  23. 23. Example Interoperability: The Literature View<br />http://biolit.ucsd.edu<br />Nucleic Acids Research 2008 36(S2) W385-389<br />We need data and knowledge about that data to interoperate<br />
  24. 24. ICTP Trieste, December 10, 2007<br />We need data and knowledge about that data to interoperate<br />
  25. 25. Semantic Tagging & Widgets are a Powerful Tool to Integrate Data and Knowledge of that Data, But as Yet Not Used Much<br />Will Widgets and Semantic Tagging Change Computational Biology? <br />PLoS Comp. Biol. 6(2) e1000673<br />We need data and knowledge about that data to interoperate<br />
  26. 26. Semantic Tagging of Database Content in The Literature or Elsewhere<br />http://www.rcsb.org/pdb/static.do?p=widgets/widgetShowcase.jsp<br />PLoS Comp. Biol. 6(2) e1000673<br />Semantic Tagging<br />
  27. 27. We need data and knowledge about that data to interoperate<br />
  28. 28. The Publishers are Starting to Do It<br />From Anita de Waard, Elsevier <br />
  29. 29. This is Literature Post-processing<br />Better to Get the Authors Involved<br />Authors are the absolute experts on the content<br />More effective distribution of labor<br />Add metadata before the article enters the publishing process<br />We need data and knowledge about that data to interoperate<br />
  30. 30. Word 2007 Add-in for authors<br />Allows authors to add metadata as they write, before they submit the manuscript<br />Authors are assisted by automated term recognition<br />OBO ontologies<br />Database IDs<br />Metadata are embedded directly into the manuscript document via XML tags, OOXML format<br />Open<br />Machine-readable<br />Open source, Microsoft Public License<br />http://www.codeplex.com/ucsdbiolit<br />We need data and knowledge about that data to interoperate<br />
  31. 31. Challenges<br />Authors <br />Carrot: if one or more publishers fast-tracked papers that had semantic markup, it might catch on<br />Publishers<br />Carrot: competitive advantage<br />We need data and knowledge about that data to interoperate<br />
  32. 32. The Promise – A Hypothetical Example<br />Cardiac Disease Literature<br />Immunology Literature<br />Shared Function<br />We need data and knowledge about that data to interoperate<br />
  33. 33. High-throughput Biology Requires High-throughput Knowledge Discovery<br />Consider an Example from Our Own Work…<br />Roger Chang Will Give You Another Example<br />
  34. 34. The TB-Drugome<br />1. Determine the TB structural proteome<br />2. Determine all known drug binding sites from the PDB<br />3. Determine which of the sites found in 2 exist in 1<br />4. Call the result the TB-drugome<br />Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976<br />High-throughput Data Requires High-throughput Knowledge<br />
  35. 35. 1. Determine the TB Structural Proteome<br />TB proteome: 3,996<br />Solved structures: 284<br />Homology models: 1,446<br />No structure or model: 2,266<br />High quality homology models from ModBase (http://modbase.compbio.ucsf.edu) increase structural coverage from 7.1% to 43.3%<br />Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976<br />
  36. 36. 2. Determine all Known Drug Binding Sites in the PDB<br />Searched the PDB for protein crystal structures bound with FDA-approved drugs<br />268 drugs bound in a total of 931 binding sites<br />[Plot: no. of drugs vs. no. of drug binding sites; labeled drugs: acarbose, darunavir, alitretinoin, conjugated estrogens, chenodiol, methotrexate]<br />Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976<br />
  37. 37. Map 2 onto 1 – The TB-Drugome<br />http://funsite.sdsc.edu/drugome/TB/<br />Similarities between the binding sites of M.tb proteins (blue) and binding sites containing approved drugs (red).<br />
  38. 38. From a Drug Repositioning Perspective<br />Similarities between drug binding sites and TB proteins are found for 61/268 drugs<br />41 of these drugs could potentially inhibit more than one TB protein<br />[Plot: no. of drugs vs. no. of potential TB targets; labeled drugs: conjugated estrogens & methotrexate, chenodiol, levothyroxine, testosterone, raloxifene, alitretinoin, ritonavir]<br />Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976<br />
  39. 39. Top 5 Most Highly Connected Drugs<br />
  40. 40. We Need Better Ways to Associate Data and Knowledge and it's More Than Just Text Mining of PubMed Abstracts – It's About Changing the System<br />Our Future is in Your Hands!<br />
  41. 41. Acknowledgements<br />BioLit Team<br />Lynn Fink<br />Parker Williams<br />Marco Martinez<br />Rahul Chandran<br />Greg Quinn<br />Microsoft Scholarly Communications<br />Pablo Fernicola<br />Lee Dirks<br />Savas Parastatidis<br />Alex Wade<br />Tony Hey<br />RCSB PDB team<br />Andreas Prlić<br />Dimitris Dimitropoulos<br />TB Drugome Team<br />Lei Xie<br />Sarah Kinnings<br />Li Xie<br />http://funsite.sdsc.edu/drugome/TB/<br />http://biolit.ucsd.edu<br />http://www.pdb.org<br />http://www.codeplex.com/ucsdbiolit<br />
  42. 42. pbourne@ucsd.edu<br />Questions?<br />
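
A worked check of the structural coverage figures quoted on slide 35 may help readers of this transcript. The sketch below is a minimal illustration, not part of the presentation: the variable and function names are hypothetical, and it assumes the slide's counts map to a TB proteome of 3,996 proteins, 284 solved structures, and 1,446 homology models, with coverage taken as a simple fraction of the proteome.

```python
# Counts as they appear on slide 35. Which number belongs to which
# label is an assumption inferred from the slide text.
TB_PROTEOME = 3996        # assumed: total proteins in the TB proteome
SOLVED_STRUCTURES = 284   # assumed: proteins with a solved structure
HOMOLOGY_MODELS = 1446    # assumed: proteins with a high-quality homology model


def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total as a percentage rounded to one decimal place."""
    return round(100.0 * covered / total, 1)


# Coverage from solved structures alone, then with homology models added.
solved_only = coverage_percent(SOLVED_STRUCTURES, TB_PROTEOME)
with_models = coverage_percent(SOLVED_STRUCTURES + HOMOLOGY_MODELS, TB_PROTEOME)

# Proteins left with neither a solved structure nor a model.
uncovered = TB_PROTEOME - SOLVED_STRUCTURES - HOMOLOGY_MODELS

print(f"Solved structures only: {solved_only}%")
print(f"With homology models:   {with_models}%")
print(f"No structure or model:  {uncovered}")
```

Under these assumptions the two percentages agree with the 7.1% and 43.3% stated on the slide, and the remainder matches the slide's fourth number, 2,266.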