Communicating Systems Biology - Why and How We Should Do Better in a Digital World

ICBP Workshop, Houston, April 27, 2012

  1. Communicating Systems Biology – Why and How We Should Do Better in a Digital World?
     Philip E. Bourne, University of California San Diego
     pbourne@ucsd.edu
     http://www.slideshare.net/pebourne/
     ICBP Houston April 27, 2012
  2. Why We Should Do Better
     • Discovery processes are increasingly complex and broad in scope
     • Data must be connected more closely to the methods under study
     • Science is an increasingly social endeavor
     http://www.discoveryinformaticsinitiative.org/ (Yolanda Gil and Haym Hirsh)
  3. Why We Should Do Better
     The Scientific Process is Too Slow to Respond to a Crisis – Either Global or Personal
     By the time the paper is published we could all be dead
     http://knol.google.com/k/plos-currents-influenza#
     Section: Motivation
  4. In a time of crisis, the need for fast access to accurate data, and to any knowledge associated with that data, is paramount
     Chart: Structure Summary page activity for H1N1 influenza-related structures, Jan. 2008 to Jul. 2010
     • 3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir
     • 1RUZ: 1918 H1 Hemagglutinin
     * http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
  5. If that is not enough… For some people the scientific process may be too slow to save their life
  6. Josh Sommer – A Remarkable Young Man
     Co-founder & Executive Director of the Chordoma Foundation
     http://sagecongress.org/Presentations/Sommer.pdf
  7. Chordoma
     • A rare form of brain cancer
     • No known drugs
     • Treatment – surgical resection followed by intense radiation therapy
     http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG
  8. http://sagecongress.org/Presentations/Sommer.pdf
  9. http://sagecongress.org/Presentations/Sommer.pdf
  10. http://sagecongress.org/Presentations/Sommer.pdf
  11. http://sagecongress.org/Presentations/Sommer.pdf
  12. http://sagecongress.org/Presentations/Sommer.pdf
  13. http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
  14. Science is an Increasingly Social Endeavor
      Witness the Story of Meredith
  15. A Requirement is More Open Science
      But …
  16. Openness is Misunderstood by Scientists
      • Witness the confusion regarding open access
      • Witness PubMed Central
  17. What Are the Impediments to Open Science?
      • Change
      • Reward
      You don’t get tenure for starting a blog!
  18. How Can We Do Better? …
  19. How Can We Do Better?
      • Better communication, data and knowledge access, and new modes of discovery, which means:
        – We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
        – We need to be more open with both
        – We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
        – Reward systems need to change
        – We need scientist management and discovery tools
        – We need to be less fixated on the big data problems
        – We need to unleash the full power of the Internet
      (Easy → Hard)
  20. Both Are Under Stress
      Literature:
      • PubMed contains ~21M entries (May 2011)
      • ~100,000 papers indexed per month
      • In Feb 2009:
        – 67,406,898 interactive searches were done
        – 92,216,786 entries were viewed
      Databases:
      • 1330 databases reported in NAR 2011
      • MetaBase (http://biodatabase.org) reports 2,651 entries edited 12,587 times
      PLoS Comp. Biol. 2005 1(3) e34
  21. Some More Comparisons: Databases versus journals
      (two-column slide; each line pairs the left-column point with the right-column point)
      • Journals have a pretty standardized interface vs. Efforts to make the interfaces different!
      • Little attempt at a business model vs. Journals have a business model compared to the Web 2.0 world
      • The quality is declining as numbers increase (?) vs. Quality is increasing (?)
      • Audience believes they are sustainable vs. Not well sustained
      PLoS Comp. Biol. 2008 4(7): e1000136
  22. We Need Data and Knowledge About That Data to Interoperate: The Knowledge and Data Cycle
      The system:
      0. Full text of PLoS papers stored in a database
      1. A link brings up figures from the paper
      2. Clicking the paper figure retrieves data from the PDB which is analyzed
      3. A composite view of journal and database content results
      4. The composite view has links to pertinent blocks of literature text and back to the PDB
      The cycle:
      1. User clicks on content
      2. Metadata and webservices to data provide an interactive view that can be annotated
      3. Selecting features provides a data/knowledge mashup
      4. Analysis leads to new content I can share
      PLoS Comp. Biol. 2005 1(3) e34
  23. We Need Data and Knowledge About That Data to Interoperate – What is Stopping Us?
      • Governance – publishers vs. database providers
      • Reward
      • Metadata standards for provenance, privacy etc.
      • Exemplars
      • …
      Caveat: Each discipline is different – I speak very much from a biomedical sciences perspective
  24. A Small Example - The World Wide Protein Data Bank
      • The single worldwide repository for data on the structure of biological macromolecules
      • Vital for drug discovery and the life sciences
      • 41 years old
      • Free to all
      http://www.wwpdb.org
      PLoS Comp. Biol. 2005 1(3) e34
      Section: We need data and knowledge about that data to interoperate
  25. The World Wide Protein Data Bank – The Best Case Scenario
      • Paper not published unless data are deposited – strong data to literature correspondence
      • Highly structured data conforming to extensive ontologies
      • DOIs assigned to every structure
      http://www.wwpdb.org
      PLoS Comp. Biol. 2005 1(3) e34
  26. Example Interoperability: The Database View
      www.rcsb.org/pdb/explore/literature.do?structureId=1TIM
      BMC Bioinformatics 2010 11:220
  27. Example Interoperability: The Literature View
      http://biolit.ucsd.edu
      Nucleic Acids Research 2008 36(S2) W385-389
  28. Semantic Tagging & Widgets are a Powerful Tool to Integrate Data and Knowledge of that Data, But as Yet Not Used Much
      Will Widgets and Semantic Tagging Change Computational Biology? PLoS Comp. Biol. 6(2) e1000673
  29. Semantic Tagging of Database Content in The Literature or Elsewhere
      http://www.rcsb.org/pdb/static.do?p=widgets/widgetShowcase.jsp
      PLoS Comp. Biol. 6(2) e1000673
      Section: Semantic Tagging
  30. Where Will It All End?
      http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html
  31. This is Literature Post-processing. Better to Get the Authors Involved
      • Authors are the absolute experts on the content
      • More effective distribution of labor
      • Add metadata before the article enters the publishing process
  32. Word Add-in for Authors
      • Allows authors to add metadata as they write, before they submit the manuscript
      • Authors are assisted by automated term recognition
        – OBO ontologies
        – Database IDs
      • Metadata are embedded directly into the manuscript document via XML tags, OOXML format
        – Open
        – Machine-readable
      • Open source, Microsoft Public License
      http://www.codeplex.com/ucsdbiolit
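A minimal Python sketch of the author-side tagging idea on the Word add-in slide: recognize ontology terms and database IDs in manuscript text and wrap them in inline XML tags. This is not the add-in's actual implementation or its OOXML schema. The `term` and `dbref` element names, their attributes, the one-entry vocabulary, and the rule that a PDB ID follows the words "PDB ID" are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mini-vocabulary: surface term -> ontology identifier.
ONTOLOGY_TERMS = {"glycolysis": "GO:0006096"}

# Assumed convention: a PDB ID is a digit plus three alphanumerics,
# and is only recognized when it follows the words "PDB ID".
TOKEN_RE = re.compile(
    r"\bPDB ID (?P<pdb>[0-9][A-Za-z0-9]{3})\b"
    r"|\b(?P<term>" + "|".join(map(re.escape, ONTOLOGY_TERMS)) + r")\b",
    re.IGNORECASE,
)


def tag_manuscript(text: str) -> str:
    """Return text with recognized terms and IDs wrapped in XML-style tags."""
    out = []
    pos = 0
    for m in TOKEN_RE.finditer(text):
        # Copy the untagged text before this match, escaping &, < and >.
        out.append(escape(text[pos:m.start()]))
        if m.group("pdb"):
            raw = m.group("pdb")
            # Keep the "PDB ID " prefix as plain text, tag only the ID.
            out.append(escape(text[m.start():m.start("pdb")]))
            out.append(
                f'<dbref source="PDB" id={quoteattr(raw.upper())}>'
                f"{escape(raw)}</dbref>"
            )
        else:
            raw = m.group("term")
            onto_id = ONTOLOGY_TERMS[raw.lower()]
            out.append(f"<term ontology-id={quoteattr(onto_id)}>{escape(raw)}</term>")
        pos = m.end()
    out.append(escape(text[pos:]))
    return "".join(out)


if __name__ == "__main__":
    tagged = tag_manuscript("Glycolysis enzymes such as PDB ID 1TIM are well studied.")
    print(tagged)
    # The result parses as XML once wrapped in a single root element.
    ET.fromstring("<p>" + tagged + "</p>")
```

Because the tags travel inside the manuscript text, a publisher or database could later read them back with any XML parser, which is the machine-readable property the slide emphasizes.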
  33. Challenges
      • Authors
        – Carrot: if one or more publishers fast-tracked a paper that had semantic markup, it might catch on
      • Publishers
        – Carrot: competitive advantage
  34. The Promise – A Hypothetical Example
      Cardiac Disease Literature and Immunology Literature, linked by Shared Function
  35. How Can We Do Better?
      • Better communication, data and knowledge access, and new modes of discovery, which means:
        – We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
        – We need to be more open with both
        – We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
        – Reward systems need to change
        – We need scientist management and discovery tools
        – We need to be less fixated on the big data problems
        – We need to unleash the full power of the Internet
      (Easy → Hard)
  36. One Small Example of the Problem
      • jMol, VMD … are de facto standard, important tools for rendering biological molecules … but
      • They are not versatile, i.e. they do not, for example:
        – Respond to the data they are reading
        – Offer views that match the user's interests
        – Allow the user to annotate the data
        – Allow those annotations to be shared (published?)
      Section: Think More About the Tools
  37. GitHub is Great But We Need Apps for Science
      Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 2008 4(7): e1000136
  38. A Few Things to Accelerate the Rate of Scientific Discovery
      • Better communication, data and knowledge access, and new modes of discovery, which means:
        – We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
        – We need to be more open with both
        – We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
        – Reward systems need to change
        – We need scientist management tools
        – We need to be less fixated on the big data problems
        – We need to unleash the full power of the Internet
      (Easy → Hard)
  39. Reward Systems Need to Change: What is Needed?
      • Author disambiguation
      • Auditing (identification and metrics) of all scholarship, which means new tools
      • Seniors need to promote alternative forms of scholarship
      • Juniors need to respond
      Ten Simple Rules for Getting Promoted as a Computational Biologist in Academia. PLoS Comp. Biol. 2011 7(10) e1002001
      Section: Reward Systems Need to Change
  40. What Are these Alternative Forms of Scholarship?
      Diagram labels: Reviews, Curation, Research [Grants], Journal Article, Poster Session, Conference Paper, Blogs, Community Service/Data
  41. Reward Systems Need to Change
  42. A Unique Identifier is Going to Happen
      • It is DOIs for people
      • Some scientists will resist
      • The winner is ORCID?
  43. Ideally the ID will be Tagged to Every Piece of Scholarly Communication
      I Am Not a Scientist, I Am a Number. PLoS Comp. Biol. 2008 4(12) e1000247
  44. One Solution: Use the Traditional Reward System in New Ways
      The Wikipedia Experiment – Topic Pages
      • Identify areas of Wikipedia that relate to the journal and are missing or are stubs
      • Develop a Wikipedia page in the sandbox
      • Have a Topic Page Editor review the page
      • Publish the copy of record with associated rewards
      • Release the living version into Wikipedia
  45. How Can We Do Better?
      • Better communication, data and knowledge access, and new modes of discovery, which means:
        – We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
        – We need to be more open with both
        – We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
        – Reward systems need to change
        – We need scientist management and discovery tools
        – We need to be less fixated on the big data problems
        – We need to unleash the full power of the Internet
      (Easy → Hard)
  46. The Truth About My Laboratory
      • I have ?? mail folders!
      • The intellectual memory of my laboratory is in those folders
      • This is an unhealthy hub and spoke mentality
      Section: We Need Scientist Management Tools
  47. The Truth About My Laboratory
      • I generate way more negative than positive data, but where is it?
      • Content management is a mess
        – Slides, posters …
        – Data, lab notebooks …
        – Collaborations, journal clubs …
      • Software is open but where is it?
      • Farewell is for the data too
      http://artbyvida.com/portfolio.php
      Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 2008 4(7): e1000136
  48. Many Great Tools Out There
      Taverna
  49. The Dream of Discovery Informatics
      • At the end of the day a software agent reviews all of our lab's electronic notebooks. Common themes and individual interests are extracted and searched against recent literature, public data, blogs and other social media, and the results are returned and ranked for perusal the next morning over coffee.
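A toy Python sketch of the overnight agent described on the Discovery Informatics slide, under strong simplifying assumptions: notebooks and literature items are plain strings, a "common theme" is any word that appears in at least two notebooks, and ranking is a simple sum of theme weights. The function names, the stopword list and the scoring rule are all hypothetical; a real agent would need far richer text mining.

```python
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "for", "with", "is", "on", "into"}


def tokens(text):
    """Lowercase word tokens of two or more characters, minus stopwords."""
    return [w for w in re.findall(r"[a-z][a-z0-9-]+", text.lower()) if w not in STOPWORDS]


def common_themes(notebooks, min_notebooks=2):
    """Map each term found in at least `min_notebooks` notebooks to that count."""
    counts = Counter()
    for entry in notebooks:
        counts.update(set(tokens(entry)))  # count each term once per notebook
    return {term: n for term, n in counts.items() if n >= min_notebooks}


def rank_items(themes, items):
    """Rank (title, text) items by summed theme weight, dropping zero scores."""
    scored = []
    for title, text in items:
        words = set(tokens(title + " " + text))
        score = sum(weight for term, weight in themes.items() if term in words)
        if score:
            scored.append((score, title))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored]


if __name__ == "__main__":
    notebooks = [
        "Docked zanamivir into neuraminidase; binding looks weak",
        "Neuraminidase mutants expressed; zanamivir assay tomorrow",
        "Updated the poster for the meeting",
    ]
    literature = [
        ("Chordoma cell lines", "new cell lines established"),
        ("Neuraminidase structure", "H1N1 crystal structure"),
        ("Zanamivir resistance in neuraminidase", "mutations reduce binding"),
    ]
    for title in rank_items(common_themes(notebooks), literature):
        print(title)
```

The same ranking function could be pointed at public data descriptions or blog posts, since it only needs a title and some text per item.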
  50. How Can We Do Better?
      • Better communication, data and knowledge access, and new modes of discovery, which means:
        – We need data and knowledge about that data to interoperate, i.e. we need new kinds of fast, versatile publications and data archives
        – We need to be more open with both
        – We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
        – Reward systems need to change
        – We need scientist management tools
        – We need to be less fixated on the big data problems
        – We need to unleash the full power of the Internet
      (Easy → Hard)
  51. Yes YouTube Can Increase the Rate of Discovery
      Section: Unleash the full power of the Internet
  52. The Lab Experiment: Paper + Rich Media
      • My students enjoyed the experience
      • The shyest student was actually the most bold in front of the camera
      • “We will become a generation of ‘science castors’”
      • They liked the exposure for the most part – rather than the PI it puts them out in front
  53. Organic Growth: 3 Years Later
      www.scivee.tv
      • Some of their work viewed 20,000+ times
      • Global audience of researchers, educators and academic/research institutions
        – 60,000 unique visitors & 2M pageviews/month
        – 16,000 registered users & 600 communities
        – 5,000 uploads of video content (about journal articles, conferences, research news and classes)
        – Growing 4-5% monthly
      • Sustainability - evolving a business model supporting journals and conferences
  54. Products. What Emerged: SciveeCasts
      Application   Product                        Primary Customers
      Journals      PubCast                        Journals, publishers, societies
      Meetings      PosterCast, SlideCast          Societies, conference orgs.
      Comm.         PaperCast, Podcast, SlideCast  Societies, journals
      Education     PosterCast, SlideCast          Societies, universities
      Books         BookCast                       Publishers, book sellers
  55. Proposal - The TeachU Workflow
      Step 1: presenter starts PowerPoint (Mac or PC)
      Step 2: presenter starts recording on smart phone (Android, iPhone, Windows Phone 7)
      Step 3: presenter stops recording and initiates upload
      Step 4: slides are uploaded (Website)
      Step 5: slides and podcast are automatically synchronized (Sync File)
      Step 6: listener plays back synchronized presentation
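A short Python sketch of how step 6 of the TeachU workflow could use the sync file at playback time. The slide does not describe the sync file's format, so this assumes it reduces to a sorted list of (start time in seconds, slide number) pairs; `slide_at` and that layout are hypothetical.

```python
from bisect import bisect_right


def slide_at(sync_points, seconds):
    """Return the slide number showing at `seconds` into the podcast.

    `sync_points` is an assumed sync-file layout: a list of
    (start_seconds, slide_number) pairs sorted by start time.
    Returns None if playback is before the first slide starts.
    """
    starts = [start for start, _ in sync_points]
    index = bisect_right(starts, seconds) - 1  # last slide that has started
    if index < 0:
        return None
    return sync_points[index][1]


if __name__ == "__main__":
    sync_points = [(0.0, 1), (42.5, 2), (97.0, 3)]
    for t in (0.0, 60.0, 200.0):
        print(t, slide_at(sync_points, t))
```

A binary search keeps the lookup cheap enough to run on every playback tick, and it also handles a listener scrubbing backwards or forwards in the podcast.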
  56. Acknowledgements
      • BioLit Team
        – Lynn Fink
        – Parker Williams
        – Marco Martinez
        – Rahul Chandran
        – Greg Quinn
      • wwPDB team
        – Andreas Prlic
        – Dimitris Dimitropoulos
      • MBT
        – John Moreland
        – John Beaver
      • SciVee Team (http://www.scivee.tv)
        – Apryl Bailey
        – Leo Chalupa
        – Lynn Fink
        – Marc Friedman (CEO)
        – Ken Liu
        – Alex Ramos
        – Willy Suwanto
        – Ben Yukich
      • Microsoft Scholarly Communications
        – Pablo Fernicola
        – Lee Dirks
        – Savas Parastatidis
        – Alex Wade
        – Tony Hey
      http://biolit.ucsd.edu
      http://www.pdb.org
      http://www.codeplex.com/ucsdbiolit
  57. Questions?
      pbourne@ucsd.edu
  58. What Is Open Science?
      • Unrestricted access and reuse of scientific knowledge as found in the literature and elsewhere, provided attribution is given
      • Ditto the data, protocols, software etc. from which that knowledge is derived
      • Something catalyzed by the Fourth Paradigm
  59. What Motivates Me to Talk About Open Science?
      • I am a domain (life) scientist, not a computer or information scientist
      • I have been co-directing a major open and freely accessible biological data source, the Protein Data Bank (PDB), for the past 11 years
      • Almost 6 years ago I co-founded, and remain the founding Editor in Chief of, the open access journal PLoS Computational Biology
      • I co-founded SciVee.tv to disseminate science in new ways
      • There must be a business model to enable persistence and growth
  60. What Are the Promises of Open Science?
      • To accelerate the rate of scientific discovery worldwide
      • To enable contributions from a broader geographic and economic base
      • To approach learning and comprehension in new ways
      • To reach a broader audience including the general public
  61. MBT Features
      http://mbt.sdsc.edu
      • Offers a framework, not an end-user application
      • Responds to the data type
      • Supports read/write access
      • Encourages others to write end-user applications
      • Discourages feature creep
      Diagram labels: Immunologists (Immunome Research, 2007 3(1):3), Medicinal Chemists
      BMC Bioinformatics 2005, 6:21
      Section: Think More About the Tools
