UCSD Library Presentation 10182010

This was a presentation to the University of California San Diego (UCSD) Senate invited by the library.


  1. 1. Why is Scholarly Communication Broken and What Can Be Done?<br />In Celebration of Open Access Week<br />Philip E. Bourne<br />University of California San Diego<br />pbourne@ucsd.edu<br />UCSD Libraries<br />Oct. 18, 2010<br />
  2. 2. Disclaimer<br />I am a domain (life) scientist, not a computer or information scientist<br />I am fortunate enough to have a major biological resource (the Protein Data Bank) and a major biological journal (PLoS Computational Biology) as my playground<br />I am part of the long tail<br />I am naïve, but I am the majority<br />
  3. 3. Agenda<br />Motivation<br />What needs to be done?<br />A few examples<br />The role of the institution<br />
  4. 4. The Scientific Process is Too Slow to Respond to a Crisis – Either Global or Personal<br />By the time the paper is published we could all be dead<br />http://knol.google.com/k/plos-currents-influenza#<br />Motivation<br />
  5. 5. In a time of crisis the need for fast access to accurate data and any knowledge of that data is paramount<br />Figure: Structure Summary page activity for H1N1 Influenza related structures, Jan. 2008 to Jul. 2010<br />3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir<br />1RUZ: 1918 H1 Hemagglutinin<br />* http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm<br />Motivation<br />
  6. 6. If that is not enough… For some people the scientific process may be too slow to save their lives<br />Motivation<br />
  7. 7. Josh Sommer – A Remarkable Young Man<br />Co-founder & Executive Director of the Chordoma Foundation<br />http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  8. 8. Chordoma<br />A rare form of brain cancer<br />No known drugs<br />Treatment – surgical resection followed by intense radiation therapy<br />http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG<br />Motivation<br />
  9. 9. http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  10. 10. http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  11. 11. http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  12. 12. If I have seen further it is only by standing on the shoulders of giants<br />Isaac Newton<br />From Josh’s point of view the climb up just takes too long<br />> 15 years and > $850M to be more precise<br />Adapted: http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  13. 13. http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  14. 14. http://sagecongress.org/Presentations/Sommer.pdf<br />Motivation<br />
  15. 15. http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation<br />Motivation<br />
  16. 16. Now that we are all hopefully motivated, let us break this down to what actually needs to be done, in my opinion. Here are a few big things …<br />What Needs to be Done?<br />
  17. 17. A Few Things to Accelerate the Rate of Scientific Discovery<br />Better communication, data and knowledge access, and new modes of discovery, which means:<br />We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives<br />We need to be more open with both<br />We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery<br />Reward systems need to change<br />We need scientist management tools<br />We need to be less fixated on the big data problems<br />We need to unleash the full power of the Internet<br />Hard<br />Easy<br />
  18. 18. We Need Data and Knowledge About That Data to Interoperate<br />The Knowledge and Data Cycle<br />0. Full text of PLoS papers stored in a database<br />1. A link brings up figures from the paper<br />2. Clicking the paper figure retrieves data from the PDB which is analyzed<br />3. A composite view of journal and database content results<br />4. The composite view has links to pertinent blocks of literature text and back to the PDB<br />User clicks on content<br />Metadata and webservices to data provide an interactive view that can be annotated<br />Selecting features provides a data/knowledge mashup<br />Analysis leads to new content I can share<br />PLoS Comp. Biol. 2005 1(3) e34<br />
  19. 19. We Need Data and Knowledge About That Data to Interoperate – What is Stopping Us?<br />Governance – publishers vs. database providers<br />Reward<br />Metadata standards for provenance, privacy etc.<br />Exemplars<br /> ….<br />Caveat: Each discipline is different – I speak very much from a biomedical sciences perspective<br />
  20. 20. Certainly the Argument for Interoperability in the Biomedical Sciences is Strong<br />1078 databases reported in NAR 2008<br />MetaBase http://biodatabase.org reports 2,651 entries edited 12,587 times<br />PubMed contains 18,792,257 entries<br />~100,000 papers indexed per month<br />In Feb 2009:<br />67,406,898 interactive searches were done<br />92,216,786 entries were viewed<br />Data as of April 14, 2009<br />PLoS Comp. Biol. 2005 1(3) e34<br />What Needs to be Done?<br />
  21. 21. Example Interoperability: The Database View<br />www.rcsb.org/pdb/explore/literature.do?structureId=1TIM<br />BMC Bioinformatics 2010 11:220<br />What Needs to be Done?<br />
  22. 22. Example Interoperability: The Literature View<br />http://biolit.ucsd.edu<br />Nucleic Acids Research 2008 36(S2) W385-389<br />What Needs to be Done?<br />
  23. 23. ICTP Trieste, December 10, 2007<br />
  24. 24. Semantic Tagging & Widgets are Powerful Tools to Integrate Data and Knowledge of that Data, But as Yet Not Used Much<br />Will Widgets and Semantic Tagging Change Computational Biology? <br />PLoS Comp. Biol. 6(2) e1000673<br />What Needs to be Done?<br />
  25. 25. Semantic Tagging of Database Content in The Literature or Elsewhere<br />http://www.rcsb.org/pdb/static.do?p=widgets/widgetShowcase.jsp<br />PLoS Comp. Biol. 6(2) e1000673<br />Semantic Tagging<br />
  26. 26. What Needs to be Done?<br />
  27. 27. The Publishers are Starting to Do It<br />From Anita de Waard, Elsevier <br />What Needs to be Done?<br />
  28. 28. This is Literature Post-processing<br />Better to Get the Authors Involved<br />Authors are the absolute experts on the content<br />More effective distribution of labor<br />Add metadata before the article enters the publishing process<br />What Needs to be Done?<br />
  29. 29. Word 2007 Add-in for authors<br />Allows authors to add metadata as they write, before they submit the manuscript<br />Authors are assisted by automated term recognition<br />OBO ontologies<br />Database IDs<br />Metadata are embedded directly into the manuscript document via XML tags, OOXML format<br />Open<br />Machine-readable<br />Open source, Microsoft Public License<br />http://www.codeplex.com/ucsdbiolit<br />What Needs to be Done?<br />
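The idea on the add-in slide (automated term recognition followed by embedding machine-readable tags in the manuscript) can be illustrated with a minimal Python sketch. This is not the add-in's code or its OOXML schema: the element names `term` and `dbref`, the function `tag_text`, and the lexicon identifiers (`EXAMPLE:0001`, `EXAMPLE:0002`) are hypothetical placeholders. PDB IDs are recognized with a simple assumed pattern (a digit 1-9 followed by three alphanumerics), and all-digit matches such as years are left untagged to limit false positives.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mini-lexicon: lowercase term -> placeholder identifier.
# These are NOT real ontology IDs; a real tool would load an OBO ontology.
LEXICON = {"neuraminidase": "EXAMPLE:0001", "hemagglutinin": "EXAMPLE:0002"}


def tag_text(text, lexicon=LEXICON):
    """Wrap recognized terms and PDB-style IDs in illustrative XML tags."""
    parts = []
    if lexicon:
        # Longest terms first so longer phrases win over their prefixes.
        terms = sorted(lexicon, key=len, reverse=True)
        parts.append(r"\b(?P<term>" + "|".join(re.escape(t) for t in terms) + r")\b")
    # Assumed PDB ID shape: digit 1-9 followed by three alphanumerics.
    parts.append(r"\b(?P<pdb>[1-9][A-Za-z0-9]{3})\b")
    pattern = re.compile("|".join(parts), re.IGNORECASE)

    out, pos = [], 0
    for m in pattern.finditer(text):
        out.append(escape(text[pos:m.start()]))  # untouched text before the match
        word = m.group(0)
        if m.lastgroup == "term":
            ref = quoteattr(lexicon[word.lower()])
            out.append(f"<term ref={ref}>{escape(word)}</term>")
        elif word.isdigit():
            out.append(escape(word))  # plain number (e.g. a year), not tagged
        else:
            pdb_id = quoteattr(word.upper())
            out.append(f'<dbref db="PDB" id={pdb_id}>{escape(word)}</dbref>')
        pos = m.end()
    out.append(escape(text[pos:]))
    return "<p>" + "".join(out) + "</p>"


if __name__ == "__main__":
    sample = "Neuraminidase of the 1918 H1N1 strain (3B7E) & 1RUZ."
    print(tag_text(sample))
```

Because the tags wrap the author's own words rather than replacing them, stripping the markup returns the original sentence, which is the property that lets metadata travel with the manuscript without changing what the reader sees.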
  30. 30. Challenges<br />Authors <br />Carrot: if one or more publishers fast-tracked a paper that had semantic markup, it might catch on<br />Publishers<br />Carrot: competitive advantage<br />What Needs to be Done?<br />
  31. 31. A Few Things to Accelerate the Rate of Scientific Discovery<br />Better communication, data and knowledge access, and new modes of discovery, which means:<br />We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives<br />We need to be more open with both<br />We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery<br />Reward systems need to change<br />We need scientist management tools<br />We need to be less fixated on the big data problems<br />We need to unleash the full power of the Internet<br />Hard<br />Easy<br />
  32. 32. Reward Systems Need to Change<br />What is Needed?<br />Author disambiguation<br />Auditing (identification and metrics) of all scholarship - means new tools<br />Seniors need to promote alternative forms of scholarship<br />Juniors need to respond<br />Ten Simple Rules for Getting Promoted as a Computational Biologist in Academia <br />PLoS Comp Biol to appear<br />
  33. 33. Example Tools<br />http://www.researcherid.com/<br />http://pubnet.gersteinlab.org/<br />http://www.biomedexperts.com<br />
  34. 34. What Are these Alternative Forms of Scholarship?<br />Reviews<br />Curation<br />Research [Grants]<br />Journal Article<br />Poster Session<br />Conference Paper<br />Blogs<br />Community Service/Data<br />Reward Systems Need to Change<br />
  35. 35. Ideally the ID will be Tagged to Every Piece of Scholarly Communication<br />I am Not a Scientist, I am a Number<br />PLoS Comp. Biol. 2008 4(12) e1000247<br />Reward Systems Need to Change<br />
  36. 36. A Few Things to Accelerate the Rate of Scientific Discovery<br />Better communication, data and knowledge access, and new modes of discovery, which means:<br />We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives<br />We need to be more open with both<br />We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery<br />Reward systems need to change<br />We need scientist management tools<br />We need to be less fixated on the big data problems<br />We need to unleash the full power of the Internet<br />Hard<br />Easy<br />
  37. 37. The Truth About My Laboratory<br />I have ?? mail folders!<br />The intellectual memory of my laboratory is in those folders<br />This is an unhealthy hub and spoke mentality<br />We Need Scientist Management Tools<br />
  38. 38. The Truth About My Laboratory<br />I generate way more negative than positive data, but where is it? <br />Content management is a mess<br />Slides, posters…..<br />Data, lab notebooks ….<br />Collaborations, Journal clubs …<br />Software is open but where is it?<br />Farewell is for the data too<br />http://artbyvida.com/portfolio.php<br />Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 2008 4(7): e1000136<br />We Need Scientist Management Tools<br />
  39. 39. Many Great Tools Out There<br />Taverna<br />We Need Scientist Management Tools<br />
  40. 40. Where I See the Problems<br />The long tail is confused<br />Lack of interoperability between the options<br />The reward (publishing) is still removed from the available tools<br />We Need Scientist Management Tools<br />
  41. 41. Science is Increasingly a Digital Workflow<br />Scientist<br />Laboratory<br />Idea<br />Experiment<br />Data<br />Conclusions<br />Publisher<br />Publish<br />The Role of the Institution<br />
  42. 42. Maybe The Line is Somewhere Else?<br />Laboratory<br />Scientist<br />Idea<br />Experiment<br />Institution<br />Data<br />Lab Notebook<br />Conclusions<br />Publisher<br />Publish<br />The Role of the Institution<br />
  43. 43. This Amounts to Publishing Workflows<br />But That Has its Problems<br />Workflows are not linear<br />Workflow : paper is not 1:1<br />Confidentiality<br />Peer review<br />Infrastructure<br />Community acceptance<br />Reward system<br />The Role of the Institution<br />
  44. 44. Solutions to Publishing Workflows?<br />New organizations (university as publisher?)<br />Appropriate reward system<br />Shared governance: author, institution, publisher<br />Crowd sourcing the electronic printing press<br />The Role of the Institution<br />
  45. 45. Crowd Sourcing the Electronic Printing Press<br />(aka Workshop: Beyond the PDF)<br />Funded by DDCF, Microsoft, NCI, Sage Bionetworks<br />Aims:<br />Define user requirements<br />Establish a specification document<br />Open source the development effort<br />Have a commitment from a publisher to publish a research object using the system<br />Act as an exemplar for what can be done<br />The Role of the Institution<br />
  46. 46. Logistics<br />UC San Diego<br />Jan 19-21, 2011<br />Under the auspices of W3C<br />FoRC will have a follow-on meeting<br />The Role of the Institution<br />
  47. 47. pbourne@ucsd.edu<br />Questions?<br />
