Ucsd library10182010

Transcript

  • 1. Why is Scholarly Communication Broken and What Can Be Done?
    In Celebration of Open Access Week
    Philip E. Bourne
    University of California San Diego
    pbourne@ucsd.edu
    UCSD Libraries
    Oct. 18, 2010
  • 2. Disclaimer
    I am a domain (life) scientist not a computer or information scientist
    I am fortunate enough to have a major biological resource (the Protein Data Bank) and a major biological journal (PLoS Computational Biology) as my playground
    I am part of the long tail
    I am naïve, but I am the majority
  • 3. Agenda
    Motivation
    What needs to be done?
    A few examples
    The role of the institution
  • 4. The Scientific Process is Too Slow to Respond to a Crisis – Either Global or Personal
    By the time the paper is published we could all be dead
    http://knol.google.com/k/plos-currents-influenza#
    Motivation
  • 5. In a time of crisis, fast access to accurate data and to any knowledge about that data is paramount
    [Figure: Structure Summary page activity for H1N1 influenza-related structures, Jan. 2008 to Jul. 2010]
    3B7E: Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain in complex with zanamivir
    1RUZ: 1918 H1 Hemagglutinin
    * http://www.cdc.gov/h1n1flu/estimates/April_March_13.htm
    Motivation
  • 6. If that is not enough… For some people the scientific process may be too slow to save their life
    Motivation
  • 7. Josh Sommer – A Remarkable Young Man
    Co-founder & Executive Director of the Chordoma Foundation
    http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 8. Chordoma
    A rare form of brain cancer
    No known drugs
    Treatment – surgical resection followed by intense radiation therapy
    http://upload.wikimedia.org/wikipedia/commons/2/2b/Chordoma.JPG
    Motivation
  • 9. http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 10. http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 11. http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 12. If I have seen further it is only by standing on the shoulders of giants
    Isaac Newton
    From Josh’s point of view the climb up just takes too long
    > 15 years and > $850M to be more precise
    Adapted: http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 13. http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 14. http://sagecongress.org/Presentations/Sommer.pdf
    Motivation
  • 15. http://fora.tv/2010/04/23/Sage_Commons_Josh_Sommer_Chordoma_Foundation
    Motivation
  • 16. Now that we are all, hopefully, motivated, let us break this down into what, in my opinion, actually needs to be done. Here are a few big things …
    What Needs to be Done?
  • 17. A Few Things to Accelerate the Rate of Scientific Discovery
    Better communication, data and knowledge access, and new modes of discovery, which means:
    We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives
    We need to be more open with both
    We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
    Reward systems need to change
    We need scientist management tools
    We need to be less fixated on the big data problems
    We need to unleash the full power of the Internet
    Hard
    Easy
  • 18. We Need Data and Knowledge About That Data to Interoperate
    The Knowledge and Data Cycle
    0. Full text of PLoS papers stored in a database
    1. A link brings up figures from the paper
    2. Clicking the paper figure retrieves data from the PDB which is analyzed
    3. A composite view of journal and database content results
    4. The composite view has links to pertinent blocks of literature text and back to the PDB
    User clicks on content
    Metadata and webservices to data provide an interactive view that can be annotated
    Selecting features provides a data/knowledge mashup
    Analysis leads to new content I can share
    PLoS Comp. Biol. 2005 1(3) e34
  • 19. We Need Data and Knowledge About That Data to Interoperate – What is Stopping Us?
    Governance – publishers vs. database providers
    Reward
    Metadata standards for provenance, privacy etc.
    Exemplars
    ….
    Caveat: Each discipline is different – I speak very much from a biomedical sciences perspective
  • 20. Certainly the Argument for Interoperability in the Biomedical Sciences is Strong
    1078 databases reported in NAR 2008
    MetaBase http://biodatabase.org reports 2,651 entries edited 12,587 times
    PubMed contains 18,792,257 entries
    ~100,000 papers indexed per month
    In Feb 2009:
    67,406,898 interactive searches were done
    92,216,786 entries were viewed
    Data as of April 14, 2009
    PLoS Comp. Biol. 2005 1(3) e34
    What Needs to be Done?
  • 21. Example Interoperability: The Database View
    www.rcsb.org/pdb/explore/literature.do?structureId=1TIM
    BMC Bioinformatics 2010 11:220
    What Needs to be Done?
  • 22. Example Interoperability: The Literature View
    http://biolit.ucsd.edu
    Nucleic Acids Research 2008 36(S2) W385-389
    What Needs to be Done?
  • 23. ICTP Trieste, December 10, 2007
  • 24. Semantic Tagging & Widgets are a Powerful Tool to Integrate Data and Knowledge of that Data, But as Yet Not Used Much
    Will Widgets and Semantic Tagging Change Computational Biology?
    PLoS Comp. Biol. 6(2) e1000673
    What Needs to be Done?
  • 25. Semantic Tagging of Database Content in The Literature or Elsewhere
    http://www.rcsb.org/pdb/static.do?p=widgets/widgetShowcase.jsp
    PLoS Comp. Biol. 6(2) e1000673
    Semantic Tagging
  • 26. What Needs to be Done?
  • 27. The Publishers are Starting to Do It
    From Anita de Waard, Elsevier
    What Needs to be Done?
  • 28. This is Literature Post-processing
    Better to Get the Authors Involved
    Authors are the absolute experts on the content
    More effective distribution of labor
    Add metadata before the article enters the publishing process
    What Needs to be Done?
  • 29. Word 2007 Add-in for authors
    Allows authors to add metadata as they write, before they submit the manuscript
    Authors are assisted by automated term recognition
    OBO ontologies
    Database IDs
    Metadata are embedded directly into the manuscript document via XML tags, OOXML format
    Open
    Machine-readable
    Open source, Microsoft Public License
    http://www.codeplex.com/ucsdbiolit
    What Needs to be Done?
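To make the idea on the previous slide concrete, here is a minimal sketch of automated term recognition that embeds machine-readable tags in text. It is not the add-in's actual code or schema: the `<sentence>` and `<term>` element names, the `source` and `id` attributes, and the `tag_sentence` and `extract_ids` helpers are all assumptions made for illustration. The two PDB IDs are the ones cited earlier in this talk.

```python
import re
import xml.etree.ElementTree as ET

# Illustrative lookup table (an assumption, not a real term service).
# The two PDB IDs are the structures named on slide 5.
KNOWN_IDS = {
    "1RUZ": "1918 H1 Hemagglutinin",
    "3B7E": "Neuraminidase of A/Brevig Mission/1/1918 H1N1 strain "
            "in complex with zanamivir",
}

# Match any known ID as a whole word.
ID_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KNOWN_IDS) + r")\b"
)


def tag_sentence(sentence: str) -> ET.Element:
    """Wrap each recognized database ID in a <term> element.

    The element and attribute names are hypothetical, chosen only
    to show metadata living inside the text it describes.
    """
    root = ET.Element("sentence")
    pos = 0
    last = None  # most recent <term>; its tail holds the text that follows
    for match in ID_PATTERN.finditer(sentence):
        before = sentence[pos:match.start()]
        if last is None:
            root.text = before
        else:
            last.tail = before
        last = ET.SubElement(
            root, "term", {"source": "PDB", "id": match.group(0)}
        )
        last.text = match.group(0)
        pos = match.end()
    rest = sentence[pos:]
    if last is None:
        root.text = rest
    else:
        last.tail = rest
    return root


def extract_ids(root: ET.Element) -> list:
    """Read the tagged IDs back out, as a downstream tool might."""
    return [term.get("id") for term in root.iter("term")]


if __name__ == "__main__":
    tagged = tag_sentence("We compared 1RUZ with 3B7E.")
    print(ET.tostring(tagged, encoding="unicode"))
    print(extract_ids(tagged))
```

The point of the sketch is the round trip: the visible text is unchanged, yet a program can recover the database identifiers without re-parsing prose.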
  • 30. Challenges
    Authors
    Carrot: if one or more publishers fast-tracked papers that had semantic markup, it might catch on
    Publishers
    Carrot: competitive advantage
    What Needs to be Done?
  • 31. A Few Things to Accelerate the Rate of Scientific Discovery
    Better communication, data and knowledge access, and new modes of discovery, which means:
    We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives
    We need to be more open with both
    We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
    Reward systems need to change
    We need scientist management tools
    We need to be less fixated on the big data problems
    We need to unleash the full power of the Internet
    Hard
    Easy
  • 32. Reward Systems Need to Change
    What is Needed?
    Author disambiguation
    Auditing (identification and metrics) of all scholarship - means new tools
    Seniors need to promote alternative forms of scholarship
    Juniors need to respond
    Ten Simple Rules for Getting Promoted as a Computational Biologist in Academia
    PLoS Comp Biol to appear
    Reward Systems Need to Change
  • 33. Example Tools
    http://www.researcherid.com/
    http://pubnet.gersteinlab.org/
    http://www.biomedexperts.com
  • 34. What Are these Alternative Forms of Scholarship?
    Reviews
    Curation
    Research
    [Grants]
    Journal Article
    Poster Session
    Conference Paper
    Blogs
    Community Service/Data
    Reward Systems Need to Change
  • 35. Ideally the ID will be Tagged to Every Piece of Scholarly Communication
    I am Not a Scientist, I am a Number
    PLoS Comp. Biol. 2008 4(12) e1000247
    Reward Systems Need to Change
  • 36. A Few Things to Accelerate the Rate of Scientific Discovery
    Better communication, data and knowledge access, and new modes of discovery, which means:
    We need data and knowledge about that data to interoperate i.e. we need new kinds of fast, versatile publications and data archives
    We need to be more open with both
    We need to think more about the tools that analyze, visualize and annotate data to maximize knowledge discovery
    Reward systems need to change
    We need scientist management tools
    We need to be less fixated on the big data problems
    We need to unleash the full power of the Internet
    Hard
    Easy
  • 37. The Truth About My Laboratory
    I have ?? mail folders!
    The intellectual memory of my laboratory is in those folders
    This is an unhealthy hub and spoke mentality
    We Need Scientist Management Tools
  • 38. The Truth About My Laboratory
    I generate way more negative than positive data, but where is it?
    Content management is a mess
    Slides, posters…..
    Data, lab notebooks ….
    Collaborations, Journal clubs …
    Software is open but where is it?
    Farewell is for the data too
    http://artbyvida.com/portfolio.php
    Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 2008 4(7): e1000136
    We Need Scientist Management Tools
  • 39. Many Great Tools Out There
    Taverna
    We Need Scientist Management Tools
  • 40. Where I See the Problems
    The long tail is confused
    Lack of interoperability between the options
    The reward (publishing) is still removed from the available tools
    We Need Scientist Management Tools
  • 41. Science is Increasingly a Digital Workflow
    Scientist
    Laboratory
    Idea
    Experiment
    Data
    Conclusions
    Publisher
    Publish
    The Role of the Institution
  • 42. Maybe The Line is Somewhere Else?
    Laboratory
    Scientist
    Idea
    Experiment
    Institution
    Data
    Lab Notebook
    Conclusions
    Publisher
    Publish
    The Role of the Institution
  • 43. This Amounts to Publishing Workflows
    But That Has its Problems
    Workflows are not linear
    Workflow : paper is not 1:1
    Confidentiality
    Peer review
    Infrastructure
    Community acceptance
    Reward system
    The Role of the Institution
  • 44. Solutions to Publishing Workflows?
    New organizations (university as publisher?)
    Appropriate reward system
    Shared governance
    author, institution, publisher
    Crowd sourcing the electronic printing press
    The Role of the Institution
  • 45. Crowd Sourcing the Electronic Printing Press (aka Workshop: Beyond the PDF)
    Funded by DDCF, Microsoft, NCI, Sage Bionetworks:
    Aims:
    Define user requirements
    Establish a specification document
    Open source the development effort
    Have a commitment from a publisher to publish a research object using the system
    Act as an exemplar for what can be done
    The Role of the Institution
  • 46. Logistics
    UC San Diego
    Jan 19-21, 2011
    Under the auspices of W3C
    FoRC will have a follow on meeting
    The Role of the Institution
  • 47. pbourne@ucsd.edu
    Questions?