The Era of Open

Presented at the WikiSym and OpenSym joint conference in Hong Kong on August 7, 2013.

Published in: Education, Technology

  1. 1. The Era of Open Philip E. Bourne University of California San Diego pbourne@ucsd.edu WikiSym+OpenSym Aug 7, 2013 1
  2. 2. The Era of Open Has The Potential to Deinstitutionalize WikiSym+OpenSym Aug 7, 2013 2 Daniel Hulshizer/Associated Press
  3. 3. The Era of Open Has The Potential to Deinstitutionalize WikiSym+OpenSym Aug 7, 2013 3 Daniel Hulshizer/Associated Press
  4. 4. An Example of That Potential: The Story of Meredith WikiSym+OpenSym Aug 7, 2013 4 http://fora.tv/2012/04/20/Congress_Unplugged_Phil_Bourne
  5. 5. The Era of Open Has The Potential to Deinstitutionalize WikiSym+OpenSym Aug 7, 2013 5 Daniel Hulshizer/Associated Press
  6. 6. Deinstitutionalization Vs Conservatism WikiSym+OpenSym Aug 7, 2013 6 Daniel Hulshizer/Associated Press
  7. 7. It Starts with the Metrics of Success [Adapted from Carole Goble] WikiSym+OpenSym Aug 7, 2013 7
  8. 8. Committee on Academic Promotions • What Counts – Money – Grants – Papers – Teaching – Service • What Does Not – Sharing data – Sharing software – Open access – Collaboration – Patents – Startups WikiSym+OpenSym Aug 7, 2013 8 Getting Ahead as a Computational Biologist in Academia PLOS Comp Biol
  9. 9. The Era of Open Has The Potential to Deinstitutionalize WikiSym+OpenSym Aug 7, 2013 9 Daniel Hulshizer/Associated Press
  10. 10. Interim Solution: Use the Traditional Reward System The Wikipedia Experiment – Topic Pages • Identify areas of Wikipedia that relate to the journal and are missing or are stubs • Develop a Wikipedia page in the sandbox • Have a Topic Page Editor review the page • Publish the copy of record with associated rewards • Release the living version into Wikipedia WikiSym+OpenSym Aug 7, 2013 10
  11. 11. MOOCs Are Another Form of Disruption WikiSym+OpenSym Aug 7, 2013 11
  12. 12. In Short Most Academic Institutions Have Yet to Embrace the Open Digital Enterprise They Surely Will Become WikiSym+OpenSym Aug 7, 2013 12
  13. 13. • Anyone, anything, anytime • publication access, data, models, source codes, resources, transparent methods, standards, formats, identifiers, apis, licenses, education, policies • “accessible, intelligible, assessable, reusable” http://royalsociety.org/policy/projects/science-public-enterprise/report/ [Carole Goble] WikiSym+OpenSym Aug 7, 2013 13
  14. 14. Business Models Rule • The Internet demanded new business models to support scholarly communication • Open access was one such sustainable model: – Began with the community – Was driven by new organizations (PLOS, BMC, F1000, eLife, Dryad, Mendeley etc.) – Was NOT driven by academic institutions – Was driven by policies and funders WikiSym+OpenSym Aug 7, 2013 14
  15. 15. One Metric of Change: Multidisciplinary Open Access Mega Journal • This year PLOS ONE will publish over 30,000 papers! WikiSym+OpenSym Aug 7, 2013 15
  16. 16. This Disruption Got Us Thinking About… • A paper as only one form of knowledge discovery • The use of interaction and rich media from which to learn and actually do science • Reproducibility • Reward structures • Better management of the research lifecycle P.E. Bourne 2005 In the Future will a Biological Database Really be Different from a Biological Journal? PLOS Comp. Biol. 1(3) e34 WikiSym+OpenSym Aug 7, 2013 16
  17. 17. This Disruption Got Us Thinking About… • A paper as only one form of knowledge discovery • The use of interaction and rich media from which to learn and actually do science • Reproducibility • Reward structures • Better management of the research lifecycle P.E. Bourne 2005 In the Future will a Biological Database Really be Different from a Biological Journal? PLOS Comp. Biol. 1(3) e34 WikiSym+OpenSym Aug 7, 2013 17
  18. 18. Better Management of the Research Lifecycle is Not a New Concept WikiSym+OpenSym Aug 7, 2013 18
  19. 19. “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.” David Donoho, “Wavelab and Reproducible Research,” 1995 datasets data collections algorithms configurations tools and apps codes workflows scripts code libraries services, system software infrastructure, compilers hardware Morin et al Shining Light into Black Boxes Science 13 April 2012: 336(6078) 159-160 Ince et al The case for open computer programs, Nature 482, 2012 [Carole Goble]
  20. 20. The Research Lifecycle IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION Authoring Tools Lab Notebooks Data Capture Software Repositories Analysis Tools Visualization Scholarly Communication Commercial & Public Tools Git-like Resources By Discipline Data Journals Discipline- Based Metadata Standards Community Portals Institutional Repositories New Reward Systems Commercial Repositories Training
  21. 21. The Research Lifecycle IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION Authoring Tools Lab Notebooks Data Capture Software Repositories Analysis Tools Visualization Scholarly Communication Commercial & Public Tools Git-like Resources By Discipline Data Journals Discipline- Based Metadata Standards Community Portals Institutional Repositories New Reward Systems Commercial Repositories Training
  22. 22. automate: workflows, pipeline & service integrative frameworks pool, share & collaborate web systems nanopub semantics & ontologies machine readable documentation scientific software engineering CS SE [Carole Goble]
  23. 23. Why is This Important to Me Personally? • My wife is being treated for stage 1 breast cancer • This highlights for me the disparity between what is happening in the lab and what is happening in the clinic – In the lab cancer is a personalized and treatable condition – In the clinic we are still equally “poisoning” patients with drugs first introduced 10-20 years ago WikiSym+OpenSym Aug 7, 2013 23
  24. 24. http://sagecongress.org/Presentations/Sommer.pdf WikiSym+OpenSym Aug 7, 2013 24 [Josh Sommer]
  25. 25. http://sagecongress.org/Presentations/Sommer.pdf WikiSym+OpenSym Aug 7, 2013 25 [Josh Sommer]
  26. 26. Most Laboratories • We are the long tail • Goodbye to the student is goodbye to the data • Very few of us have complied (or will comply) with the data management plans we write into grants • Too much software is unusable S. Veretnik, J.L. Fink, and P.E. Bourne 2008 Computational Biology Resources Lack Persistence and Usability. PLoS Comp. Biol. 4(7): e1000136 WikiSym+OpenSym Aug 7, 2013 26
  27. 27. Today’s Research Lifecycle is Digitally Fragmented at Best • Proof: – I can’t immediately reproduce the research in my own laboratory • It took an estimated 280 hours for an average user to approximately reproduce the paper – Workflows are maturing and becoming helpful – Data and software versions and accessibility prevent exact reproducibility Daniel Garijo et al. 2013 Quantifying Reproducibility in Computational Biology: The Case of the Tuberculosis Drugome PLOS ONE under review. WikiSym+OpenSym Aug 7, 2013 27
  28. 28. At the Same Time The Disruption Continues WikiSym+OpenSym Aug 7, 2013 28
  29. 29. G8 open data charter http://opensource.com/government/13/7/open-data-charter-g8 WikiSym+OpenSym Aug 7, 2013 29
  30. 30. • In the US alone... – March 2012 OSTP commits $200M to Big Data – OSTP demands sharing plans by August 2013 – GBMF/Sloan provide institutional awards for data science – NCBI considers data catalog and MyBibliography And the Disruption Continues WikiSym+OpenSym Aug 7, 2013 30
  31. 31. Where Will It End? First We Should Ask What It Is We Wish to Accomplish WikiSym+OpenSym Aug 7, 2013 31
  32. 32. Here is What I Want – The Paper As Experiment 0. Full text of PLoS papers stored in a database 1. A link brings up figures from the paper 2. Clicking the paper figure retrieves data from the PDB which is analyzed 3. A composite view of journal and database content results 4. The composite view has links to pertinent blocks of literature text and back to the PDB 1. User clicks on thumbnail 2. Metadata and a webservices call provide a renderable image that can be annotated 3. Selecting a feature provides a database/literature mashup 4. That leads to new papers PLoS Comp. Biol. 2005 1(3) e34 32
  33. 33. Here is What I Want – Knowledge Push • Each evening the lab’s “Evernote” notebooks are scanned for commonalities from the day’s activities. These are seeds in a deep search of the web’s research lifecycles that have become available since last searched. Results are ranked and presented for consideration over coffee the next morning http://www.discoveryinformaticsinitiative.org/diw2012 WikiSym+OpenSym Aug 7, 2013 33
  34. 34. Will End With … • Infrastructure: – Science, Nature, Cell and megajournals all “open access” – An array of coupled institutional repositories – A central repository – PubMed Central – Open software in full support of the research lifecycle – The research lifecycle in the cloud WikiSym+OpenSym Aug 7, 2013 34
  35. 35. Will End With … • Sociologically: – An end to build it and they will come – Alternative metrics accepted by the community – Alternative reward systems that recognize the realities of today’s scholarship, namely: • Open data availability • Software availability • Collaborative research WikiSym+OpenSym Aug 7, 2013 35
  36. 36. We Have a Way to Go Consider the Life Sciences • Good News – We have NCBI/EBI – Publishers are starting to embrace data – Workflows in support of the research lifecycle are catching on • Bad News – Sustainability remains a noun not a verb – Data are organized by type not by questions asked (silos) – Tenure committees are still in the dark ages WikiSym+OpenSym Aug 7, 2013 36
  37. 37. What Can We Do As a Community? WikiSym+OpenSym Aug 7, 2013 37
  38. 38. Build Trust Data Trust in the data and the derived knowledge WikiSym+OpenSym Aug 7, 2013 38
  39. 39. What I Have Learned About Trust 1/2 • Trust is like compound interest • Comes from listening • Comes from engaging the community in every aspect of the process • Comes from data consistency and level of annotation • Comes from responsiveness • Comes from the quality of the delivery service WikiSym+OpenSym Aug 7, 2013 39
  40. 40. What I Have Learned About Trust 2/2 • Quality begets trust – Quality requires data models/ontologies • Quality requires people – Annotators are the unsung heroes • Trust requires provenance & versioning • Trust requires explaining that all data and knowledge are not created equal WikiSym+OpenSym Aug 7, 2013 40
  41. 41. Beyond Building Trust What Else Can We Do? WikiSym+OpenSym Aug 7, 2013 41
  42. 42. Think Globally Act Locally • Support emergent community commons/portals • Be involved in the support and development of metadata standards • Contribute to workflow development etc. to drive an open research lifecycle • Educate your mentors on the importance of open science and scholarly communication • Write software thinking of an App model WikiSym+OpenSym Aug 7, 2013 42
  43. 43. Understand That All Data/Knowledge Are NOT Created Equal • We need to understand how data are used • Sustainability is not more money from the funding agencies; it’s about business models • Reductionism is not a dirty word • We need to do more with the long tail On the Future of Genomic Data, Science 11 February 2011: vol. 331 no. 6018 728-729 WikiSym+OpenSym Aug 7, 2013
  44. 44. Recognize That Institutions Must Play a Greater Role • We need institutional data/knowledge sharing plans • We need data/information scientists to be better recognized by institutions – it’s not all about papers – this implies new metrics WikiSym+OpenSym Aug 7, 2013 44
  45. 45. Learn from the App Store • The App model – Think of it operating on a content base rather than a mobile device – Simple and consistent user interface – Needs to pass some quality control – Has a reward • The App+ Model – Apps interoperate through a generic workflow interface WikiSym+OpenSym Aug 7, 2013 45
  46. 46. In Summary • Open science is a means to accelerate the rate of discovery • Disruption has begun, but there is great inertia in the system • All of us are stakeholders and capable of invoking further positive change • We need to get institutions and more scientists involved…. WikiSym+OpenSym Aug 7, 2013 46
  47. 47. Acknowledgements www.force11.org WikiSym+OpenSym Aug 7, 2013 47
  48. 48. pbourne@ucsd.edu • Force11 Manifesto • Fourth Paradigm: Data Intensive Scientific Discovery http://research.microsoft.com/en-us/collaboration/fourthparadigm/ WikiSym+OpenSym Aug 7, 2013 48