Directions in Open Science


An internal presentation to the SRI AI Center, to get people up to speed on current goings-on in open science. Tries to cover far too many things, and slides probably aren't very comprehensible by themselves.

Published in Technology, News & Politics
  • An OA journal is just a journal…
  • DataCite: working with data centres to assign persistent identifiers to datasets, we are developing an infrastructure that supports simple and effective methods of data citation, discovery, and access.
  • Artificial artificial intelligence


  • 1. Directions in Open Science Mike Travers SRI Bioinformatics Research Group For AIC Lunch and Learn, 30 Jan 2012
  • 2. About this talk • Partly a trip report from Open Science Summit 2011 • Partly an attempt to define open science and explore its impact • Partly an excuse to talk about some of my own vaguely related work • And partly some semi-crazy speculation about future projects in this space
  • 3. The Open Science Summit unites researchers, life science industry professionals, students, patients and other stakeholders to discuss the future of collaborative science and innovation. … in-depth sessions on new models for drug discovery and clinical trials, personal genomics, the patent system, the future of scientific publications, and more.
  • 4. What is Open Science? • Many different things, but boils down to: • Removing barriers to scientific communication and collaboration: – Social – Technical – Legal – Economic – Bureaucratic • To accelerate scientific progress • Utilizing modern technology
  • 5. Driven by technological change • The Internet has radically reduced communication costs • So old institutions of scientific communication are now obstacles – Closed academic publishers, notably: • Internet will transform scientific media just like it has newspapers, TV, social life… • The difference is: science is more important than sharing cat pictures
  • 6. For-profit academic publishing is a racket. A very lucrative one. Starting to be rumbles of complaint (boycotts) from academics.
  • 7. Open • Most visible and successful branch of open science • Articles are free to read, pay to publish • Funders are starting to require some form of public access
  • 8. Gold: OA journal, Green: OA self-archiving (Open Access to the Scientific Journal Literature: Situation 2009, PLoS ONE, Bo-Christer Björk et al.)
  • 9. Research Works Act • H.R.3699 – “A bill to ensure the continued publication and integrity of the peer-reviewed research works by the private sector.” No Federal agency may adopt, implement, maintain, continue, or otherwise engage in any policy, program, or other activity that-- (1) causes, permits, or authorizes network dissemination of any private-sector research work without the prior consent of the publisher of such work; or (2) requires that any actual or prospective author, or the employer of such an actual or prospective author, assent to network dissemination of a private-sector research work.
  • 10. Myth 1: American consumers have a right to free access to articles their tax dollars fund. Fact: American taxpayers do not fund peer reviewed research articles; they fund some of the research that is used in those articles…
  • 11. Beyond Open Access • Not going to say a whole lot about OA, because: • It’s easy to understand • It’s pretty clearly going to win in the long term • By itself, not a very radical change to how science is done: – Knowledge is still in paper-sized chunks – Papers are peer-reviewed prior to publication – Once something is published, it’s static • All these parameters are being challenged in some way by other efforts • George Whitesides (Harvard chemist): “The concept of the scientific paper is eroding before our very eyes”
  • 12. Variations on publishing • “Peer review is broken” – Too slow – Too biased – Too rigid – May be “the worst system except for all the others” • Pre-peer-review publication – E.g. arXiv.org • Micropublication – Crowdsourcing, blogs, wikis… • Open-notebook science – No gap at all between bench and publication • Database-linked publications • Dynamic Review Papers
  • 13. Biggest sequencing operation in the world, generating 6 terabytes/day of genomic data. Open-Source Genomic Analysis of Shiga-Toxin–Producing E. coli O104:H4, Rohde et al. 2011 (NEJM). Toxic E. coli outbreak in Germany, May 2011: We released these data into the public domain… which elicited a burst of crowd-sourced, curiosity-driven analyses carried out by bioinformaticians on four continents. Twenty-four hours after the release of the genome, it had been assembled; … Five days after the release of the sequence data, we had designed and released strain-specific diagnostic primer sequences, and within a week, two dozen reports had been filed on an open-source wiki … dedicated to analysis of the strain. https://github.com/ehec-outbreak-crowdsourced
  • 14. GigaScience is a new integrated database and journal co-published in collaboration between BGI Shenzhen and BioMed Central, to meet the needs of a new generation of biological and biomedical research as it enters the era of "big-data."
  • 15. Dynamic Review Papers: conventional paper paired with dynamically-updated, wiki-based paper/database/model
  • 16. Driving apps
  • 17. Who comes to Open Science Summits?
  • 18. Activist Organizations
  • 19. Participatory Medicine & Disease Foundations
  • 20. Startups • Social paper and citation management • Scientific services marketplace • Web-based molecule library management
  • 21. Citizen Science
  • 22. Somewhat less garage- • Independent research institute, started from data released by Merck • Repository of experimental data (Sage Commons) • Network of cooperating institutions • Starting to build a computational platform (Synapse)
  • 23. Synthetic Biology
  • 24. And some individual researchers • Peter Murray-Rust: chemist, Cambridge, promoter of Chemical Markup Language and semantic web. “Closed science makes people die!” • Victoria Stodden: statistician, Columbia, reproducibility of computational science (cf. ClimateGate)
  • 25. Some open science success stories • Galaxy Zoo • FoldIt • Nutrient Network (NutNet) • Praziquantel synthesis
  • 26. Galaxy Zoo • Citizen science (loosely) • Image classification task • Mechanical Turk-like approach (but unpaid) • About 200K participants • Discovered a whole new class of galaxies (“green pea”) and a quasar mirror • 22 published papers in 3 years
  • 27. Social sharing of algorithms (“recipes”). Descent with modification.
  • 28. Matthew Todd, chemist at U of Sydney • Schistosomiasis • Looking for synthesis for known drug Praziquantel (PZQ) in enantiopure form • Open-notebook science (LabTrove)
  • 29. Nutrient Network (NutNet)
  • 30. What paper has the most authors? • NutNet paper: 40 authors, 41 institutions • This one from SLAC and elsewhere: 407 authors, but only 35 institutions
  • 31. Three variations on the scientific process • Automated Science • Distributed Science • Web-scale Intelligent Science • Open Science as the lubrication / accelerant that makes these feasible
  • 32. Afferent: Automation for Drug Discovery • Combinatorial Chemistry • Planning software to drive lab robots
  • 33. Distributed Science • Some science (e.g. evaluation of drug candidates) is highly parallelizable • Hence distributable • CollabRx was initially supposed to support “virtual pharma companies” that would tie disparate academic research efforts into focused teams
  • 34. Web-scale Intelligent Science • Imagine all of science as a giant distributed computational process • Individual scientists are agents – Working on a small part of the problem – Sharing their results – Getting feedback and funding dependent on success • Centralized data integration and decision tools used to help determine next useful experiment
  • 35. Steps towards distributed intelligence • Adaptive clinical trials – Rather than a classical trial with two arms run to completion – Change the distribution of test cases based on ongoing results • Now imagine this strategy applied more globally across all treatments for a disease • Credit for this slightly mad vision goes mainly to Marty Tenenbaum: – AI Meets Web 2.0 (2006) – Shrager, Tenenbaum, Travers, Cancer Commons: Biomedicine in the Internet Age (2011)
  • 36. What does all that have to do with Open Science? • Open Science is lowering barriers to collaboration • So it’s a necessary but not sufficient step towards this new kind of science • CollabRx may just have been too early: – The groundwork hasn’t been laid yet – We are still working on basics (e.g. standards for representation) • Reducing friction (or transaction costs) can be incredibly important
  • 37. “Changing the cost of innovation fundamentally changes the nature of innovation” – Joichi Ito • TCP, HTTP etc. are the containerization of data. So what’s the analog for scientific knowledge?
  • 38. Standardized Legal and Institutional Mechanisms
  • 39. A mix of technical, institutional, and legal standardization: – Standard licenses (parameterizable) – RDF representation for licenses – Web tools to generate these – Sites that collect and “market” available materials
  • 40. BioBike, a platform for open science • Conceived of as a vehicle for getting biologists to do their own knowledge-based biocomputing • Lisp + Frame system + Bioinformatics Tools – Through-the-web programmability – Community sharing of code and data – Visual Programming Language • Open Source • Jeff Elhai, Arnaud Taton, J. P. Massar, John K. Myers, Michael Travers, Johnny Casey, Mark Slupesky, Jeff Shrager. BioBIKE: A Web-based, programmable, integrated biological knowledgebase. Nucleic Acids Research, 2009
  • 41. BioBike and Open Science • BioBike wasn’t for Open Science per se • But it did explore some ideas in web-based biocomputation • The next-generation BioBike platform: – Data: Big data, Open data, semantic web integrated – Programming: Able to deal with large scale and distributed workflows with human elements – Collaboration: Integrating different communities in a “trading zone” • KnowOS: The (Re)Birth of the Knowledge Operating System. Mike Travers, JP Massar, and Jeff Shrager, International Lisp Conference 2005
  • 42. What is a platform? • The economic meaning of “platform” is interesting • Something that: – Supports two-sided network effects – Stands in the middle and extracts a toll • Examples: – Credit cards (merchants ↔ consumers) – Operating systems (application developers ↔ users) • Science has more complicated networks and relations – Data providers – Data consumers – Service providers – Analysts (e.g. statisticians) – Patients • A science platform is not going to make anyone rich like Facebook, but it would be nice to have a powerful and standard way for all these groups to collaborate.
  • 43. Open Data is outstripping analysis capacity • Or in other words: – Data is cheap – Attention, knowledge, & expertise are expensive • A platform for collaborative computational interpretation of biological data • To better leverage the expensive resources
  • 44. identifies advancing new computational infrastructure as a priority for driving innovation in science and engineering. Scientific discovery and innovation are advancing along fundamentally new pathways opened by the development of increasingly sophisticated software. … the overarching goal of transforming innovations in research and education into sustained software resources that are an integral part of the cyberinfrastructure
  • 45. Anti-open arguments • Peer-review is an essential filter; without it too much nonsense gets out • Electronic availability of articles actually leads to narrowing of science (Evans, 2008) • Privacy, HIPAA, etc. • Need to retain IP for economic motivation • The problem isn’t availability of data; it’s making sense of what we do have • See PRISM for more
  • 46. Opener Science • Science is already pretty open! • Institutions of openness played a role in the foundation of science, including the first scientific journals
  • 47. Historical Origins of Open Science • Before the invention of science, knowledge of the natural world was closely guarded, passed down from master to apprentice. • The development of institutions of openness was a key factor in the scientific revolution (Paul David, Stanford economist) • …and the printing press was a key factor in that.
  • 48. So… • The printing press is almost 600 years old • The scientific journal is almost 350 years old • There’s been some advancement in communication technology since then… • Science will eventually change: – Either a modest acceleration of the scientific process, – Or as significant and discontinuous as the first scientific revolution • Which one? An open question.
  • 49. Further Reading
  • 50. End
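
Slide 35 describes adaptive clinical trials as changing the distribution of test cases based on ongoing results. A minimal sketch of that idea follows. It uses Thompson sampling over Beta posteriors, which is one common response-adaptive allocation scheme and is an assumption here; the talk does not name a specific method. The function name `adaptive_trial` and the simulated response rates are hypothetical and for illustration only.

```python
import random


def adaptive_trial(true_response_rates, n_patients, seed=0):
    """Simulate response-adaptive allocation with Thompson sampling.

    true_response_rates: hypothetical per-arm success probabilities,
    used only to simulate patient outcomes (unknown in a real trial).
    Returns (successes, failures) lists, one entry per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_response_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms

    for _ in range(n_patients):
        # Sample a plausible response rate for each arm from its
        # Beta(successes + 1, failures + 1) posterior (uniform prior).
        draws = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Assign the next patient to the arm with the highest draw, so
        # arms that look better so far receive more patients over time.
        arm = max(range(n_arms), key=lambda a: draws[a])

        # Simulate the outcome for this patient.
        if rng.random() < true_response_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1

    return successes, failures


if __name__ == "__main__":
    s, f = adaptive_trial([0.2, 0.6], n_patients=500, seed=1)
    for arm, (ok, bad) in enumerate(zip(s, f)):
        print(f"arm {arm}: {ok + bad} patients, {ok} responders")
```

In contrast to a classical two-arm trial with a fixed 50/50 split, the allocation here drifts toward whichever arm the accumulated data favors. Applying the same loop across many treatments for a disease is the more global version the slide imagines.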