
How to Execute A Research Paper



Slides describing FORCE11 work and the background of several of the speakers, used for talks at the University of Lethbridge, at Carnegie Mellon, and internally at Elsevier

Published in: Education, Technology


  1. 1. How To Execute The Research Paper. Anita de Waard, Disruptive Technologies Director, Elsevier Labs. Pittsburgh, April 2012
  2. 2. Outline• Ten people who are changing scholarly publishing: – New forms – Workflow/data integration – New models of business/attribution• So what does this mean?• Some projects to help us move towards these new models: – Claim-evidence networks – Workflow/data integration and executable papers – Creating a community of practice
  3. 3. Theme 1: New forms of publication • Main issue: the format of the scientific paper comes from a time when our communication was paper-centric • Solution: Rethink the unit and form of the scholarly publication from the ground (i.e., the experiment) up • Three projects doing that:
  4. 4. Steve Pettifer, U Manchester• Utopia: ‘Everything you always wanted to do with a PDF….’: interactive, sharable• Working on integration with DOMEO to add/share annotations• Final goal: don’t ‘reconstruct the cow from a hamburger’: include workflows and models
  5. 5. Gully Burns, USC ISI• KEfED: model of research as an activity• Map out dependent/independent variables within an experiment and model them• Start: appendix to paper; later: precede paper, graft paper on top of model.
  6. 6. Tim Clark, Harvard/MGH • Annotation ontology allows you to trace claims • DOMEO offers an interface to do both automated entity markup and manual markup of claim/evidence networks [Figure: RDF annotation of a claim, with graph labels G1, G5, G6: rdf:type swande:Claim; dct:title “Intramembranous Aβ behaves as chaperones of other membrane proteins”; swanrel:referencesAsSupportiveEvidence; pav:contributedBy]
  7. 7. Theme 2: data and workflow integration • Issues: – Format of the research paper is hard to integrate within a scientific/clinical workflow – Hard to deduce what methods were used and what data was created for a piece of research, making reproduction or even review difficult • Some solutions for sharing workflows and data:
  8. 8. Dave De Roure, Oxford e-Research Centre • Research objects: consist of all academic output, including: - Papers - Workflows - Data - Talks, lectures - Blogs • Move towards executable work: - Execute periodically to validate - Run automatically when data updates – by self or others! - Notify researchers of new results [Figure: research object diagram linking workflows, results, logs, metadata, slides and papers through ‘produces’, ‘included in’, ‘published in’ and ‘feeds into’ relations]
  9. 9. Phil Bourne, UCSD • Big need: keep track of the data in my lab! • Other need: know what I did/what other people did – Yolanda Gil made a workflow representation; it was hard to remember what we did… • Need: better ways to record, share, archive what we did. • New role for the publisher
  10. 10. Deborah McGuinness, RPI• Future Web: • ‘if everything is everywhere, how do we find it/know what we want?’ • Internet, Web, Grid, Cloud, Semantic Grid Middleware• Xinformatics: • Where X = geo, eco, econo… • Linked Data to Semantics• Semantic Foundations: • Pushing the boundaries of Semantic Web standards • Ontology evolution
  11. 11. Theme 3: New Models for Access/Attribution• Issues: – User-created content, crowdsourcing means (scientific) impact is measured very differently from the past – Need new models for copyright/IP – Citizen scientists participate as well• Some efforts to address this:
  12. 12. Paul Groth, VU Amsterdam. Altmetrics: “the creation and study of new metrics based on the Social Web for analyzing and informing scholarship.” Including: - Downloads - Where readers read - Data citation - Social network diffusion - Slide reuse - Peer review contributions - YouTube views
  13. 13. Leslie Chan, U. Toronto Scarborough • ElPub conference series that focuses on globally connecting information scientists • Bioline International system: “a not-for-profit scholarly publishing cooperative committed to providing open access to quality research journals published in developing countries”
  14. 14. John Wilbanks, Kauffman/CC• As data becomes more accessible, need: • raw metadata • standards processes • consensus processes • document submission standards • data archives• Ways of governing access: • Privacy vs. IP vs. policies • Technology only helps so much… • This is mostly a social/policy issue
  15. 15. Cameron Neylon, Cambridge• Main arguments for Open Access: • Citizen science is becoming more important • Science changes when it is crowdsourced: Tim Gowers: ‘This is to normal research as driving is to pushing a car’• Three principles: • Scale and connectivity • Reduced friction to access • Demand-side filters
  16. 16. In summary, scientists are working on:• Tools for knowledge… – Visualisation (Steve Pettifer) – Modeling (Gully Burns) – Annotation (Tim Clark)• Ways to link to – Workflows (Dave De Roure) – Lab data (Phil Bourne) – Linked research data (Deborah McGuinness)• And models for – Attribution/credit (Paul Groth) – Allowing new players to participate (Leslie Chan) – Copyright/IP rights (John Wilbanks) – Networked science (Cameron Neylon).
  17. 17. New roles for publishers and libraries • Technically, there is no reason to publish in a journal, or for that matter, to publish a paper at all: • Perhaps a good blog post linked to workflows and data, with some validation from peers and good download statistics, might work just as well? • Is publishing in journals mostly a habit?
  18. 18. “Publishers have been thinking we’re going out of business for 20 years, what has suddenly changed?” The internet! Not the technical web, but the social web… ‘The value of a […] network is proportional to the square of the number of users of the system (n²)’ (Metcalfe’s Law). 1990’s: Big Player; 2000’s: Medium Participant; 2015: Irrelevant!
  19. 19. What do we need? Internet of things (Bleecker [1]): Interact with ‘objects that blog’ or ‘Blogjects’, that: track where they are and where they’ve been; have histories of their encounters and experiences; have agency - an assertive voice on the social web. Research Objects (Bechhofer et al. [2]): Create semantically rich aggregations of resources, that can possess some scientific intent or support some research objective. Networked Knowledge (Neylon [3]): If we care about taking advantage of the web and internet for research then we must tackle the building of scholarly communication networks. These networks will have two critical characteristics: scale and a lack of friction. References: [1] Bleecker, J. ‘A Manifesto for Networked Objects: Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things’. [2] Bechhofer, S., De Roure, D., Gamble, M., Goble, C. and Buchan, I. (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge. In: The Future of the Web for Collaborative Science (FWCS 2010), April 2010, Raleigh, NC, USA. [3] Neylon, C. ‘Network Enabled Research: Maximise scale and connectivity, minimise friction’.
  20. 20. Some examples of networked science:• Galaxy Zoo: citizen science: classify galaxies in the comfort of your own home – like Hanny!• Tim Gowers, Polymath: “…the real contributors will be the process owners and project leaders that are able to provide horizontal leadership. To support this shift, organizations will need to reward and recognize horizontal contributions as much, if not more, than hierarchical positions.”• Mathoverflow: virtual network of mathemagicians working collectively to answer big, small, clear and fuzzy questions
  21. 21. Executable Papers• E.g.:
  22. 22. Wrapping a story around your data: 1. Research: Each item in the system has metadata (including provenance) and relations to other data items added to it. 2. Workflow: All data items created in the lab are added to a (lab-owned) workflow system. 3. Authoring: A paper is written in an authoring tool which can pull data with provenance from the workflow tool in the appropriate representation into the document. 4. Editing and review: Once the co-authors agree, the paper is ‘exposed’ to the editors, who in turn expose it to reviewers. Reports are stored in the authoring/editing system, the paper gets updated, until it is validated. 5. Publishing and distribution: When a paper is published, a collection of validated information is exposed to the world. It remains connected to its related data item, and its heritage can be traced. 6. User applications: distributed applications run on this ‘exposed data’ universe. [Figure: metadata attached at each stage; sample paper text “Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-”; labels: Review, Revise, Edit, Some other publisher] Concept developed with Ed Hovy, Phil Bourne, Gully Burns and Cartic Ramakrishnan
  23. 23. Creating claim-evidence networks: • DOMEO: connect to ScienceDirect • Rich Boyce’s Drug-drug interactions: tracing heritage of claims • Foundation for that: linguistic markers for identifying cited/own knowledge
  24. 24. How a claim becomes a fact: • Voorhoeve, 2006: “These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.” • Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).” • Yabuta et al., 2007: “[On the other hand,] two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).” • Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).”
  25. 25. Working on ontology:• Add to formal knowledge representations, e.g. Biological Expression Language add {V = 3, S = N, B = 0}: • SET Evidence = "Arterial cells are highly susceptible to oxidative stress, which can induce both necrosis and apoptosis (programmed cell death) [1,2]" • biologicalProcess(GO:"response to oxidative stress") increases biologicalProcess(GO:"apoptotic process") • biologicalProcess(GO:"response to oxidative stress") increases biologicalProcess(GO:necrosis)• Improve triple search engines, e.g. compare in iHop: • The Lats2 tumor suppressor protein has been implicated earlier in promoting p53 activation in response to mitotic apparatus stress {V = 2, S = NN, B = 0} • Our findings reveal that miR-373 would be a potential oncogene and it participates in the carcinogenesis of human esophageal cancer {V = 1/2?, S = A, B = D}
  26. 26. Application: Elsevier/Philips Use Case: 3 Content Sources, 2 Link Steps. Content sources: A. Philips’ Electronic Patient Records; B. Elsevier-published Clinical Guideline; C. Elsevier (or other publisher’s) Research Report or Data. Step 1: Patient data + diagnosis link to Guideline recommendation. Step 2: Guideline recommendation links to research report/data.
  27. 27. Application: Find ‘Claimed Knowledge Updates’. Work done with Agnes Sandor, Xerox Research Europe
  28. 28. FORCE11 Community of Practice • Workshop in August of 2011: 35 invited attendees from different parts of science, industry, funding agencies, data centers • Goal: map the main obstacles preventing new models of science publishing and develop ways to overcome them • Just received funding from the Sloan Foundation to: • Start online community • Hold next workshop • Collaboratively work on new efforts
  29. 29. Summary:• Ten people who are changing scholarly publishing:• We (publishers, editors, libraries, etc) need to revisit if and how we are needed• Some projects are underway to help us move towards these new models: – Networked science – Workflow/data integration – Identifying claim-evidence trails• ….happy to collaborate on others!
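
The claim-evidence trail on slide 24 ("How a claim becomes a fact") can be sketched as a small citation graph. This is only an illustration, not the DOMEO or SWAN data model: the `Claim` class, the function names and the `hedged` flags are assumptions made here, with the flags reflecting one reading of the quoted wording ("possibly", "potential", "suggests" versus "directly inhibit").

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Claim:
    """A statement in one paper plus the papers it cites as evidence."""
    paper: str       # citation label, copied from slide 24
    wording: str     # key phrase from the quoted sentence
    hedged: bool     # assumption: one reading of how tentative the wording is
    cites: List[str] = field(default_factory=list)


# Data taken from the four quotes on slide 24.
CLAIMS = [
    Claim("Voorhoeve, 2006", "possibly through direct inhibition", True),
    Claim("Kloosterman and Plasterk, 2006", "possibly by inhibiting", True,
          ["Voorhoeve, 2006"]),
    Claim("Yabuta et al., 2007", "function as potential novel oncogenes", True,
          ["Voorhoeve, 2006"]),
    Claim("Okada et al., 2011", "directly inhibit", False,
          ["Voorhoeve, 2006"]),
]


def trace_to_origin(claims: List[Claim], start: str) -> List[str]:
    """Follow the first citation of each claim until a claim cites nothing."""
    by_paper: Dict[str, Claim] = {c.paper: c for c in claims}
    path = [start]
    current = by_paper[start]
    while (current.cites
           and current.cites[0] in by_paper
           and current.cites[0] not in path):  # guard against citation loops
        path.append(current.cites[0])
        current = by_paper[current.cites[0]]
    return path


def unhedged_citers(claims: List[Claim], origin: str) -> List[str]:
    """Papers that cite `origin` as evidence but state the claim without hedging."""
    return [c.paper for c in claims if origin in c.cites and not c.hedged]


if __name__ == "__main__":
    print(trace_to_origin(CLAIMS, "Okada et al., 2011"))
    print(unhedged_citers(CLAIMS, "Voorhoeve, 2006"))
```

Tracing back from the 2011 statement reaches the hedged 2006 original in one step, which is the "heritage of claims" that slide 23 refers to.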
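
The "executable work" bullets on slide 8 (execute periodically to validate, run automatically when data updates, notify researchers of new results) can be sketched as follows. The `ResearchObject` class and its fields are hypothetical names chosen for this illustration, and the workflow is a stand-in (a simple mean); it is not how any existing research object system is implemented.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, List


@dataclass
class ResearchObject:
    """Bundles data, the workflow that produced a result, and the result itself."""
    data: List[float]
    workflow: Callable[[List[float]], float]
    published_result: float
    notifications: List[str] = field(default_factory=list)

    def validate(self, tolerance: float = 1e-9) -> bool:
        """Re-execute the workflow and check it still reproduces the result."""
        return abs(self.workflow(self.data) - self.published_result) <= tolerance

    def update_data(self, new_data: List[float], tolerance: float = 1e-9) -> None:
        """Re-run the workflow on new data; record a notification if the result moved."""
        self.data = list(new_data)
        new_result = self.workflow(self.data)
        if abs(new_result - self.published_result) > tolerance:
            self.notifications.append(
                f"result changed: {self.published_result} -> {new_result}"
            )
            self.published_result = new_result


if __name__ == "__main__":
    ro = ResearchObject(data=[1.0, 2.0, 3.0], workflow=mean, published_result=2.0)
    print(ro.validate())            # periodic validation of the published result
    ro.update_data([1.0, 2.0, 3.0, 6.0])
    print(ro.notifications)         # researchers would be notified of the new result
```

In a real setting `validate` would be triggered on a schedule and `update_data` by whoever deposits new data, "by self or others" as the slide puts it.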
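
Metcalfe's Law, quoted on slide 18, says that network value grows with the square of the number of users. A minimal sketch of the arithmetic, where the function name and the proportionality constant `k` are assumptions introduced only for illustration:

```python
def metcalfe_value(n_users: int, k: float = 1.0) -> float:
    """Network value under Metcalfe's Law: proportional to n squared."""
    return k * n_users ** 2


def value_ratio(n_before: int, n_after: int) -> float:
    """How many times more valuable the network becomes when it grows."""
    return metcalfe_value(n_after) / metcalfe_value(n_before)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, metcalfe_value(n))
    print(value_ratio(10, 100))
```

Ten times as many users gives a hundred times the value under this law, which is the slide's argument that the social web, rather than the technical web, changes the position of publishers.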