Scratchpads: past, present and future


Published on

Kew eTaxonomy Workshop, 26 May 2011

Published in: Technology, Education
  • Scratchpad home page. People apply for sites from here. Funded by EDIT and partly by GBIF.
  • Scratchpads used in many ways. Cross disciplinary flexibility.
  • Many different kinds of sites.
  • High levels of usage.
  • Two major new funding sources from December 2010. Both are big collaborative projects. I won’t go into the details of what we are doing in each one. Rather, I am going to focus on the developments of the Scratchpads that have come about thanks to these projects.
  • Redevelopment of the Scratchpads 5 years after the first version.
  • New consistent theme to improve the look of the sites.
  • Workflows to chain functions together making the sites much easier to use.
  • Themed sites, so the Scratchpads can be used for other projects and focused tightly on the specific needs of those projects.
  • Editable species pages with more intelligent selection of third party content.
  • More integrated mapping.
  • Including third party content integrated with your content.
  • The second major transition that I want to comment on is the movement away from the scholarly paper (or article) as the major means of scholarly communication. For something like 350 years, the scientific paper has been the primary way of establishing the precedence and validity of scholarly claims. However, as research practices have changed, other forms of scholarly communication have become better suited to transmitting information. Research is increasingly a collaborative, data-intensive and networked activity. Scholarly databases, in their many different forms, are increasingly a better means of communicating information than scholarly papers, because their contents can be re-used in many different ways, whereas scientific papers frequently bury this information. Yet in large part our scholarly communication system has not adapted. PDF papers are still the predominant form of scholarly communication, and these hide or exclude the wealth of information and effort, the so-called “dark data”, that led to the construction of the paper. As an example, I recently published a paper of just 2,500 words that represented the collective effort of 6 people on three continents who have been building the underlying dataset for the past 10 years. While those 2,500 words have a certain value, it is the underlying data from hundreds of species and several genes, together with hundreds of hours of analysis, that represent the lasting wealth of information we created. In effect, what we need is a natively digital scholarly communication system that supports the life cycle of these data from their creation through to their synthesis into scholarly papers. In many ways my thinking in this area is summed up by this quote from a digital librarian, which nicely expresses the problem.
“the future scholarly communication system should closely resemble—and be intertwined with—the scholarly endeavor itself, rather than being its after-thought or annex”
  • So first off, a low-cost journal infrastructure. A small number of Scratchpad communities use their Scratchpads as a scientific journal to publish PDF articles. These are the websites of various biodiversity societies, which publish their society journals through a Scratchpad. I know two of these societies in detail. One is a specialist mosquito group that publishes the European Mosquito Bulletin, which looks at the spread of mosquitoes across Europe. The other is a specialist group on stick insects, of which there are about 5,000 species. Each of these journals has independent editorial control and peer review, managed by its society. They are free to publish in and entirely Open Access, and there are no page limits or colour limits. The infrastructure is free, and the only cost is the time of the peer reviewers and editorial board. However, the journals are only available electronically. Each society has ISBN numbers for its journal, but because the journals are not part of any formal publishing house, the articles have no DOIs, are not indexed by the PubMed database and have no ISI impact factor. The societies are also relatively unsophisticated in their use of the site. They have no online submission; this is all managed by e-mail, and the societies are just publishing PDF articles.
  • Nevertheless, there are some Scratchpad users who, for particular kinds of data, are sharing their datasets more widely with specialist data publishers. Where there is an available data standard, we have enabled services that allow Scratchpad users to share their data with larger data publishers. An area where this is currently working is specimen information. Across the Scratchpads we have over 18,000 specimen records and biological observations, where people have recorded information about particular taxonomic specimens in a standardised way. These data are connected to the GBIF network and added to the 276 million specimen records already available. The records are crucial for mapping the distribution of species for a wide range of studies, including climate change, changes in agricultural patterns and changing land use.
  • Arguably our most ambitious developments in scholarly communication concern the semi-automated construction of scholarly papers from databases. In many cases Scratchpads contain extensive data on the biology and description of particular species. These data are highly suited to formal specialist publications, but publishing them requires laborious assembly into a manuscript. As part of the ViBRANT collaboration with Pensoft, and in particular their two journals specialising in the formal taxonomic description of plants and animals, we have developed a 5-step workflow in which users select appropriate data from their Scratchpad, add metadata describing this information so that it conforms to the norms of descriptive taxonomic publications, and finally preview and submit the manuscript to the publisher. From here it goes through a standard peer review process. Review recommendations are then incorporated into a revised version of the manuscript, which is sent back to the publisher and simultaneously published on both the Scratchpad and in the publisher’s journal. In addition to dramatically speeding up the construction of scientific papers, the user has the advantage that the manuscript can be revised and updated on the Scratchpad as new information comes to light, while the original version is preserved as part of the scholarly record. The publisher also extracts relevant information from the submitted database, which is then automatically submitted to the specialist data repositories. In sum, these are the four ways in which ViBRANT is facilitating scholarly communication of biodiversity information: 1) through low-cost journal infrastructure; 2) by community web publishing; 3) by biodiversity observation data publishing; and 4) by next-generation publishing services producing scholarly articles from databases.
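The selection, metadata and preview stages of the manuscript workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the record fields, the function names (`select_records`, `build_manuscript`, `preview`) and the XML element names are hypothetical and are not the actual Scratchpads or Pensoft schema. Submission to the publisher is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical species records as they might sit in a site database.
RECORDS = [
    {"taxon": "Aus bus", "section": "description", "text": "Body elongate."},
    {"taxon": "Aus bus", "section": "distribution", "text": "Southern Europe."},
    {"taxon": "Aus cus", "section": "description", "text": "Body short."},
]


def select_records(records, taxon):
    """Selection stage: keep only the records for the chosen taxon."""
    return [r for r in records if r["taxon"] == taxon]


def build_manuscript(records, metadata):
    """Metadata stage: combine author-supplied metadata with the selected data."""
    root = ET.Element("manuscript")  # element names are assumptions
    meta = ET.SubElement(root, "metadata")
    for key, value in metadata.items():
        ET.SubElement(meta, key).text = value
    for record in records:
        section = ET.SubElement(root, "section", {"name": record["section"]})
        section.text = record["text"]
    return root


def preview(root):
    """Preview stage: serialise the tree so the author can check it."""
    return ET.tostring(root, encoding="unicode")


selected = select_records(RECORDS, "Aus bus")
manuscript = build_manuscript(
    selected, {"title": "Notes on Aus bus", "author": "A. Author"}
)
xml_text = preview(manuscript)
print(xml_text)
```

Keeping the data in the database and generating the manuscript from it is what would let a revised version be regenerated when the underlying records change.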

    1. Scratchpads: Past, present & future. Kew eTaxonomy Workshop, 26 May 2011
    2. Scratchpads: the concept. Fast, intuitive, fit for use. (1) Your data; (2) Uploaded & tagged; (3) “Published” & reviewed on your site
    3. Scratchpads: 2007-2011
    4. Scratchpad use cases
    5. Scratchpad use cases
    6. Scratchpad usage, 25 May 2011: 239 sites, 3212 users, 371,947 nodes. Mar. 2007 - Dec. 2010
    7. Scratchpads: the present. Financial support until Dec. 2013. ViBRANT: Virtual Biodiversity
    8. Scratchpads 2
        • Scheduled for January 2012 (5 years after Scratchpad 1)
        • Move from Drupal 6 to Drupal 7 (4 year upgrade cycle, UI + entities)
        • 67 Contributed modules (31 done, 14 untested, 22 to do)
        • 52 Scratchpad modules (28 done, 14 untested, 10 to do)
        • Migrate current Scratchpads
        • New technical enhancements (hosting env., git, services, registry…), supporting sustainability
        • New user features (theme, workflow, spp. pages, mapping, services), supporting “publication”
    9. Scratchpads 2: user enhancements. Consistent Scratchpad theme
        • Most sites use Garland (improves site looks)
        • Idiosyncratic (colours & layouts)
        • Sp2 more professional & scholarly
        • Some flexibility (site profiles)
    10. Scratchpads 2: user enhancements. Improved administrative interface & workflows
        • Basic vs. advanced admin
        • Poor findability, hard to use
        • Hide unnecessary items
        • Workflow complex functions
            • Integrate complex actions
            • Guide the user through each step
        • Adding users
        • Site setup functions
        • Groups & permissions
        • Edit / add content types
        • Importing content & taxonomy
        • Creating services
        • Creating views
        • “Pages” menu concept
        • Steps 1 2 3 4 5
    11. Scratchpads 2: user enhancements. Project themed Scratchpads
        • New site management system (Aegir)
        • Support project specific profiles
            • themes, modules & workflows
        • Example themes include
            • eMonocot sites
            • GBIF Nodes Portal Toolkit
            • EOL LifeDesks?
    12. Scratchpads 2: user enhancements. New species pages
        • Directly edit the page
        • Better presentation
        • Tabs (myspecies / ispecies)
        • Data inheritance (parent-child)
        • Better content controls
        • More intelligent ispecies filters
    13. Scratchpads 2: user enhancements. Integrated mapping
        • 3 current map types
            • Areas (TDWG level 4)
            • Specimen point localities
            • GBIF occurrence records
    14. Scratchpads 2: user enhancements. Integrated mapping
        • 3 current map types
        • Areas (TDWG level 4)
            • TDWG area data (from Scratchpads)
        • All integrated & editable in Scratchpads 2
        • Edit point metadata on map
            • Point localities (from Scratchpads)
            • Point localities (from GBIF)
            • Point localities (from flickr)
    15. Scratchpads: the future. A natively digital scholarly communication system
        • The article was (is) the unit of scholarly comm. (350yrs)
        • Research practices have moved on
            • Highly collaborative, data intensive & networked
        • Scholarly communication has not adapted (e.g. the PDF)
        • Published “knowledge” hides “dark data”
        • Need a natively digital scholarly communication system
            • Must support end-to-end the lifecycle of data, information & knowledge
        • “the future scholarly communication system should closely resemble—and be intertwined with—the scholarly endeavor itself, rather than being its after-thought or annex” (Van de Sompel et al 2004)
    16. Current publishing efforts on Scratchpads (1): Low cost journal infrastructure
        • Scratchpads used to publish PDFs (1,000’s)
        • Independent editorial control & peer review
        • Free to publish, open access, no page limits
        • ISBN’s, but no doi’s, PubMed or ISI impact
        • No online submission, just static PDFs
    17. Current publishing efforts on Scratchpads (2): Simple data publishing > GBIF
        • Specimen records on Scratchpads: >18K specimen records (local small scale coverage)
        • Pushed to 3rd party specialist data publishers: >276M specimen records (worldwide coverage)
    18. Next generation article publishing (3): From prototype to operation
        • Paper assembled from Scratchpad database
        • XML submission, peer review & marked-up publication by Pensoft
        • 5-step workflow for selecting data, adding metadata & previewing
        • Published in Zookeys & Phytokeys (worldwide coverage)
        • PDF, HTML, XML
        • doi:10.3897/zookeys.50.539
    19. Next generation data publishing (4): Formally publishing metadata descriptions of datasets
        • Dataset & metadata description assembled in Scratchpad
        • Dataset description reviewed, published & citable from new journal
        • Datasets built & described via a Scratchpad & dataset deposited in permanent repository
        • Dataset description published in new data publication journal
        • doi:10.3897/biodat.50.539
        • Dataset doi:xxxx.50.539
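
The push of specimen records to GBIF on slide 17 depends on records being held in a shared data standard. The talk does not name the standard; the sketch below assumes Darwin Core terms, a widely used vocabulary for occurrence data. The internal field names, the `FIELD_MAP` mapping, the helper functions and the sample record are all hypothetical and only illustrate the idea of renaming local fields to standard terms before export.

```python
import csv
import io

# Hypothetical mapping from a site's internal field names to Darwin Core terms.
FIELD_MAP = {
    "id": "occurrenceID",
    "name": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "collected": "eventDate",
}
COLUMNS = list(FIELD_MAP.values()) + ["basisOfRecord"]


def to_darwin_core(record):
    """Rename internal fields to Darwin Core terms and mark the record type."""
    row = {dwc: record[local] for local, dwc in FIELD_MAP.items()}
    row["basisOfRecord"] = "PreservedSpecimen"
    return row


def export_csv(records):
    """Return the mapped records as CSV text, one row per specimen."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(to_darwin_core(r) for r in records)
    return buffer.getvalue()


# Invented sample record, for illustration only.
specimens = [
    {"id": "sp-001", "name": "Culex pipiens", "lat": 51.48, "lon": -0.29,
     "collected": "2010-06-14"},
]
csv_text = export_csv(specimens)
print(csv_text)
```

Because every site maps to the same column names, an aggregator can merge records from many small sites without knowing how each one stores its data internally.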