Scratchpads are virtual research environments that allow researchers to collect, curate, analyze, and publish biodiversity data in a seamless workflow. They facilitate open access to digital data through standardized modular platforms that allow data sharing and interlinking. The new Biodiversity Data Journal will publish taxonomic treatments, checklists, keys, and datasets that have been generated using Scratchpads. This will integrate the processes of conducting research and publishing results within a single online environment.
Scientific discovery and innovation in an era of data-intensive science
William (Bill) Michener, Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico; DataONE Principal Investigator
The scope and nature of biological, environmental and earth sciences research are evolving rapidly in response to environmental challenges such as global climate change, invasive species and emergent diseases. Scientific studies are increasingly focusing on long-term, broad-scale, and complex questions that require massive amounts of diverse data collected by remote sensing platforms and embedded environmental sensor networks; collaborative, interdisciplinary science teams; and new tools that promote scientific data preservation, discovery, and innovation. This talk describes the challenges facing scientists as they transition into this new era of data intensive science, presents current solutions, and lays out a roadmap to the future where new information technologies significantly increase the pace of scientific discovery and innovation.
Exploring Process Barriers to Release Public Sector Information in Local Gove...Peter Conradie
Conradie, P. & Choenni, S., 2012. Exploring Process Barriers to Release Public Sector Information in Local Government. In 6th International Conference on Theory and Practice of Electronic Governance. Albany, New York, pp. 5–13.
Data Equivalence
Mark Parsons, Lead Project Manager, Senior Associate Scientist, National Snow and Ice Data Center
Data citation, especially using persistent identifiers like Digital Object Identifiers (DOIs), is an increasingly accepted scientific practice. Recently, several respected organizations have developed guidelines for data citation. The different guidelines are largely congruent in that they agree on the basic practice and elements of data citation, especially for relatively static, whole data collections. There is less agreement on the more subtle nuances of data citation that are sometimes necessary to ensure precise reference and scientific reproducibility--the core purpose of data citation. We need to be sure that if you follow a data reference you get to the precise data that were used or at least their scientific equivalent. Identifiers such as DOIs are necessary but not sufficient for the precise, detailed references required. This talk discusses issues around data set versioning, micro-citation, and scientific equivalence. I propose some interim solutions and suggest research strategies for the future.
Making your data work for you: Scratchpads, publishing & the biodiversity dat...Vince Smith
This document discusses making biodiversity data digital, openly accessible, and linked. It introduces Scratchpads, which are virtual research environments that allow taxonomists to make their work digital by uploading and tagging their data on a website. Scratchpads support the taxonomic workflow and allow for community revision. The document also introduces the Biodiversity Data Journal, a new open access journal that will publish small datasets to make them more accessible and link them together. It will have an online collaborative authoring tool and publish various types of biodiversity data papers.
Opening Keynote: The Many and the One: BCE themes in 21st century data curation
Allen Renear, Professor and Interim Dean, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Two scientists can be using "the same data" even though the computer files involved appear to be quite different. This is familiar enough, and for the most part, in small communities with shared practices and familiar datasets, raises few problems. But these informal understandings do not scale to 21st century data curation. To get full value from cyberinfrastructure we must support huge quantities of heterogeneous data developed by diverse communities and used by diverse communities -- often with widely varying methods, tools, and purposes. To accomplish this our informal practices and understandings must be replaced, or at least supplemented, by a shared framework of standard terminology for describing complex cascades of representational levels and relationships. Fundamental problems in data curation -- and in particular problems involving provenance, identifiers, and data citation -- cannot be fully resolved without such a framework. Although the deepest problems here have ancient origins, useful practical measures are now within reach. Some recent work toward this end, carried out at the Center for Informatics Research in Science and Scholarship (CIRSS) at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, will be described.
DataCite and Campus Data Services
Paul Bracke, Associate Dean for Digital Programs and Information Services, Purdue University
Research libraries are increasingly interested in developing data services for their campuses. There are many perspectives, however, on how to develop services that are responsive to the many needs of scientists; sensitive to the concerns of scientists who are not always accustomed to sharing their data; and that are attractive to campus administrators. This presentation will discuss the development of campus-based data services programs, the centrality of data citation to these efforts, and the ways in which engagement with DataCite can enhance local programs.
These are the slides for Robert H. McDonald for the Future Trends Panel Presentation at the Inter-institutional Approaches to Supporting Scholarly Communication Symposium held on August 16, 2012 at the Georgia Institute of Technology.
RDAP13 Mark Leggott: Stewarding research data using the Islandora frameworkASIS&T
Mark Leggott, University of PEI/DiscoveryGarden
Islandora: Stewarding research data using the Islandora framework
Mark Leggott, Thornton Staples and Kathleen Van Ekris
Panel: Global scientific data infrastructure
Research Data Access & Preservation Summit 2013
Baltimore, MD April 4, 2013 #rdap13
Scratchpads are hosted websites for biodiversity data that facilitate online research communities. They provide a standardized environment for entering and curating data to enable data sharing and linking. Scratchpads accelerate publication and dissemination by linking together taxonomic data in a digital, open, and interconnected way. There are currently over 450 Scratchpad communities created by over 6,000 registered users covering over 50,000 taxa.
The digital universe is booming, especially metadata and user-generated data. This raises strong challenges in identifying the portions of data that are relevant to a particular problem and in dealing with the lifecycle of data. Finer-grained problems include data evolution and the potential impact of change on the applications relying on the data, causing decay. The management of scientific data is especially sensitive to this. We present the Research Objects concept as a means to identify and structure relevant data in scientific domains, addressing data as first-class citizens. We also identify and formally represent the main reasons for decay in this domain and propose methods and tools for their diagnosis and repair, based on provenance information. Finally, we discuss the application of these concepts to the broader domain of the Web of Data: Data with a Purpose.
Albert Simard - Mobilizing Knowledge: Acquisition, Analysis, and Action
Presentation at the Canadian Knowledge Mobilization Forum 2012, Ottawa, Ontario, http://www.kmbforum2012.org/
The document discusses the BiSciCol project, which aims to address challenges in managing and integrating biodiversity data. BiSciCol develops infrastructure to assign globally unique identifiers (GUIDs) to specimens and derivatives and link these identifiers using a linked data approach. This allows for tracking objects across scientific disciplines. The document outlines biodiversity data challenges and how BiSciCol and linked data techniques can help solve issues relating to distributed and changing data from multiple domains. It also examines the "Triplifier Simplifier" tool used to extract, link, and publish data as linked open data.
This document summarizes key points from a presentation given at the Entomological Collections Network meeting about the Biodiversity Information Standards (TDWG) Conference 2013. The presentation discussed iDigBio's goals of building an accessible database of US specimen data and facilitating digitization. It provided an overview of TDWG topics like data quality, semantics, and standards. Researchers, collections managers, and others were encouraged to get involved in TDWG to help bridge the gap between research data and databases and avoid duplicating efforts.
Preserving the Inputs and Outputs of Scholarshiptsbbbu
Tim Babbitt discusses the changing context of research and scholarship due to digitization and the internet. The inputs and outputs of research are increasingly digital and complex, including data, code, presentations, and more. ProQuest has a history of preserving scholarship through microfilming and is exploring how to preserve the full range of digital scholarly outputs and their linkages in a sustainable way. Key questions include balancing new and old preservation methods and moving beyond preserving individual objects to also preserving networks and linkages between scholarly works.
Making your data work for you: Scratchpads, publishing & the Biodiversity Dat...Vince Smith
This document discusses using Scratchpads as virtual research environments to make taxonomic data digital, openly accessible, and linked up. Scratchpads are websites that allow taxonomists and their communities to edit, publish, and review their research and data. They support the entire taxonomic workflow and integration with other databases. Over 450 Scratchpad sites have been created, with over 6,800 registered users contributing data on taxa, specimens, images, and more. The document outlines the features and capabilities of Scratchpads for collaborative work and publishing taxon data in an open and sustainable way.
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012Lee Dirks
An invited talk to 40+ directors of national libraries worldwide at the annual ExLibris member meeting at IFLA (Helsinki, Finland) on August 15th, 2012.
The document is a chapter from a textbook on data mining written by Akannsha A. Totewar, a professor at YCCE in Nagpur, India. It provides an introduction to data mining, including definitions of data mining, the motivation and evolution of the field, common data mining tasks, and major issues in data mining such as methodology, performance, and privacy.
The document discusses digital worlds and applications at both the enterprise and national scales in the United States healthcare system. It notes the massive scale of healthcare data sources, including hundreds of thousands of healthcare offices and databases containing information on hundreds of millions of patients. The critical importance of making sense of this vast amount of heterogeneous healthcare data to improve human lives and health outcomes is also emphasized.
This presentation sets out some of the challenges around citing and identifying datasets and introduces DataCite, the international data citation initiative. DataCite was founded on 1-December 2009 to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.
This presentation was given by Adam Farquhar at the STM Publishers Association Innovation Conference on 4-Dec-2009.
I gave this presentation to the STM Publishers Association Innovation Conference in London, 4-December-2009. It frames the data citation problem and introduces DataCite - the international data citation initiative.
Research Data Management: What is it and why is the Library & Archives Servic...GarethKnight
This document summarizes research data management and the library and archives service's involvement. It defines research data, explains why data needs to be managed, and outlines the key drivers for data management and publication. It then describes the library and archives service's knowledge of data management, the research data management support service being established, and the guidance, training, and tools being developed to help researchers with data management.
Publishing biodiversity: The interplay between Scratchpads and the new Biodiversity Data Journal
1. Publishing Biodiversity: The interplay between Scratchpads and the new Biodiversity Data Journal
Koureas D.N. [1], Rycroft S. [1], Baker E. [1], Livermore L. [1], Scott B. [1], Heaton A. [1], Bouton K. [1], Penev L. [2], Roberts D. [1] and Smith V.S. [1]
[1] The Natural History Museum London
[2] Pensoft Publishers
2. Our current taxonomic data production
• 15-20k new spp. described annually (2M total) [1]
• 30k nomenclatural acts (12M total) [1]
• 20k phylogenies (750k total) [2]
• 31k taxa sequenced (360k taxa total) [3]
• 800k BioMed papers (40M total pp. of taxonomy) [4]
• Countless specimens, images, maps, keys and datasets
Typically generated by small communities for “local” research projects
Figures from [1] Zhang, Zootaxa 2011 4, 1-4; [2] Web-of-Science; [3] GenBank and [4] PubMed.
3. The four nodes of data workflow
1. We collect and generate data
2. We curate, link and structure data
3. We analyse data
4. We publish data
4. The four nodes of data workflow
What are the bottlenecks in the workflow?
[Diagram: cycle of data collection & generation, data curation, data analysis and data publishing, carrying two "bottleneck" labels]
5. What we need is…
a seamless workflow
[Diagram: cycle of data collection & generation, data curation, data analysis and data publishing]
6. To achieve this…
This requires data, information & knowledge to be…
• Digital (not printed paper)
• Openly accessible (not behind barriers, e.g. paywalls)
• Linked-up (not in silos)
Global Systematics
“Link together evolutionary data… by developing analytical tools and proper documentation and then use this framework to conduct comparative analyses, studies of evolutionary process and biodiversity analyses”
Cyndy Parr, Rob Guralnick, Nico Cellinese and Rod Page. TREE. doi:10.1016/j.tree.2011.11.001
9. What are Scratchpads?
• Hosted websites for biodiversity data
• Virtual research & publication platform
• Completely open access & open source
• Modular & flexible
10. What are Scratchpads?
Scratchpads facilitate the development of online research communities through a standardized environment for entering and curating data that allows sharing, interlinking and dissemination of research products.
11. The Scratchpads concept
A Scratchpad is a website that holds data for you and your community
[Diagram: Your data | External data & services]
13. Are Scratchpads sustainable?
464 Scratchpads communities by 6,407 active registered users, covering 52,661 taxa in 559,488 pages.
In total more than 1,200,000 visitors.
[Chart: per-month unique visitors to Scratchpads sites; the only surviving value label is 65000 unique visitors/month]
17. The main features
Taxon pages
Overview of data related to taxon
Generated from tagged content
18. The main features
Bibliography management
An inbuilt Bibliography manager
Faceted browsing
Taxon tagging and free keywords
Import from and export to all major formats
19. The main features
Specimen/Observation data
Annotated full specimen/observation records
Linked to images and georeferenced
20. The main features
Distribution maps
Google Maps based
Data layers
Occurrence data
Distribution data
TDWG regions
GBIF data
21. The main features
Character matrices – Key construction
Quantitative or qualitative characters
Auto generation of keys
Taxon based matrices
[Specimens based character matrices]
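To make the idea of generating a key from a character matrix concrete, here is a minimal sketch in Python. It is not the Scratchpads implementation: the function name `build_key`, the placeholder taxa and the characters are all hypothetical, and it handles only qualitative characters that are scored for every taxon. At each step it picks the unused character that splits the remaining taxa into the most groups.

```python
def build_key(matrix, taxa=None, used=frozenset()):
    """Build a nested identification key from a taxon-by-character matrix.

    Returns a taxon name (leaf), a list of taxa that cannot be separated,
    or a tuple (character, {state: sub-key}).
    Hypothetical illustration only, not the Scratchpads algorithm.
    """
    if taxa is None:
        taxa = sorted(matrix)
    if len(taxa) == 1:
        return taxa[0]

    # Candidate characters: any character scored for these taxa and not yet used.
    candidates = sorted({c for t in taxa for c in matrix[t]} - used)

    best, best_groups = None, None
    for character in candidates:
        groups = {}
        for taxon in taxa:
            groups.setdefault(matrix[taxon].get(character), []).append(taxon)
        if None in groups:
            continue  # skip characters that are not scored for every taxon
        if len(groups) > 1 and (best_groups is None or len(groups) > len(best_groups)):
            best, best_groups = character, groups

    if best is None:
        return taxa  # no remaining character separates these taxa

    return (best, {state: build_key(matrix, members, used | {best})
                   for state, members in best_groups.items()})


# Placeholder matrix with made-up taxa and characters.
matrix = {
    "Taxon A": {"wings": "present", "antennae": "long"},
    "Taxon B": {"wings": "present", "antennae": "short"},
    "Taxon C": {"wings": "absent", "antennae": "long"},
}
key = build_key(matrix)
print(key)
```

A quantitative character would first need to be binned into discrete states before it could be used in a sketch like this.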
28. What will BDJ publish?
• Single taxon treatments and nomenclatural acts
• Local or regional checklists
• Sampling reports and occasional inventories
• Habitat-based checklists and inventories
• Ecological and biological observations of species and communities?
• Single identification keys
• Biodiversity-related databases, including genomic, ecological and environmental data (data papers)
• Biodiversity-related software tools
30. Working in a single environment
Allow submission of datasets for publication without reformatting and restructuring, based on standardised XML schema
31. The publication module
Data included in manuscript in a structured annotated format
Author names and affiliations
34. The publication module
Author names and affiliations
Taxon descriptions
Specimen data
Figures and Tables
Keys
References
Texts
[Diagram: the components above are shown alongside "XML" labels]
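As an illustration of what a structured, annotated manuscript might look like once serialised to XML, here is a minimal Python sketch using the standard library. The element and attribute names (`manuscript`, `taxon_treatments`, `catalog_number` and so on) are invented for this example and are not the schema used by Scratchpads or the Pensoft Journal System. The sample values are placeholders.

```python
import xml.etree.ElementTree as ET


def build_manuscript(title, authors, treatments):
    """Assemble manuscript components into one XML tree.

    All element names are hypothetical, chosen only for illustration.
    """
    root = ET.Element("manuscript")
    ET.SubElement(root, "title").text = title

    authors_el = ET.SubElement(root, "authors")
    for name, affiliation in authors:
        author_el = ET.SubElement(authors_el, "author")
        ET.SubElement(author_el, "name").text = name
        ET.SubElement(author_el, "affiliation").text = affiliation

    treatments_el = ET.SubElement(root, "taxon_treatments")
    for treatment in treatments:
        t_el = ET.SubElement(treatments_el, "treatment", {"taxon": treatment["taxon"]})
        ET.SubElement(t_el, "description").text = treatment["description"]
        for catalog_number in treatment.get("specimens", []):
            ET.SubElement(t_el, "specimen", {"catalog_number": catalog_number})
    return root


# Placeholder content: none of these values come from a real manuscript.
manuscript = build_manuscript(
    title="Example taxon treatment",
    authors=[("A. Author", "Example Institution")],
    treatments=[{
        "taxon": "Examplea demonstrata",
        "description": "Placeholder description text.",
        "specimens": ["EX-0001", "EX-0002"],
    }],
)
xml_text = ET.tostring(manuscript, encoding="unicode")
print(xml_text)
```

Because each component sits in its own element, a receiving system could read author names, treatments or specimen records without reparsing free text, which is the point of submitting without reformatting.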
35. The data workflow
[Diagram: community submission passes as XML from SCRATCHPADS to the PENSOFT JOURNAL SYSTEM (PJS 2.0), leading to MANUSCRIPT PUBLISHED (XML, PDF). Output labels: Archive datasets, Occurrence data, Taxon treatments, Taxon names, Plazi Wiki]
36. The editorial workflow
[Diagram: editorial workflow between Scratchpads and the Pensoft Journal System (PJS)]
Workflow labels: Community; Collaborative online writing; Online editing; Authors; Review requests; Editor; Review; Editorial decision & feedback; Author’s revised manuscript; Publication & dissemination
Peer-review options: Public, Community, Closed
Reviewers: Nominated reviewers, Panel reviewers, Public reviewers
All reviews assembled into a single online version
37. Example papers via Scratchpads…
Blagoderov V, Hippa H, Nel A (2010). ZooKeys 50: 79–90. doi: 10.3897/zookeys.50.506. http://sciaroidea.info/node/44428
Faulwetter S, Chatzigeorgiou G, Galil BS, Nicolaidou A, Arvanitidis C (2011). ZooKeys 150: 327–345. doi: 10.3897/zookeys.150.1877. http://polychaetes.marbigen.org/node/35
Brake I, von Tschirnhaus M (2010). ZooKeys 50: 91–96. doi: 10.3897/zookeys.50.505. http://milichiidae.info/node/14995
The URLs are live (updated) versions of these papers.
39. Acknowledgements
Scratchpads technical development
- Simon Rycroft, Ben Scott, Ed Baker, Alice Heaton & Katherine Bouton
Scratchpads outreach
- Laurence Livermore, Isa van de Velde & Dimitris Koureas
e-Monocot
- Paul Wilkin & the Kew team, Charles Godfray & the Oxford team
ViBRANT
- Vince Smith, Dave Roberts & Lucy Reeve
Pensoft
- Lyubomir Penev and the team
Our 7000 users
40. Thank you
[Diagram: cycle of data collection & generation, data curation, data analysis and data publishing]
42. Authors and Contributors
[Diagram: template-based manuscript authoring, ending in "Manuscript ready to submit"]
Lead author: creation
Co-authors: Invite, Authoring
Contributors (mentor, linguistic editor, copy editor, potential reviewer, colleague/friend): Invite, Contributing
Manuscript types: Taxon treatment, Interactive key, Checklist, Data paper