Challenges in developing names services - RDA

Explanation of how names data are gathered, structured, standardised and annotated, and how these data are mobilised using names services. The challenges are around credit and attribution, and usage metrics on services. Presented at the Research Data Alliance Plenary 5, 9-11 March 2015, San Diego.

Challenges in developing names services
Nicky Nicolson, RBG Kew
@nickynicolson
Research Data Alliance Plenary 5, San Diego 9-11 March 2015

How the data are assembled
- Dedicated, long running indexing effort
- Initiated by Darwin c 1885

Indexers scan journals (and books, and
e-journals...)

Summary
• Created structured, standardised, annotated data
• Data can be the “fuel” for names services
• How to ensure that the creators / annotators get credit?
– Attribution
– Usage metrics

Translate an attempt at recording a name as text to an identifier
Schinus longifolius var. paraguariensis (Hassler) F. Barkley
229196-2
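
The text-to-identifier step on this slide can be sketched in a few lines of Python. This is only a toy illustration, not the matching logic of the Kew service: it assumes a tiny in-memory index and uses approximate string matching from the standard library. `NAME_INDEX`, `normalise` and `reconcile` are hypothetical names, and the second index entry is a made-up placeholder.

```python
from difflib import get_close_matches

# Toy name index: normalised name string -> identifier.
# The first entry is the name/identifier pair shown on the slide; the second
# is an invented placeholder so the index holds more than one candidate.
NAME_INDEX = {
    "schinus longifolius var. paraguariensis (hassler) f. barkley": "229196-2",
    "exampleus fictus l.": "hypothetical-0001",
}


def normalise(name: str) -> str:
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def reconcile(name: str, cutoff: float = 0.8):
    """Return the identifier of the closest indexed name, or None."""
    matches = get_close_matches(
        normalise(name), list(NAME_INDEX), n=1, cutoff=cutoff
    )
    return NAME_INDEX[matches[0]] if matches else None
```

With this index, a slightly misspelled attempt such as `reconcile("Schinus  longifolius var. paraguarensis (Hassler) F.Barkley")` still resolves to `229196-2`, while an unrelated string falls below the cutoff and returns `None`.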

We need to know how the data are being used...
... to ensure the workers who scan, structure and annotate the data get credit for their work, and to get metrics on its downstream use.
Google Analytics for services?

1. Challenges in developing names services Nicky Nicolson, RBG Kew @nickynicolson Research Data Alliance Plenary 5, San Diego 9-11 March 2015

2. How the data are assembled - Dedicated, long running indexing effort - Initiated by Darwin c 1885

3. Indexers scan journals (and books, and e-journals...)

4. Locate nomenclatural acts...

6. Extract and structure the data

7. Data are annotated immediately

9. Summary • Created structured, standardised, annotated data • Data can be the “fuel” for names services • How to ensure that the creators / annotators get credit? – Attribution – Usage metrics

10. http://data1.kew.org/reconciliation

11. OpenRefine

12.

13.

14.

15.

16. Translate an attempt at recording a name as text to an identifier Schinus longifolius var. paraguariensis (Hassler) F. Barkley 229196-2

17.

18.

19.

20. We need to know how the data are being used... ... To ensure the workers who scan, structure, annotate the data get credit for their work, and metrics on its downstream use. Google Analytics for services?

Editor's Notes

Data are structured, stored, presented for machine to machine use in RDF.
A legalistic code governs how new names are brought into being. Editors interpret the code, apply it to the nomenclatural acts that they collect, and annotate accordingly.
Web page view shows standardised data plus expert annotation
We now have a populated dataset: data have been extracted, structured, standardised and annotated. These data can be the fuel for names services, but how do we get attribution and credit back to those who have structured, standardised and annotated them?
Example of services run at organisation scale – primarily designed for our own researchers.
Data manipulation / cleaning program called OpenRefine used as our working environment. We can make simple queries to IPNI (for nomenclature) and The Plant List (for taxonomy). It can read data from lots of formats, like CSV and Excel. It looks like a spreadsheet, but it has lots of features for cleaning up messy data. The one we want is to query a reconciliation service – a special website that can be given a piece of text like a plant name, and returns an identifier, like an IPNI id.
Example – a dataset including a column of scientific plant names
Select a reconciliation service against IPNI
The text representations of scientific names are passed to the service (using JSON over HTTP), and IPNI ids, with hyperlinks to IPNI, are brought back.
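
The JSON exchange can be sketched as follows. The shape is assumed to follow the batch format commonly used by OpenRefine reconciliation services (a `queries` parameter holding keyed query objects, and a reply with a `result` list per key); whether the service in this talk used exactly this shape is an assumption. No request is sent here, `SAMPLE_RESPONSE` is an invented reply for illustration, and all function names are hypothetical.

```python
import json
from urllib.parse import urlencode


def build_queries(names):
    """Map each name to a keyed query object (q0, q1, ...)."""
    return {f"q{i}": {"query": name} for i, name in enumerate(names)}


def encode_request_body(names) -> str:
    """Form-encode the whole batch as a single 'queries' parameter."""
    return urlencode({"queries": json.dumps(build_queries(names))})


def best_ids(response: dict) -> dict:
    """Pick the top-scoring candidate id per query key (None if no candidates)."""
    out = {}
    for key, payload in response.items():
        candidates = payload.get("result", [])
        top = max(candidates, key=lambda c: c.get("score", 0), default=None)
        out[key] = top["id"] if top else None
    return out


# Illustrative reply only: the score and match flag are invented.
SAMPLE_RESPONSE = {
    "q0": {
        "result": [
            {
                "id": "229196-2",
                "name": "Schinus longifolius var. paraguariensis",
                "score": 100,
                "match": True,
            }
        ]
    },
    "q1": {"result": []},
}
```

Keying each query (`q0`, `q1`, ...) lets a whole column of names travel in one request while each returned identifier can still be matched back to its row.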
The dataset is now augmented with an ID for each reconciled name.
So we are here – passed in a name, got back an identifier. Now to query TPL for the taxonomic status.
We’ve populated our data resources that hold scientific names with the identifiers for those names, so we can now do the distributed equivalent of a database join. We have a separate resource (“TPL”) which organises the scientific names into a taxonomy. Our researcher can now choose “Add columns from TPL…”
TPL gives us a list of “properties” – things that it knows about names. Our researcher has chosen “taxonomic status”, and there’s a preview on the right.
Our dataset is now augmented with name identifiers, and we’ve used those identifiers to go to a separate resource (TPL) to get the taxonomic status.
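
The "distributed equivalent of a database join" described in these notes can be sketched as a left join on the shared identifier. This is a minimal illustration under stated assumptions: the rows, identifiers and status values are invented placeholders, the `taxonomy` dictionary stands in for the separate resource, and `add_column` is a hypothetical helper rather than an OpenRefine or TPL function.

```python
# Local dataset rows, already augmented with a name identifier by reconciliation.
# All names, identifiers and statuses below are invented placeholders.
rows = [
    {"name": "Exampleus primus", "name_id": "hyp-1"},
    {"name": "Exampleus secundus", "name_id": "hyp-2"},
    {"name": "Exampleus tertius", "name_id": None},  # no identifier was found
]

# Stand-in for the separate taxonomic resource, keyed by the same identifier.
taxonomy = {
    "hyp-1": {"taxonomic_status": "Accepted"},
    "hyp-2": {"taxonomic_status": "Synonym"},
}


def add_column(rows, resource, prop):
    """Left join: copy `prop` from `resource` onto each row via its name_id."""
    joined = []
    for row in rows:
        record = resource.get(row["name_id"], {})
        # Build a new dict so the input rows are left unchanged.
        joined.append({**row, prop: record.get(prop)})
    return joined


augmented = add_column(rows, taxonomy, "taxonomic_status")
```

Rows without an identifier keep their place in the dataset and simply receive an empty value in the new column, which mirrors what a researcher sees when a name could not be reconciled.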
