Building a “names backbone”

Nicky Nicolson, RBG Kew
A names backbone

== “an environment for the management of multiple overlapping classifications and tracking how these change over time”
Not a monolith:
   • Built on a layered view of the domain – clearly separating names and taxonomy
   • Names form the objective basis for higher layers
The current situation…
Many overlapping systems, few links
… and what we’re aiming for:
Authoritative data, reduced duplication, many more links
Names backbone: a layered environment
Name occurrence layer AKA “Nomen-clutter”

== any attempt at the transcription of a name
Names layer

Holds objective published facts about a name:
- Orthography
- Authorship
- Protologue reference
- Type citation
- Objective synonymy
Concepts layer

Hypotheses draw names together to form concepts via heterotypic synonymy
The (current) problem:
Most people want to operate at concept level…
… but have to start right down at the lowest level
Solving the problem…

We need to provide ways to allow people to better navigate between the layers, and better focus their efforts – e.g. build classifications using the same objective bases.

We started with a blank sheet of paper – it’s hard to get existing systems to conform to the layering that we need.
Drawbacks of data models used to date
• conflate the storage of names and concepts
• store only a single classification
• store only the end product of a thought process, not work in progress
• are difficult to version
• are difficult to query effectively (for hierarchies etc.)
A new (graph) model

• Stores data as graphs – composed of nodes and directed relationships
• Both nodes and relationships can hold data as properties
• Supports highly interconnected data
• Supports self-referential data
• Optimised for queries on relationships
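
To make the bullet points above concrete, here is a minimal in-memory sketch of a property graph in Python. It is not the storage technology used in the project, which the slides do not name. The class names (`Node`, `Relationship`, `Graph`) and helper methods are hypothetical and exist only to illustrate nodes, directed relationships, and properties on both.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A graph node; `props` holds arbitrary key/value data."""
    id: str
    props: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed edge from `start` to `end` with its own properties."""
    type: str
    start: str  # id of the source node
    end: str    # id of the target node
    props: dict = field(default_factory=dict)


class Graph:
    """Illustrative container only; a real graph database would index these."""

    def __init__(self):
        self.nodes = {}
        self.rels = []

    def add_node(self, node_id, **props):
        self.nodes[node_id] = Node(node_id, props)
        return self.nodes[node_id]

    def relate(self, start, rel_type, end, **props):
        rel = Relationship(rel_type, start, end, props)
        self.rels.append(rel)
        return rel

    def outgoing(self, node_id, rel_type=None):
        # rel_type=None means "any relationship type"
        return [r for r in self.rels
                if r.start == node_id and rel_type in (None, r.type)]

    def incoming(self, node_id, rel_type=None):
        return [r for r in self.rels
                if r.end == node_id and rel_type in (None, r.type)]


g = Graph()
g.add_node("n1", fullName="Aus bus Jones")
g.add_node("n2", fullName="Xus yus Smith")
# Names link to other names, and the link itself carries data.
g.relate("n1", "accepted_as", "n2", accordingTo="WCS")
```

Queries such as `g.incoming("n2", "accepted_as")` then walk relationships directly rather than joining tables, which is the property the last bullet refers to.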
Using a graph model to hold concept data: Attempt #1
Two nodes, with name + status properties, and an “accepted_as” link.
== a naïve use of the graph model: status is stored in 2 places (explicitly in the status property, implicitly by participation in the “accepted_as” relationship)
Using a graph model to hold concept data: Attempt #2
More strict about the separation of the nomenclatural information (the nodes) and the taxonomic information (the relationships between nodes), but the link is still very sparse…
Using a graph model to hold concept data: Attempt #3
Add an attribute to indicate which classification asserts this subjective relationship.
Taxonomic status of a name is inferred from its participation in a subjective taxonomic relationship.
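
The inference rule in Attempt #3 can be sketched in a few lines of Python. This assumes the convention given in the speaker notes: a name with an incoming accepted_as link is accepted, and a name with an outgoing one is a synonym. The `Rel` tuple, the `status` function, the "unplaced" label, and the pairing of the two example names are assumptions made for illustration.

```python
from collections import namedtuple

# One subjective relationship, tagged with the classification asserting it.
Rel = namedtuple("Rel", "start type end according_to")

rels = [
    # Assumed pairing: the notes name both taxa but do not say they are linked.
    Rel("Aus bus (L.) K.", "accepted_as", "Cus bus Jones", "WCS"),
]


def status(name, classification, rels):
    """Infer the status of `name` under one classification from its links."""
    mine = [r for r in rels
            if r.type == "accepted_as" and r.according_to == classification]
    if any(r.start == name for r in mine):
        return "synonym"    # outgoing accepted_as link
    if any(r.end == name for r in mine):
        return "accepted"   # incoming accepted_as link
    return "unplaced"       # hypothetical label: no opinion recorded


print(status("Cus bus Jones", "WCS", rels))
print(status("Aus bus (L.) K.", "WCS", rels))
```

Because no status property is stored on the name itself, the duplication seen in Attempt #1 cannot arise, and the same name can have a different status under another classification.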
Links become more interesting than the nodes
Expand the data held on the subjective relationship to allow it to be computationally assessed
Multiple opinions – using the same name nodes
Reuse the same name nodes (the basic facts) to store multiple opinions
Relationships held

Objective, e.g.:
• Combination-basionym
• Later_homonym
• Alternative_name_for
• …
Subjective, e.g.:
• Parent_child (taxonomic placement)
• Synonym (heterotypic synonymy)
• …
Objective relationships “stronger” than subjective
Supporting versioning

We keep all relationships; modifications to the data just mark relationships as no longer current.
We can always resurrect the state of the graph
== persistent identification of taxon concepts
Versioning = name id + classification + state

State1, according to WCS:
Xus yus Smith (A)
 = Aus bus Jones (S)
State2, according to WCS:
Xus zus White (A)
 = Xus yus Smith (S)
 = Aus bus Jones (S)

We can always resurrect the state of the graph.
Versioning enables remote curation of the data
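
The State1/State2 example can be replayed with a small Python sketch of the "never delete, only mark as no longer current" idea. The integer state counter, the `added_in` and `retired_in` fields, and the helper names are assumptions; the slides do not say how states are actually recorded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionedRel:
    """An accepted_as link (synonym -> accepted name) that is never deleted."""
    start: str
    end: str
    according_to: str
    added_in: int                      # hypothetical state counter
    retired_in: Optional[int] = None   # None means still current

    def current_at(self, state):
        return self.added_in <= state and (
            self.retired_in is None or state < self.retired_in)


def snapshot(rels, classification, state):
    """Resurrect the links one classification asserted at a given state."""
    return {(r.start, r.end) for r in rels
            if r.according_to == classification and r.current_at(state)}


# State 1: Aus bus Jones is a synonym of Xus yus Smith.
rels = [VersionedRel("Aus bus Jones", "Xus yus Smith", "WCS", added_in=1)]

# State 2: Xus zus White becomes accepted. The old link is retired, not removed.
rels[0].retired_in = 2
rels.append(VersionedRel("Xus yus Smith", "Xus zus White", "WCS", added_in=2))
rels.append(VersionedRel("Aus bus Jones", "Xus zus White", "WCS", added_in=2))

print(snapshot(rels, "WCS", 1))
print(snapshot(rels, "WCS", 2))
```

A client that stored the triple (name id, "WCS", 1) can still retrieve exactly the concept it linked to, even after the classification has moved on.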
What can be done with this kind of data model?
• Client systems can reliably connect to a version of a concept
• We can see how concepts change over time
• Researchers can query the data to compare classifications and identify areas of dispute
Longer term:
• Examine the “computed acceptance” rules used in TPL – could these be run on the relationships in the names backbone?
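
As a rough illustration of "compare classifications and identify areas of dispute", the following Python sketch compares where two classifications place each name. The second classification, "ListB", is invented for the example, and the tuple layout (synonym, accepted name, classification) is an assumption rather than the project's data format.

```python
def placements(rels, classification):
    """Map each synonym to its accepted name under one classification."""
    return {start: end for start, end, source in rels
            if source == classification}


def disputes(rels, a, b):
    """Names that classifications `a` and `b` treat differently."""
    pa, pb = placements(rels, a), placements(rels, b)
    names = set(pa) | set(pb)
    return {n: (pa.get(n), pb.get(n))
            for n in sorted(names) if pa.get(n) != pb.get(n)}


rels = [
    ("Aus bus Jones", "Xus zus White", "WCS"),
    ("Xus yus Smith", "Xus zus White", "WCS"),
    # "ListB" is a hypothetical second opinion over the same name nodes.
    ("Aus bus Jones", "Xus zus White", "ListB"),
]

print(disputes(rels, "WCS", "ListB"))
```

Here the two lists agree on Aus bus Jones, while only WCS sinks Xus yus Smith into synonymy, so that name is reported as disputed. The comparison works only because both opinions refer to the same name nodes.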
Building it: we first focussed on the top two layers…
… but we need a way to manage the name occurrences
Building the name occurrence layer:

Populating it:
• Seed it with an authoritative set of names
• Add the version history of these names – how were these names transcribed in the past?
Using it:
• Load candidate name occurrences and match them, storing metrics on the match.
Reviewing – a “data improvement” team to:
• Verify the matches, focussing on ambiguity (that which can’t be done computationally) == annotation
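
The "match and store metrics" step might look something like the Python sketch below. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, since the slides do not say which matching method was used. The 0.9 review threshold and the function names are assumptions.

```python
from difflib import SequenceMatcher

# Seed set: stands in for the authoritative names layer.
names = ["Xus yus Smith", "Aus bus Jones", "Xus zus White"]


def best_match(occurrence, names):
    """Match one name occurrence and keep the score as a stored metric."""
    scored = [(SequenceMatcher(None, occurrence.lower(), n.lower()).ratio(), n)
              for n in names]
    score, name = max(scored)
    return {"occurrence": occurrence, "matched": name, "score": round(score, 3)}


def needs_review(match, threshold=0.9):
    """Flag weak matches for the data improvement team (threshold assumed)."""
    return match["score"] < threshold


for candidate in ["Xus yus Smith", "Xus yuss Smth", "Quercus robur"]:
    m = best_match(candidate, names)
    print(m, "review" if needs_review(m) else "ok")
```

Keeping the score alongside the link lets reviewers concentrate on low-confidence or ambiguous matches instead of re-checking every occurrence. A production matcher would probably treat genus, epithet, and authorship separately rather than comparing whole strings.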
Services: name occurrence layer

- Data input / output: DwCA
- Linking and reviewing links
- RSS feeds to indicate activity
Services: names layer
- Data input / output: TCS
- Propose addition / edit of names
- RSS feeds to indicate activity
Services: concepts layer
- Data input / output: TCS
- Create classifications using names
- Propose addition / edit of names to names layer
- RSS feeds
The names backbone is an extensible environment:
• Links “name occurrences” to names
• Separates curation of names and concepts
• Supports building concepts on the same objective basis: enables sharing and reuse of foundation data.
• Allows many relationships to form concepts – supports multiple overlapping classifications
• Allows distributed curation of the concepts.


Editor's Notes

  1. DEFRA funded project – for Kew internal information management, but applicable more widely. Staffed with a development team of 5, and a data improvement team of 4, plus people working on project management and business change. Names are crucial to Kew’s scientific work and day to day management of the collections. We have many systems which hold nomenclatural and taxonomic information.
  2. Many systems, few links. Huge overlap in data and functionality. A single scientific question can be answered in multiple different ways.
  3. Name occurrence layer – any informal attempt at the transcription of a name. Some name occurrences are code governed names – eligible to appear in the next layer – the names layer – this holds all the objective published facts about a name – its orthography, authorship, protologue reference, type citation and objective synonymy. Concepts layer – hypotheses draw these names together to form concepts via heterotypic synonymy. Most people are interested in working with concepts. Unfortunately most people are only armed with name occurrences.
  5. IPNI / IF / Zoobank
  6. WSCP etc
  7. Most scientific questions operate at the concept level...
  9. … Fun board game for a small child, big waste of effort when we are trying to do science. We need to provide ways to allow people to better navigate between the layers, and better focus their efforts – e.g. build classifications using the same objective bases.
  10. We’ve investigated using a different storage technology that stores data as graphs (structures composed of nodes and directed relationships between nodes) rather than in a relational structure. Both nodes and relationships can hold data in the form of properties. These are strongly typed, and indexed for retrieval performance. Drawbacks? A very different way of thinking about the data. Needs an API to interact with the underlying storage. But: the graph model gets us a long way – it’s a natural way to represent the data.
  11. In the first use of a graph model, we imported some data from The Plant List. We created two nodes, each with fullName and status properties, and created an “accepted_as” link between the two to represent the fact that one name is a synonym of the other. This is quite a naïve use of the graph model – and it repeats a problem seen with the WCS data structure, namely that the status is effectively stored in two places – explicitly in the status property on the name node, and implicitly by the participation of the name node in an accepted_as relationship.
  12. The second attempt was more strict about the separation of the nomenclatural information (the nodes) and the taxonomic information (the relationships between nodes). The benefit of a graph model is that information can be stored on the relationships between nodes – so we can have an “accordingTo” property on subjective relationships like “acceptedAs” and support many of these relationships to represent differing and potentially conflicting taxonomic opinions.
  13. Add an attribute to indicate which classification asserts this subjective relationship. Taxonomic status of a name is inferred from its participation in a subjective taxonomic relationship. We can query the graph database for the treatment of the name “Cus bus Jones” according to WCS and see that it is accepted (it has an incoming accepted_as link). Similarly, according to WCS, the name “Aus bus (L.) K.” is a synonym as it has an outgoing accepted_as link.
  14. Expand the data held on the subjective relationship to allow it to be computationally assessed
  15. “Nomenclutter”