Linked Open Data and
Systematic Taxonomy
Joel Richard
Smithsonian Libraries
richardjm@si.edu
A tale of two publications
In three acts
Who are the Smithsonian Libraries?
• 20 Libraries in the U.S. and Panama
• Supports research of staff and the public
• Strong effort to digitize pre-1923 texts
• Index Animalium and Taxonomic
Literature II are two examples
Joel Richard,
Disclaimer
We are still learning.
We are still building.
Act I: The Players
(or, identifying the data with which
we are working and their meaning
and usefulness to the scientific
community.)
Taxonomic Literature II
Essential Reference
Tool for Botanists
Botanists/Authors
and Publications
from 1753–1940
Multiple indexes, “unique identifiers”
It is a “database in book form”
Index Animalium
Genus name, author
& citation for
430,000 animals
Covers Publications
from 1758–1850
Also a database, but
many challenges
still exist in the data.
Act II: The Linking
(or, identifying those data elements to
be linked, inherent challenges of
parsing OCR text, and identifying
linkable remote data sources)
Linkable Data Elements
foaf:lastName, foaf:familyName
foaf:firstName, foaf:givenName
foaf:name, skos:prefLabel
bio:birth
bio:death
skos:definition
tl2:personAbbreviation
tl2:titleNumber
dc:title
event:place
dc:publisher
dc:created
tl2:titleAbbreviation
http://library.si.edu/tl2/author/darwin
RDF Type = foaf:Person
http://library.si.edu/tl2/title/origin…
RDF Type = bibo:Book
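
As a minimal sketch of how the elements above could become linked data, the following Python builds a few triples for the author URI shown on the slide and serializes them as N-Triples. The name values ("Charles", "Darwin") and all helper names are assumptions for illustration only. The tl2: properties are left out because their namespace URI is not given here, and the truncated title URI is not used.

```python
# Namespace URIs for the public vocabularies named on the slide.
FOAF = "http://xmlns.com/foaf/0.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Author URI exactly as it appears on the slide.
DARWIN = "http://library.si.edu/tl2/author/darwin"


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def author_triples(subject, given, family):
    """Hypothetical helper: describe one author as a foaf:Person."""
    full = given + " " + family
    s = uri(subject)
    return [
        (s, uri(RDF + "type"), uri(FOAF + "Person")),
        (s, uri(FOAF + "givenName"), literal(given)),
        (s, uri(FOAF + "familyName"), literal(family)),
        (s, uri(FOAF + "name"), literal(full)),
        (s, uri(SKOS + "prefLabel"), literal(full)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(" ".join(t) + " ." for t in triples)


# Illustrative values; the real records would come from the parsed TL2 text.
print(to_ntriples(author_triples(DARWIN, "Charles", "Darwin")))
```

Plain strings are used here to avoid any library dependency; an RDF toolkit would normally handle serialization and escaping.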
Challenges with Our Data
• Errors in the Corrected OCR
• Challenges in Parsing Citations
• The 80/20 rule: manually making the
connections that automated means
cannot make
• Finding suitable sources of data to
link to. (DBpedia? VIAF? EOL? Others?)
Linked Data Sources
Low-Hanging Fruit:
• DBpedia
• OCLC WorldCat
• Biodiversity Heritage Library
• Virtual International Authority File
• Encyclopedia of Life
• Library of Congress Subject Headings
• GeoNames
• Open Library
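
One way to picture linking out to sources like these is a set of statements that tie a local URI to its counterpart elsewhere. The sketch below assumes owl:sameAs as the linking property and assumes that the slide's author URI corresponds to the DBpedia resource for Charles Darwin; neither choice is stated in the slides, and the function name is hypothetical. A weaker property such as skos:closeMatch may suit uncertain matches better.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_links(local_uri, remote_uris):
    """Hypothetical helper: emit one N-Triples statement per remote URI.

    Duplicates and self-links are skipped so the output stays clean.
    """
    seen = set()
    lines = []
    for remote in remote_uris:
        if remote == local_uri or remote in seen:
            continue
        seen.add(remote)
        lines.append("<%s> <%s> <%s> ." % (local_uri, OWL_SAME_AS, remote))
    return lines


# Local URI from the slides; the DBpedia match is an assumption for illustration.
local = "http://library.si.edu/tl2/author/darwin"
candidates = ["http://dbpedia.org/resource/Charles_Darwin"]

for line in same_as_links(local, candidates):
    print(line)
```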
Act III: The Sum of the Parts
(or, our goals and desires for this
data, what it means to the linked
data world and the scientific
community in general)
What’s the point?
• This data may already exist online.
• It may not always be as accurate
as science requires.
• We are in a position to be the
authoritative source for this
information.
• Linked Data allows it to be easily
reused and shared.
Danaus plexippus
Index Animalium, Systema Naturae, etc.
Aimée Antoinette
Camus
(botanist)
Your Local Library
One Example of Reuse
Ryan Schenk
http://synynyms.com/
Thank you!
Joel Richard
RichardJM@si.edu
http://library.si.edu/staff/joel-richard
http://slideshare.net/joelrichard

Linked Open Data and Systematic Taxonomy

  • 1. Linked Open Data and Systemic Taxonomy Joel Richard Smithsonian Libraries richardjm@si.edu A tale of two publications In three acts
  • 2. Who are the Smithsonian Libraries? • 20 Libraries in the U.S. and Panama • Supports research of staff and the public • Strong effort to digitize pre-1923 texts • Index Animalium and Taxonomic Literature II are two examples Joel Richard,
  • 3. Disclaimer We are still learning. We are still building.
  • 4. Act I: The Players (or, identifying the data with which we are working and their meaning and usefulness to the scientific community.)
  • 5. Taxonomic Literature II Essential Reference Tool for Botanists Botanists/Authors and Publications from 1753–1940 Multiple indexes, “unique identifiers” It is a “database in book form”
  • 8. Index Animalium Genus name, author & citation for 430,000 animals Covers Publications from 1758–1850 Also a database, but many challenges still exist in the data.
  • 10. Act II: The Linking (or, identifying those data elements to be linked, inherent challenges of parsing OCR text, and identifying linkable remote data sources)
  • 12. foaf:lastName, foaf:familyName foaf:firstName, foaf:givenName foaf:name, skos:prefLabel bio:birth bio:death skos:definition tl2:personAbbreviation tl2:titleNumber dc:title event:place dc:publisher dc:created tl2:titleAbbreviation http://library.si.edu/tl2/author/darwin RDF Type = foaf:Person http://library.si.edu/tl2/title/origin… RDF Type = bibo:Book
  • 13. Challenges with Our Data • Errors in the Corrected OCR • Challenges in Parsing Citations • The 80/20 rule: manually making connections unable to be made by automated means • Finding suitable sources of data to link to. (DBPedia? VIAF? EOL? Others?)
  • 14. Linked Data Sources Low-Hanging Fruit: • DBPedia • OCLC WorldCat • Biodiversity Heritage Library • Virtual International Authority File • Encyclopedia of Life • Library of Congress Subject Headings • GeoNames • Open Library
  • 15. Act III: The Sum of the Parts (or, our goals and desires for this data, what it means to the linked data world and the scientific community in general)
  • 16. What’s the point? • This data may already exist online. • It may also not always be as accurate as needed for science. • We are in a position to be the authoritative source for this information. • Linked Data allows it to be easily reused and shared.
  • 17. Danaus plexippus Index Animalium Systema Naturae, etc Aimée Antoinette Camus (botanist) Your Local Library
  • 18. One Example of Reuse Ryan Schenk http://synynyms.com/
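To make the predicates on slide 12 more concrete, here is a minimal sketch of how the Darwin author record might be written out as N-Triples using only the Python standard library. The author URI and the predicate names come from the slide; the `tl2` namespace URI, the helper function names, and the choice of which predicates to emit are assumptions made for illustration, not the Libraries' actual implementation.

```python
# Prefix -> namespace URI. The rdf, foaf and skos entries are the published
# namespaces of those vocabularies; the "tl2" URI is a hypothetical
# placeholder, since the presentation does not give one.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "tl2": "http://example.org/tl2/terms#",  # hypothetical
}


def expand(curie):
    """Expand a prefixed name such as 'foaf:name' into a full URI."""
    prefix, local = curie.split(":", 1)
    return NAMESPACES[prefix] + local


def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject, predicate, obj):
    """Return one N-Triples line; obj must already be a formatted term."""
    return f"<{subject}> <{expand(predicate)}> {obj} ."


def author_triples(uri, given, family, abbreviation):
    """Describe one TL-2 author with predicates listed on slide 12."""
    full_name = f"{given} {family}"
    return [
        triple(uri, "rdf:type", f"<{expand('foaf:Person')}>"),
        triple(uri, "foaf:givenName", literal(given)),
        triple(uri, "foaf:familyName", literal(family)),
        triple(uri, "foaf:name", literal(full_name)),
        triple(uri, "skos:prefLabel", literal(full_name)),
        triple(uri, "tl2:personAbbreviation", literal(abbreviation)),
    ]


darwin = author_triples(
    "http://library.si.edu/tl2/author/darwin", "Charles", "Darwin", "Darwin"
)
print("\n".join(darwin))
```

The birth and death predicates are left out on purpose: how to model a year-only date is one of the open questions raised in the notes, so the sketch avoids guessing at an answer.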

Editor's Notes

  1. Originally this presentation was going to center around a discussion of our conversion of TL2 to linked data and what we learned, but I felt that it would be better to use it as an example of things to keep in mind when creating your own data sets.
  2. Situated at the center of the world's largest museum complex, the Smithsonian Libraries forms a vital part of the research, exhibition, and educational enterprise of the Institution. The Libraries unites 20 libraries into one system supported by central collections support services. We maintain publication exchanges with more than 4,000 institutions worldwide that supply Smithsonian scientists and curators with current periodicals, exhibition catalogs, and professional society publications. Through preservation treatments, experts work to save the Smithsonian's 1.5 million printed books and manuscripts for future generations. Our Digital Library creates electronic versions of rare books and other distinctive collections, as well as exhibitions and specialized finding aids. We can be found on the web at http://library.si.edu
  3. I dislike disclaimers, but we’re still new to linked open data and are learning as we go. The idea of LOD has been around for several years now, so we are also playing a bit of catch-up. Our first goals are to get some data online, then start linking our data out to other sources and encourage others to link to us. We don’t yet know how our data relates to others. It’s not scientific data created as part of a research project per se, but initially we see it as valuable, useful information at least for some segments of the research world.
  4. So as an example of how to create a data set, I’ll use Taxonomic Literature II. It is a fifteen-volume guide to the literature of systemic botany published between 1753 and 1940. It contains almost 10,000 authors and about 37,000 publications. The reason to focus on TL-2 is that we aim to be the authority on the web for this information. We have received permission from the IAPT (International Association for Plant Taxonomy) to digitize and release this information on the web under an open license. TL-2 is used by many, perhaps most, botanists, and their work is made easier by this data being online. Prior to 2012 this information was either located in a library or locked behind a paywall of sorts.
  5. This is a page of TL-2 showing Charles Darwin and On the Origin of Species, with the immediately visible items that can be parsed and turned into Linked Data highlighted. There is other data on the page that could be turned into linked data, but at this time we have only parsed the highlighted data. Clearly, moving from something such as a printed book to a Linked Open Data set is an arduous task. If you are working on creating your own data sets, your experiences will differ depending on the source(s) of your data. One important thing to note here is the “Darwin” in parentheses, which is a unique abbreviation for an author. Each author has one. Another important item is the “1313” identifying the title, On the Origin of Species. Each publication in TL-2 has its own number. There are about 9,900 authors and 37,000 titles in all.
  6. This is our current website, showing a sample of the search results for Charles Darwin. This is not Linked Data. You can find this page at: http://www.sil.si.edu/digitalcollections/tl-2/
  7. Index Animalium, published in the late 1800s and early 1900s, contains 430,000 species names from 7,000 scientific volumes published between 1758 and 1850. Charles Davies Sherborn dedicated much of his life to this work. The volumes consist of the index to species, with one species plus citation per line, and a bibliography listing the titles that Sherborn read. Challenges in the data include inconsistent citation formats, two kinds of abbreviations (in the index and in the bibliography), and errors introduced during the printing process.
  8. This is one example of a page from Index Animalium for Papilio (Danaus) plexippus, also known as the Monarch butterfly. The abbreviations are: Linnaeus = Carl Linnaeus; Syst. Nat. = Systema Naturae; Ed 10 = 10th edition; 1758 = publication year; 471 = page 471. The entry also cites the 12th edition, published in 1767, page 767.
  9. Identified here are the easy-to-identify data elements that can be brought to linked data. We still need to contend with the challenges of parsing these into actual citations. The TL-2 data at the top has already been parsed and loaded into a database. Index Animalium is posing a greater challenge and will take longer to complete.
  10. A further breakdown of our data for TL-2 into linked data, showing the predicates we might use for each element. Again, the items in orange are specific to TL-2 and may not exist in other LOD data sets. For example, the FOAF vocabulary has date of birth, but can we use only a year in that field? Will that foul up other computers? FOAF also doesn’t include date of death, which we definitely have. What predicate do we use? Do we create our own ontology and publish it? (Probably.) Finally, we haven’t yet begun a formal analysis of which existing ontologies might fit our needs.
  11. 80/20 Rule: You spend 20% of your time on 80% of the work and 80% of your time on the remaining 20% of the work. We are at that point with Index Animalium. We would like to do further parsing of the TL-2 data, but it will pose challenges similar to those of Index Animalium.
  12. Some potential sources of data that we can link to. We’d like to one day have some of these link back to us, thereby completing the circuit for a linked data web of knowledge.
  13. This is what we would like to do: A researcher enters a botanist name or a species name and is taken directly to the page in the book referenced by that entry. If the book is not known to be digitized and online, then we can redirect them to OCLC WorldCat to find a copy of that book in their local library. This is a great improvement for those who wouldn’t normally have access to these books in their local library.
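The lookup described in note 13 can be sketched roughly as follows. This is only an illustration under stated assumptions: the citation fields follow the breakdown in note 8, but the `CITATIONS` and `DIGITIZED` tables, the `example.org` page URL, the function names, and the use of a WorldCat keyword-search URL as the fallback are all hypothetical stand-ins, not the Libraries' actual service.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Citation:
    """One parsed Index Animalium citation (fields as in note 8)."""
    author: str
    title: str
    edition: str
    year: int
    page: int


# Hypothetical parsed index: species name -> citations for that name.
# The two entries are the 10th and 12th edition citations from note 8.
CITATIONS = {
    "Danaus plexippus": [
        Citation("Carl Linnaeus", "Systema Naturae", "10", 1758, 471),
        Citation("Carl Linnaeus", "Systema Naturae", "12", 1767, 767),
    ],
}

# Hypothetical table of volumes known to be digitized:
# (title, edition) -> URL template for a single page.
DIGITIZED = {
    ("Systema Naturae", "10"):
        "https://example.org/books/systema-naturae-ed10/page/{page}",
}

# Assumed fallback: a plain keyword search on WorldCat.
WORLDCAT_SEARCH = "https://www.worldcat.org/search"


def link_for(citation):
    """Return a page link if the volume is digitized, else a library search."""
    template = DIGITIZED.get((citation.title, citation.edition))
    if template is not None:
        return template.format(page=citation.page)
    query = urlencode({"q": f"{citation.title} {citation.author}"})
    return f"{WORLDCAT_SEARCH}?{query}"


def resolve(name):
    """Return one link per citation for a species name (empty if unknown)."""
    return [link_for(c) for c in CITATIONS.get(name, [])]


for url in resolve("Danaus plexippus"):
    print(url)
```

Keeping the digitized-volume table separate from the citation index means that a newly scanned book only requires one new table entry; every citation pointing at it would then resolve to a page link instead of the library search.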