STANFORD UNIVERSITY LIBRARIES
The Linked Data Snowball or
Why We Need Reconciliation
April 4th, 2016
THE AAC / GETTY WORKSHOP ON
RECONCILIATION OF LINKED OPEN DATA
Rob Sanderson / azaroth@stanford.edu / @azaroth42
web.stanford.edu/~azaroth/#me
azaroth42@gmail.com / +azaroth42
orcid: 0000-0003-4441-6852
http://www.informatik.uni-trier.de/~ley/pers/hd/s/Sanderson:Robert
http://academic.research.microsoft.com/Author/2765999
http://www.scopus.com/authid/detail.url?authorId=8988953600
www.researchgate.net/profile/Rob_Sanderson
facebook.com/rob.sanderson / linkedin.com/pub/robert-sanderson/1/172/5a6/
rsanderson@lanl.gov / azaroth@liv.ac.uk
public.lanl.gov/rsanderson / gondolin.hist.liv.ac.uk/~azaroth
rds23@student.canterbury.ac.nz / azaroth@es-net.co.nz
Linked Data?
1.  Use URIs as names for things
2.  Use HTTP URIs so that people can look up those names
3.  When someone looks up a URI, provide useful
information, using the standards
4.  Include links to other URIs, so they can discover
more things
5.  Link your data to other people's data to provide
context
Why So Many?
Do I know the URI, or can I find it? No: new URI
Understand and agree with the model used? No: new URI
Understand and agree with the description? No: new URI
Agree the URI identifies the same entity? No: new URI
Agree description is complete? No: new URI
Yes to all: Hooray, you reused a URI!
Now start again with the next entity :(
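The flowchart above can be read as a short decision procedure. A minimal sketch in Python, assuming a hypothetical `decide` function that takes one yes/no answer per question; the function name and the return values are illustrative and not from the talk.

```python
# Questions from the "Why So Many?" flowchart, in order.
QUESTIONS = [
    "Do I know the URI, or can I find it?",
    "Understand and agree with the model used?",
    "Understand and agree with the description?",
    "Agree the URI identifies the same entity?",
    "Agree description is complete?",
]

def decide(answers):
    """Return ("reuse", None) if every answer is True, else ("mint", question).

    `answers` is a hypothetical list of booleans, one per question.
    """
    if len(answers) != len(QUESTIONS):
        raise ValueError("expected one answer per question")
    for question, answer in zip(QUESTIONS, answers):
        if not answer:
            # A single "No" anywhere is enough to mint yet another URI.
            return ("mint", question)
    return ("reuse", None)

print(decide([True, True, False, True, True]))
print(decide([True] * 5))
```

Reuse happens only when all five answers are "Yes", which is why new URIs are the common outcome.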
Many Special and Unique Snowflakes
Become a Huge Snowball of Technical Debt
Option 1: Balance the Equation
Cost(Create URI) + Cost(Maintain URI)
<=
Cost(Find Good URI) + Cost(Understand Model) + Cost(Understand Content)
+ min( Risk(Reliability) + Cost(Network Latency),
       Risk(Out of Date) + Cost(Cache Content) )
- Value(Connected Graph)
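To make the inequality concrete, here is a rough sketch that evaluates both sides. The function names and every number are made-up placeholders in arbitrary units, not measurements from the talk.

```python
def cost_of_minting(create, maintain):
    """Left-hand side: cost of creating and maintaining your own URI."""
    return create + maintain

def cost_of_reuse(find, model, content, reliability_risk, latency,
                  stale_risk, cache, graph_value):
    """Right-hand side: net cost of reusing someone else's URI."""
    # Either dereference live (reliability risk + latency) or keep a
    # local copy (risk of stale data + cache cost), whichever is cheaper.
    access = min(reliability_risk + latency, stale_risk + cache)
    return find + model + content + access - graph_value

# Hypothetical values, chosen only to illustrate the comparison.
mint = cost_of_minting(create=1, maintain=2)
reuse = cost_of_reuse(find=3, model=4, content=2, reliability_risk=2,
                      latency=1, stale_risk=1, cache=1, graph_value=5)

# While mint <= reuse, minting a new URI looks like the cheaper choice.
print(mint, reuse, "mint new URI" if mint <= reuse else "reuse existing URI")
```

Balancing the equation means lowering the terms on the reuse side, or raising the value of the connected graph, until reuse wins.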
Option 1 Likelihood
Botticelli: http://vocab.getty.edu/ulan/500015254 :)
Botticelli: http://vocab.getty.edu/ulan/500015254 :(
Option 2: Reconciliation
YCBA's URIs / Princeton's URIs
YCBA's Entities / Princeton's Entities
Shared Entities but not Shared URIs
1. Algorithmically discover this intersection, given the descriptions of the entities
2. Assert that the entity which two URIs identify is actually the same entity
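As a rough illustration of the two steps, the sketch below compares entity descriptions from two institutions and emits `owl:sameAs` statements for the matches. The example.org URIs, the records, and the match rule (same name words and same birth year) are all invented for illustration; a real reconciliation service would weigh far richer evidence.

```python
# Hypothetical descriptions keyed by each institution's own URI.
ycba = {
    "http://example.org/ycba/person/1": {"name": "Sandro Botticelli", "born": 1445},
    "http://example.org/ycba/person/2": {"name": "J. M. W. Turner", "born": 1775},
}
princeton = {
    "http://example.org/princeton/agent/a": {"name": "BOTTICELLI, Sandro", "born": 1445},
    "http://example.org/princeton/agent/b": {"name": "Claude Monet", "born": 1840},
}

def key(description):
    """Normalize a description into a comparable key (an assumed rule)."""
    words = description["name"].replace(",", " ").lower().split()
    return (frozenset(words), description["born"])

def reconcile(left, right):
    """Step 1: discover the intersection. Step 2: assert sameness."""
    index = {key(desc): uri for uri, desc in right.items()}
    triples = []
    for uri, desc in left.items():
        match = index.get(key(desc))
        if match is not None:
            triples.append((uri, "owl:sameAs", match))
    return triples

for s, p, o in reconcile(ycba, princeton):
    print(f"<{s}> {p} <{o}> .")
```

Only the Botticelli records share a key here, so a single statement links the two URIs and both institutions keep their own identifiers.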
Option 2a: Reconciliation
(distributed authority)
Option 2b: Reconciliation
(centralized authority)
Benefits of Reconciliation
End User:
•  Has access to more information, more easily, improving research,
discovery and navigation
•  Potential for new UIs, new research questions, reasoning
Institution:
•  Efficiency (= reduced cost) and improved quality of description
•  Increased prestige when descriptions are reused
•  Usage across the network is valuable business intelligence
Community:
•  Network effects spread faster and further, increasing awareness of
cultural heritage
•  Gives easier access to other communities' data
Real Benefit of Reconciliation
Reconciliation is a network damage limiting step
towards balancing Equation 1
By linking entity descriptions together:
•  the cost of discovery and understanding is reduced
•  the costs of creating and maintaining the resources are shared
across the community, not duplicated
•  the value of the connected graph is increased
•  the likelihood of new entities (requiring reconciliation) is reduced
But How Can A Machine Know??
Algorithms won't be perfect, but can be good enough.
•  What use cases will the reconciled data be used to fulfill?
•  What is the cost of a false positive for those use cases?
Precision: What % of matches are correct?
Recall: What % of the possible matches were found?
Can make trade-offs of precision vs recall for different use cases.
Machine can record its certainty, and policy can provide a threshold.
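The trade-off can be sketched numerically. In the example below the candidate pairs, their certainty scores, and the ground truth are hypothetical; the point is only how moving the policy threshold shifts precision against recall.

```python
def precision_recall(proposed, truth):
    """Return (precision, recall) for sets of proposed and true matches."""
    correct = proposed & truth
    # By convention here, an empty set of proposals scores 0.0 precision.
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(truth) if truth else 0.0
    return precision, recall

# Hypothetical matcher output: (pair, certainty recorded by the machine).
scored = [(("a1", "b1"), 0.95), (("a2", "b2"), 0.80),
          (("a3", "b9"), 0.60), (("a4", "b4"), 0.55)]
truth = {("a1", "b1"), ("a2", "b2"), ("a4", "b4"), ("a5", "b5")}

def accept(scored, threshold):
    """Policy step: keep only matches at or above the threshold."""
    return {pair for pair, certainty in scored if certainty >= threshold}

for threshold in (0.9, 0.5):
    p, r = precision_recall(accept(scored, threshold), truth)
    print(f"threshold={threshold}: precision={p:.2f} recall={r:.2f}")
```

With these invented numbers, the strict threshold accepts one pair and it is correct (precision 1.00, recall 0.25), while the looser threshold finds more true matches at the cost of one false positive (precision 0.75, recall 0.75).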
How Can We Improve It?
Several different relationships to express similarity:
•  owl:sameAs – always exactly the same (transitive)
•  skos:exactMatch – the same for most purposes (transitive)
•  skos:closeMatch – the same for some purposes (intransitive)
The context of a resource in the network is important
•  Starting simple with high precision gives better context, so the
results can be used to bootstrap iteratively and incrementally
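The practical difference between the transitive and intransitive relationships shows up when links are chained. A small sketch, assuming invented `ex:` identifiers and a hypothetical `clusters` helper: identifiers joined by `owl:sameAs` or `skos:exactMatch` are merged into one group, while a `skos:closeMatch` link never merges anything.

```python
TRANSITIVE = {"owl:sameAs", "skos:exactMatch"}

def clusters(links):
    """Group identifiers connected by transitive links (simple union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in links:
        if p in TRANSITIVE:
            parent[find(s)] = find(o)
        else:
            # Intransitive link: register both ends but do not merge them.
            find(s)
            find(o)

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return sorted(sorted(group) for group in groups.values())

links = [
    ("ex:a", "skos:exactMatch", "ex:b"),
    ("ex:b", "skos:exactMatch", "ex:c"),
    ("ex:c", "skos:closeMatch", "ex:d"),
]
print(clusters(links))
```

Here `ex:a` and `ex:c` end up in the same group without ever being linked directly, whereas `ex:d` stays on its own even though it is a close match for `ex:c`.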
Trust and Community
"Efficiency (= reduced cost) and improved quality of description"
•  Efficiency comes from not duplicating descriptive effort...
•  Which requires trusting other institutions in the community
•  We need to work together, not...
Entities to Reconcile
As a community, we need to pick where to start.
Suggest starting with least controversial / most unique:
•  Physical objects
•  People
•  Places
•  Events (specific, like Exhibitions)
A small sub-domain (by time?) to make overlap more likely
Q. Can I Reconcile a String?
Named Entity Recognition: "snowflake" = [thing] (strings to things)
Reconciliation: [thing] = [thing] (things to things)
The Hard Question
How can we be more useful than DBpedia
for our own entities?
•  Focus on unique selling points
•  Demonstrate value early,
both internally and to the broader community
•  By working together to increase the value of the network
STANFORD UNIVERSITY LIBRARIES
Thank You!
April 4th, 2016
Thank You!
rsanderson@getty.edu
April 25th, 2016
Based on my slides from the Andrew W. Mellon Foundation Reconciliation Workshop
With recognition and thanks to all of the participants

More Related Content

What's hot

Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?OCLC
 
Identifying The Benefit of Linked Data
Identifying The Benefit of Linked DataIdentifying The Benefit of Linked Data
Identifying The Benefit of Linked DataRichard Wallis
 
Web Driven Revolution For Library Data
Web Driven Revolution For Library DataWeb Driven Revolution For Library Data
Web Driven Revolution For Library DataRichard Wallis
 
Signposting for Repositories
Signposting for RepositoriesSignposting for Repositories
Signposting for RepositoriesMartin Klein
 
Schema.org - Extending Benefits
Schema.org - Extending BenefitsSchema.org - Extending Benefits
Schema.org - Extending BenefitsRichard Wallis
 
Linked Data in Libraries
Linked Data in LibrariesLinked Data in Libraries
Linked Data in LibrariesRichard Wallis
 
Telling the World and Our Users What We Have
Telling the World and Our Users What We HaveTelling the World and Our Users What We Have
Telling the World and Our Users What We HaveRichard Wallis
 
Schema.org - An Extending Influence
Schema.org - An Extending InfluenceSchema.org - An Extending Influence
Schema.org - An Extending InfluenceRichard Wallis
 
Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Asuncion Gomez-Perez
 
Schema.org: What It Means For You and Your Library
Schema.org: What It Means For You and Your LibrarySchema.org: What It Means For You and Your Library
Schema.org: What It Means For You and Your LibraryRichard Wallis
 
Linked Data, Library Users, and the Discovery Tools of the Future
Linked Data, Library Users, and the Discovery Tools of the FutureLinked Data, Library Users, and the Discovery Tools of the Future
Linked Data, Library Users, and the Discovery Tools of the FutureEmily Nimsakont
 
WorldCat, Works, and Schema.org
WorldCat, Works, and Schema.orgWorldCat, Works, and Schema.org
WorldCat, Works, and Schema.orgRichard Wallis
 
LD4L OCLC Data Strategy
LD4L OCLC Data StrategyLD4L OCLC Data Strategy
LD4L OCLC Data StrategyRichard Wallis
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosEUCLID project
 
Brief Introduction to Linked Data
Brief Introduction to Linked DataBrief Introduction to Linked Data
Brief Introduction to Linked DataRobert Sanderson
 
Contextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesContextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesRichard Wallis
 

What's hot (20)

Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?Linked Data Implementations—Who, What and Why?
Linked Data Implementations—Who, What and Why?
 
Identifying The Benefit of Linked Data
Identifying The Benefit of Linked DataIdentifying The Benefit of Linked Data
Identifying The Benefit of Linked Data
 
Web Driven Revolution For Library Data
Web Driven Revolution For Library DataWeb Driven Revolution For Library Data
Web Driven Revolution For Library Data
 
Signposting for Repositories
Signposting for RepositoriesSignposting for Repositories
Signposting for Repositories
 
Schema.org - Extending Benefits
Schema.org - Extending BenefitsSchema.org - Extending Benefits
Schema.org - Extending Benefits
 
Extending Schema.org
Extending Schema.orgExtending Schema.org
Extending Schema.org
 
Linked Data in Libraries
Linked Data in LibrariesLinked Data in Libraries
Linked Data in Libraries
 
Telling the World and Our Users What We Have
Telling the World and Our Users What We HaveTelling the World and Our Users What We Have
Telling the World and Our Users What We Have
 
Schema.org - An Extending Influence
Schema.org - An Extending InfluenceSchema.org - An Extending Influence
Schema.org - An Extending Influence
 
Linked Open Data
Linked Open DataLinked Open Data
Linked Open Data
 
Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data Maximising (Re)Usability of Library metadata using Linked Data
Maximising (Re)Usability of Library metadata using Linked Data
 
Schema.org: What It Means For You and Your Library
Schema.org: What It Means For You and Your LibrarySchema.org: What It Means For You and Your Library
Schema.org: What It Means For You and Your Library
 
Linked Data, Library Users, and the Discovery Tools of the Future
Linked Data, Library Users, and the Discovery Tools of the FutureLinked Data, Library Users, and the Discovery Tools of the Future
Linked Data, Library Users, and the Discovery Tools of the Future
 
Linked Data and OCLC
Linked Data and OCLCLinked Data and OCLC
Linked Data and OCLC
 
WorldCat, Works, and Schema.org
WorldCat, Works, and Schema.orgWorldCat, Works, and Schema.org
WorldCat, Works, and Schema.org
 
LD4L OCLC Data Strategy
LD4L OCLC Data StrategyLD4L OCLC Data Strategy
LD4L OCLC Data Strategy
 
Usage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application ScenariosUsage of Linked Data: Introduction and Application Scenarios
Usage of Linked Data: Introduction and Application Scenarios
 
Brief Introduction to Linked Data
Brief Introduction to Linked DataBrief Introduction to Linked Data
Brief Introduction to Linked Data
 
Contextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of EntitiesContextual Computing - Knowledge Graphs & Web of Entities
Contextual Computing - Knowledge Graphs & Web of Entities
 
Gonzalez-8-jun15
Gonzalez-8-jun15Gonzalez-8-jun15
Gonzalez-8-jun15
 

Viewers also liked

Enhancing a library OPAC with linked data
Enhancing a library OPAC with linked dataEnhancing a library OPAC with linked data
Enhancing a library OPAC with linked dataMichael Cummings
 
Linked Open Data Approaches within the ARIADNE Project
Linked Open Data Approaches within the ARIADNE ProjectLinked Open Data Approaches within the ARIADNE Project
Linked Open Data Approaches within the ARIADNE Projectariadnenetwork
 
Measuring impact: Impact Factor, h-index, and altmetrics
Measuring impact: Impact Factor, h-index, and altmetricsMeasuring impact: Impact Factor, h-index, and altmetrics
Measuring impact: Impact Factor, h-index, and altmetricsChealsye Bowley
 
Looking at Libraries, collections & technology
Looking at Libraries, collections & technologyLooking at Libraries, collections & technology
Looking at Libraries, collections & technologylisld
 
Linked data in libraries: another fad or paradigm shift?
Linked data in libraries: another fad or paradigm shift?Linked data in libraries: another fad or paradigm shift?
Linked data in libraries: another fad or paradigm shift?Amber Billey
 
Methodologies to evaluate the responsiveness of library systems
Methodologies to evaluate the responsiveness of library systemsMethodologies to evaluate the responsiveness of library systems
Methodologies to evaluate the responsiveness of library systemsJoe Matthews
 
Primo Usability: What Texas Tech Discovered When Implementing Primo
Primo Usability: What Texas Tech Discovered When Implementing PrimoPrimo Usability: What Texas Tech Discovered When Implementing Primo
Primo Usability: What Texas Tech Discovered When Implementing PrimoLynne Edgar
 
Curating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research LibrariesCurating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research LibrariesKeith Webster
 
Makerspaces: a great opportunity to enhance academic libraries, Stellenbosch...
Makerspaces:  a great opportunity to enhance academic libraries, Stellenbosch...Makerspaces:  a great opportunity to enhance academic libraries, Stellenbosch...
Makerspaces: a great opportunity to enhance academic libraries, Stellenbosch...Fers
 
Beyond MARC: BIBFRAME and the Future of Bibliographic Data
Beyond MARC: BIBFRAME and the Future of Bibliographic DataBeyond MARC: BIBFRAME and the Future of Bibliographic Data
Beyond MARC: BIBFRAME and the Future of Bibliographic DataEmily Nimsakont
 
Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...
Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...
Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...Michael Levine-Clark
 

Viewers also liked (12)

Enhancing a library OPAC with linked data
Enhancing a library OPAC with linked dataEnhancing a library OPAC with linked data
Enhancing a library OPAC with linked data
 
Linked Open Data Approaches within the ARIADNE Project
Linked Open Data Approaches within the ARIADNE ProjectLinked Open Data Approaches within the ARIADNE Project
Linked Open Data Approaches within the ARIADNE Project
 
Measuring impact: Impact Factor, h-index, and altmetrics
Measuring impact: Impact Factor, h-index, and altmetricsMeasuring impact: Impact Factor, h-index, and altmetrics
Measuring impact: Impact Factor, h-index, and altmetrics
 
Looking at Libraries, collections & technology
Looking at Libraries, collections & technologyLooking at Libraries, collections & technology
Looking at Libraries, collections & technology
 
Linked data in libraries: another fad or paradigm shift?
Linked data in libraries: another fad or paradigm shift?Linked data in libraries: another fad or paradigm shift?
Linked data in libraries: another fad or paradigm shift?
 
Methodologies to evaluate the responsiveness of library systems
Methodologies to evaluate the responsiveness of library systemsMethodologies to evaluate the responsiveness of library systems
Methodologies to evaluate the responsiveness of library systems
 
LibQUAL+ Survey Introduction
LibQUAL+ Survey IntroductionLibQUAL+ Survey Introduction
LibQUAL+ Survey Introduction
 
Primo Usability: What Texas Tech Discovered When Implementing Primo
Primo Usability: What Texas Tech Discovered When Implementing PrimoPrimo Usability: What Texas Tech Discovered When Implementing Primo
Primo Usability: What Texas Tech Discovered When Implementing Primo
 
Curating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research LibrariesCurating the Scholarly Record: Data Management and Research Libraries
Curating the Scholarly Record: Data Management and Research Libraries
 
Makerspaces: a great opportunity to enhance academic libraries, Stellenbosch...
Makerspaces:  a great opportunity to enhance academic libraries, Stellenbosch...Makerspaces:  a great opportunity to enhance academic libraries, Stellenbosch...
Makerspaces: a great opportunity to enhance academic libraries, Stellenbosch...
 
Beyond MARC: BIBFRAME and the Future of Bibliographic Data
Beyond MARC: BIBFRAME and the Future of Bibliographic DataBeyond MARC: BIBFRAME and the Future of Bibliographic Data
Beyond MARC: BIBFRAME and the Future of Bibliographic Data
 
Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...
Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...
Levine-Clark, Michael, Jane Burke, and Henning Schönenberger, “Assessing the ...
 

Similar to Linked Data Snowball, or Why We Need Reconciliation

Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppttpoelzer
 
Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppttpoelzer
 
Please follow these steps1. Choose a topic from the subject lis.docx
Please follow these steps1. Choose a topic from the subject lis.docxPlease follow these steps1. Choose a topic from the subject lis.docx
Please follow these steps1. Choose a topic from the subject lis.docxmattjtoni51554
 
Digital Literacy -- What Students Should Know
Digital Literacy -- What Students Should KnowDigital Literacy -- What Students Should Know
Digital Literacy -- What Students Should Knowkkraemer25
 
2018 GIS in Development: Semantic Web
2018 GIS in Development: Semantic Web2018 GIS in Development: Semantic Web
2018 GIS in Development: Semantic WebGIS in the Rockies
 
Assessing websites
Assessing websitesAssessing websites
Assessing websitesIngelesa
 
Social Work Masters Literature Review: Practical Searching
Social Work Masters Literature Review: Practical SearchingSocial Work Masters Literature Review: Practical Searching
Social Work Masters Literature Review: Practical SearchingElizabeth Moll-Willard
 
Website Evaluation
Website EvaluationWebsite Evaluation
Website Evaluationknightama
 
Class 1-become-an-online-sleuth
Class 1-become-an-online-sleuthClass 1-become-an-online-sleuth
Class 1-become-an-online-sleuthWheeler School
 
Ib ee presentation
Ib ee presentationIb ee presentation
Ib ee presentationDioLibrary
 
Digital portfolio 1_v2
Digital portfolio 1_v2Digital portfolio 1_v2
Digital portfolio 1_v2mustafaalinike
 
Ordering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect dataOrdering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect dataAndy Stretton
 

Similar to Linked Data Snowball, or Why We Need Reconciliation (20)

Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppt
 
Digifoot 2012 ppt
Digifoot 2012 pptDigifoot 2012 ppt
Digifoot 2012 ppt
 
Website evaluation
Website evaluationWebsite evaluation
Website evaluation
 
Evaluating websites
Evaluating websitesEvaluating websites
Evaluating websites
 
howtoresearch colai
howtoresearch colaihowtoresearch colai
howtoresearch colai
 
Evaluation Websites
Evaluation WebsitesEvaluation Websites
Evaluation Websites
 
Evaluation Websites
Evaluation WebsitesEvaluation Websites
Evaluation Websites
 
Evaluation Websites
Evaluation WebsitesEvaluation Websites
Evaluation Websites
 
Please follow these steps1. Choose a topic from the subject lis.docx
Please follow these steps1. Choose a topic from the subject lis.docxPlease follow these steps1. Choose a topic from the subject lis.docx
Please follow these steps1. Choose a topic from the subject lis.docx
 
Digital Literacy -- What Students Should Know
Digital Literacy -- What Students Should KnowDigital Literacy -- What Students Should Know
Digital Literacy -- What Students Should Know
 
Engl1101 Spring 2013-- Nagel
Engl1101 Spring 2013-- NagelEngl1101 Spring 2013-- Nagel
Engl1101 Spring 2013-- Nagel
 
2018 GIS in Development: Semantic Web
2018 GIS in Development: Semantic Web2018 GIS in Development: Semantic Web
2018 GIS in Development: Semantic Web
 
Assessing websites
Assessing websitesAssessing websites
Assessing websites
 
13 11-07 charleston bill-m
13 11-07 charleston bill-m13 11-07 charleston bill-m
13 11-07 charleston bill-m
 
Social Work Masters Literature Review: Practical Searching
Social Work Masters Literature Review: Practical SearchingSocial Work Masters Literature Review: Practical Searching
Social Work Masters Literature Review: Practical Searching
 
Website Evaluation
Website EvaluationWebsite Evaluation
Website Evaluation
 
Class 1-become-an-online-sleuth
Class 1-become-an-online-sleuthClass 1-become-an-online-sleuth
Class 1-become-an-online-sleuth
 
Ib ee presentation
Ib ee presentationIb ee presentation
Ib ee presentation
 
Digital portfolio 1_v2
Digital portfolio 1_v2Digital portfolio 1_v2
Digital portfolio 1_v2
 
Ordering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect dataOrdering the chaos: Creating websites with imperfect data
Ordering the chaos: Creating websites with imperfect data
 

More from Robert Sanderson

LUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleLUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleRobert Sanderson
 
Zoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataZoom as a Paradigm for Linked Open Usable Data
Zoom as a Paradigm for Linked Open Usable DataRobert Sanderson
 
Provenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtProvenance and Uncertainty in Linked Art
Provenance and Uncertainty in Linked ArtRobert Sanderson
 
Data is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityData is our Product: Thoughts on LOD Sustainability
Data is our Product: Thoughts on LOD SustainabilityRobert Sanderson
 
A Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityA Perspective on Wikidata: Ecosystems, Trust, and Usability
A Perspective on Wikidata: Ecosystems, Trust, and UsabilityRobert Sanderson
 
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataLinked Art: Sustainable Cultural Knowledge through Linked Open Usable Data
Linked Art: Sustainable Cultural Knowledge through Linked Open Usable DataRobert Sanderson
 
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataIllusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open Data
Illusions of Grandeur: Trust and Belief in Cultural Heritage Linked Open DataRobert Sanderson
 
Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Structural Metadata in RDF (IS575)
Structural Metadata in RDF (IS575)Robert Sanderson
 
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemSanderson CNI 2020 Keynote - Cultural Heritage Research Data Ecosystem
Sanderson CNI 2020 Keynote - Cultural Heritage Research Data EcosystemRobert Sanderson
 
Tiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingTiers of Abstraction and Audience in Cultural Heritage Data Modeling
Tiers of Abstraction and Audience in Cultural Heritage Data ModelingRobert Sanderson
 
The Importance of being LOUD
The Importance of being LOUDThe Importance of being LOUD
The Importance of being LOUDRobert Sanderson
 
Introduction to Linked Art Model
Introduction to Linked Art ModelIntroduction to Linked Art Model
Introduction to Linked Art ModelRobert Sanderson
 
Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Standards and Communities: Connected People, Consistent Data, Usable Applicat...
Standards and Communities: Connected People, Consistent Data, Usable Applicat...Robert Sanderson
 
Strong Opinions, Weakly Held
Strong Opinions, Weakly HeldStrong Opinions, Weakly Held
Strong Opinions, Weakly HeldRobert Sanderson
 
IIIF Discovery Walkthrough
IIIF Discovery WalkthroughIIIF Discovery Walkthrough
IIIF Discovery WalkthroughRobert Sanderson
 
Linked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMLinked Art: An Art Museum Profile for CIDOC-CRM
Linked Art: An Art Museum Profile for CIDOC-CRMRobert Sanderson
 
Euromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeEuromed2018 Keynote: Usability over Completeness, Community over Committee
Euromed2018 Keynote: Usability over Completeness, Community over CommitteeRobert Sanderson
 
Linked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelLinked Art - Our Linked Open Usable Data Model
Linked Art - Our Linked Open Usable Data ModelRobert Sanderson
 
EuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDEuropeanaTech Keynote: Shout it out LOUD
EuropeanaTech Keynote: Shout it out LOUDRobert Sanderson
 

More from Robert Sanderson (20)

Understanding Linked Art
Understanding Linked ArtUnderstanding Linked Art
Understanding Linked Art
 
LUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at YaleLUX - Cross Collections Cultural Heritage at Yale
LUX - Cross Collections Cultural Heritage at Yale
 
Zoom as a Paradigm for Linked Open Usable Data
Linked Data Snowball, or Why We Need Reconciliation

  • 1. STANFORD UNIVERSITY LIBRARIES The Linked Data Snowball or Why We Need Reconciliation April 4th, 2016 THE AAC / GETTY WORKSHOP ON RECONCILIATION OF LINKED OPEN DATA Rob Sanderson / azaroth@stanford.edu / @azaroth42
  • 3. STANFORD UNIVERSITY LIBRARIES The Linked Data Snowball or Why We Need Reconciliation April 4th, 2016 THE AAC / GETTY WORKSHOP ON RECONCILIATION OF LINKED OPEN DATA Rob Sanderson / azaroth@stanford.edu / @azaroth42 web.stanford.edu/~azaroth/#me azaroth42@gmail.com / +azaroth42 orcid: 0000-0003-4441-6852 http://www.informatik.uni-trier.de/~ley/pers/hd/s/Sanderson:Robert http://academic.research.microsoft.com/Author/2765999 http://www.scopus.com/authid/detail.url?authorId=8988953600 www.researchgate.net/profile/Rob_Sanderson facebook.com/rob.sanderson / linkedin.com/pub/robert-sanderson/1/172/5a6/ rsanderson@lanl.gov / azaroth@liv.ac.uk public.lanl.gov/rsanderson / gondolin.hist.liv.ac.uk/~azaroth rds23@student.canterbury.ac.nz / azaroth@es-net.co.nz
  • 4. Linked Data? 1.  Use URIs as names for things 2.  Use HTTP URIs so that people can look up those names 3.  When someone looks up a URI, provide useful information, using the standards 4.  Include links to other URIs, so they can discover more things
  • 5. Linked Data? 1.  Use URIs as names for things 2.  Use HTTP URIs so that people can look up those names 3.  When someone looks up a URI, provide useful information, using the standards 4.  Include links to other URIs, so they can discover more things 5.  Link your data to other people's data to provide context
  • 13. Why So Many? (Flowchart: a "No" at any question leads to the "URI" box, i.e. a new URI; "Yes" throughout leads to reuse.)
      • Do I know the URI, or can I find it?
      • Understand and agree with the model used?
      • Understand and agree with the description?
      • Agree the URI identifies the same entity?
      • Agree description is complete?
      • Hooray, you reused a URI! Now start again with the next entity :(
  • 14. Many Special and Unique Snowflakes
  • 15. Become a Huge Snowball of Technical Debt
  • 16. Option 1: Balance the Equation
      Cost(Create URI) + Cost(Maintain URI)
        <= Cost(Find Good URI) + Cost(Understand Model) + Cost(Understand Content)
           + min(Risk(Reliability) + Cost(Network Latency), Risk(Out of Date) + Cost(Cache Content))
           - Value(Connected Graph)
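The inequality on this slide can be read as a decision rule: a new URI gets minted whenever creating and maintaining it costs no more than the net cost of reusing someone else's. The following is a minimal sketch of that reading, not something from the talk itself. The class name `ReuseCosts`, the function `minting_is_cheaper`, and all the numbers are hypothetical, in arbitrary units chosen only to make the arithmetic visible.

```python
from dataclasses import dataclass


@dataclass
class ReuseCosts:
    """Right-hand side terms of the slide's inequality (arbitrary units)."""
    find_good_uri: float
    understand_model: float
    understand_content: float
    risk_reliability: float
    cost_network_latency: float
    risk_out_of_date: float
    cost_cache_content: float
    value_connected_graph: float

    def total(self) -> float:
        # Either dereference the remote URI live (reliability risk + latency)
        # or keep a local copy (staleness risk + caching cost); the min()
        # assumes the cheaper of the two strategies is chosen.
        access = min(
            self.risk_reliability + self.cost_network_latency,
            self.risk_out_of_date + self.cost_cache_content,
        )
        return (
            self.find_good_uri
            + self.understand_model
            + self.understand_content
            + access
            - self.value_connected_graph
        )


def minting_is_cheaper(create: float, maintain: float, reuse: ReuseCosts) -> bool:
    """True when Cost(Create URI) + Cost(Maintain URI) <= the net cost of reuse."""
    return create + maintain <= reuse.total()


# Hypothetical figures: reuse nets out at 5 + 4 + 3 + min(3, 4) - 6 = 9.
example = ReuseCosts(5, 4, 3, 2, 1, 1, 3, 6)
print(minting_is_cheaper(4, 3, example))  # 7 <= 9
print(minting_is_cheaper(6, 5, example))  # 11 <= 9
```

Under this reading, anything that lowers the discovery and understanding terms or raises the value of the connected graph tips the balance towards reuse.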
  • 18. Option 1 Likelihood Botticelli: http://vocab.getty.edu/ulan/500015254
  • 19. Option 1 Likelihood Botticelli: http://vocab.getty.edu/ulan/500015254 :)
  • 20. Option 1 Likelihood Botticelli: http://vocab.getty.edu/ulan/500015254 :(
  • 21. Option 2: Reconciliation YCBA's URIs Princeton's URIs
  • 23. Option 2: Reconciliation 1. Algorithmically discover this intersection given the descriptions of the entities
  • 24. Option 2: Reconciliation 2. Assert that the entity which two URIs identify is actually the same entity
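The two steps above (discover the overlap from the descriptions, then assert sameness) can be sketched roughly as follows. This is a deliberately naive illustration, not the method used by any of the institutions named in the talk. The `example.org` URIs, the `label` and `born` fields, and the function names are all assumptions, and the records are made up for the example. Real reconciliation would use richer descriptions and fuzzier matching.

```python
import unicodedata


def normalize(label: str) -> str:
    """Fold case, accents, punctuation and word order out of a name label."""
    text = unicodedata.normalize("NFKD", label)
    text = "".join(c for c in text if not unicodedata.combining(c))
    tokens = "".join(c if c.isalnum() else " " for c in text.lower()).split()
    # Sorting tokens makes "Botticelli, Sandro" equal to "Sandro Botticelli".
    return " ".join(sorted(tokens))


def reconcile(ours: dict, theirs: dict) -> list:
    """Step 1: find the intersection by comparing descriptions.
    Step 2: assert sameness as (subject, predicate, object) triples."""
    index = {}
    for uri, desc in theirs.items():
        key = (normalize(desc["label"]), desc["born"])
        index.setdefault(key, []).append(uri)

    links = []
    for uri, desc in ours.items():
        candidates = index.get((normalize(desc["label"]), desc["born"]), [])
        # Only assert when exactly one candidate matches; ambiguous cases
        # are left for a human or a better algorithm.
        if len(candidates) == 1:
            links.append((uri, "owl:sameAs", candidates[0]))
    return links


# Illustrative records only; the URIs are placeholders.
ours = {
    "https://example.org/a/person/1": {"label": "Botticelli, Sandro", "born": 1445},
    "https://example.org/a/person/2": {"label": "J. M. W. Turner", "born": 1775},
}
theirs = {
    "https://example.org/b/agent/77": {"label": "Sandro Botticelli", "born": 1445},
    "https://example.org/b/agent/78": {"label": "John Constable", "born": 1776},
}
links = reconcile(ours, theirs)
print(links)
```

Requiring an exact key match and a single candidate favours precision over recall, which suits a first pass.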
  • 28. Benefits of Reconciliation
      End User:
      • Has access to more information, more easily, improving research, discovery and navigation
      • Potential for new UIs, new research questions, reasoning
      Institution:
      • Efficiency (= reduced cost) and improved quality of description
      • Increased prestige when descriptions are reused
      • Usage across the network is valuable business intelligence
      Community:
      • Network effects spread faster and further, increasing awareness of cultural heritage
      • Gives easier access to other communities' data
  • 29. Real Benefit of Reconciliation
      Reconciliation is a network damage-limiting step towards balancing Equation 1.
      By linking entity descriptions together:
      • the cost of discovery and understanding is reduced
      • the costs of creating and maintaining the resources are shared across the community, not duplicated
      • the value of the connected graph is increased
      • the likelihood of new entities (requiring reconciliation) is reduced
  • 30. But How Can A Machine Know??
      Algorithms won't be perfect, but can be good enough.
      • What use cases will the reconciled data be used to fulfill?
      • What is the cost of a false positive for those use cases?
      Precision: What % of matches are correct?
      Recall: What % of the possible matches were found?
      Can make trade-offs of precision vs recall for different use cases.
      Machine can record its certainty, and policy can provide a threshold.
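To make the precision/recall trade-off concrete, here is a small sketch in which a policy threshold is applied to machine-recorded certainty scores. The scores, identifiers, and ground-truth list are invented for illustration, and the function names `precision_recall` and `accept` are hypothetical.

```python
def precision_recall(proposed, truth):
    """Precision: share of proposed matches that are correct.
    Recall: share of the true matches that were found."""
    proposed, truth = set(proposed), set(truth)
    correct = proposed & truth
    precision = len(correct) / len(proposed) if proposed else 1.0
    recall = len(correct) / len(truth) if truth else 1.0
    return precision, recall


def accept(scored_matches, threshold):
    """Keep only the matches whose recorded certainty meets the policy threshold."""
    return [pair for pair, certainty in scored_matches if certainty >= threshold]


# Invented candidate matches with the machine's certainty for each.
scored = [
    (("a1", "b1"), 0.95),
    (("a2", "b2"), 0.80),
    (("a3", "b9"), 0.60),  # a false positive
    (("a4", "b4"), 0.55),
]
truth = [("a1", "b1"), ("a2", "b2"), ("a4", "b4")]

strict = precision_recall(accept(scored, 0.75), truth)
lenient = precision_recall(accept(scored, 0.50), truth)
print(strict)   # high precision, lower recall
print(lenient)  # lower precision, full recall
```

A use case where a false positive is expensive would pick the stricter threshold; one that values coverage would pick the lenient one.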
  • 31. How Can We Improve It?
      Several different relationships to express similarity:
      • owl:sameAs – always exactly the same (transitive)
      • skos:exactMatch – the same for most purposes (transitive)
      • skos:closeMatch – the same for some purposes (intransitive)
      The context of a resource in the network is important
      • Starting simple with high precision gives a better context to use the results to iteratively and incrementally bootstrap
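The practical consequence of transitivity is that a chain of owl:sameAs or skos:exactMatch links collapses into one set of equivalent URIs, while skos:closeMatch links must not be chained. The sketch below shows one way to compute those sets with a union-find structure. The `ex:` identifiers and the function name `equivalence_sets` are placeholders, not part of any vocabulary or library.

```python
# Predicates that may be chained into a single equivalence set.
TRANSITIVE = {"owl:sameAs", "skos:exactMatch"}


def equivalence_sets(triples, predicate):
    """Group URIs connected by a transitive matching predicate."""
    if predicate not in TRANSITIVE:
        raise ValueError(f"{predicate} is not transitive; do not chain it")

    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for s, p, o in triples:
        if p == predicate:
            parent[find(s)] = find(o)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


triples = [
    ("ex:A", "owl:sameAs", "ex:B"),
    ("ex:B", "owl:sameAs", "ex:C"),
    ("ex:C", "skos:closeMatch", "ex:D"),  # not followed when chaining sameAs
]
same = equivalence_sets(triples, "owl:sameAs")
print(same)
```

Here `ex:A` and `ex:C` end up in the same set without ever being linked directly, while `ex:D` stays out because its only link is a close match.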
  • 32. Trust and Community
      "Efficiency (= reduced cost) and improved quality of description"
      • Efficiency comes from not duplicating descriptive effort...
      • Which requires trusting other institutions in the community
      • We need to work together, not...
  • 34. Entities to Reconcile
      As a community, we need to pick where to start.
      Suggest starting with least controversial / most unique:
      • Physical objects
      • People
      • Places
      • Events (specific, like Exhibitions)
      A small sub-domain (by time?) to make overlap more likely
  • 35. Q. Can I Reconcile a String?
      Named Entity Recognition: strings to things ("snowflake" = a thing)
      Reconciliation: things to things (a thing = a thing)
  • 36. The Hard Question How can we be more useful than DBpedia for our own entities?
  • 37. The Hard Question How can we be more useful than DBpedia for our own entities?
      • Focus on unique selling points
      • Demonstrate value early, both internally and to the broader community
      • By working together to increase the value of the network
  • 38. STANFORD UNIVERSITY LIBRARIES Thank You! April 4th, 2016 Rob Sanderson / azaroth@stanford.edu / @azaroth42 web.stanford.edu/~azaroth/#me azaroth42@gmail.com / +azaroth42 orcid: 0000-0003-4441-6852 http://www.informatik.uni-trier.de/~ley/pers/hd/s/Sanderson:Robert http://academic.research.microsoft.com/Author/2765999 http://www.scopus.com/authid/detail.url?authorId=8988953600 www.researchgate.net/profile/Rob_Sanderson facebook.com/rob.sanderson / linkedin.com/pub/robert-sanderson/1/172/5a6/ rsanderson@lanl.gov / azaroth@liv.ac.uk public.lanl.gov/rsanderson / gondolin.hist.liv.ac.uk/~azaroth rds23@student.canterbury.ac.nz / azaroth@es-net.co.nz
  • 40. STANFORD UNIVERSITY LIBRARIES Thank You! April 4th, 2016 azaroth@stanford.edu
  • 43. Thank You! rsanderson@getty.edu Based on my slides from the Andrew W. Mellon Foundation Reconciliation Workshop. With recognition and thanks to all of the participants.