Transforming Access to Culture
& History with Connected Data
The case of Europeana
Netherlands, Public Domain
1660 - 1625, Rijksmuseum
Anonymous
Arrival of a Portuguese ship
Netherlands, Public Domain
1615, Rijksmuseum
Anonymous
Elegant Party on a Terrace of a Venetian-inspired Setting
Who we are
Europeana: Transforming the World with Culture
Europeana: Cultural Heritage Metadata
and Content from across Europe
• We aggregate data from Cultural Heritage organisations across
Europe
• predominantly, but not only, EU member states
• In most cases we only harvest metadata
• increasingly, however, we also host content alongside the
metadata
• We make it available through our portal site:
https://www.europeana.eu/portal/en
• ultimately linking back to the originating institution
Transforming Access to Culture & History with Connected Data
CC BY-SA
Europeana in numbers
• 53 million+ items
• 30+ languages
• 4500+ GLAM (Galleries, Libraries, Archives, and Museums)
institutions
Europeana as ‘Big Data’
• Volume: relatively low, by Big Data standards (< 2TB metadata)
• Velocity: continuous updating, flushed to datastore every 15 minutes
• Veracity: significant issues of data quality
• Variety: immense
• multiple languages
• multiple formats
• different institutions
• etc … extremely heterogeneous
Analysed as the four ‘V’s …
Norway, CC BY-SA
1921, Oslo Museum
Ernest Rude
Ernest Marini - dancer in a costume
Who they are
Users: what they want and what they do
Who are they?
• "Culture Vultures"
• Academic researchers
• Teachers and students
• Visual artists
• Graphic designers
• Amateurs (in the original sense of the word)
• "Culture snackers"
• casual browsers looking for entertainment
What are they looking for?
• Query pattern is extremely flat
• analysis of logs shows no search term shared by > 6 users
• further analysis needed here
• “serendipity search” is important: users are trying to surprise
themselves
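One way to check how flat the query pattern is: count, for each search term, how many distinct users issued it. The sketch below is a minimal illustration only. The `(user_id, query)` log format, the sample rows and the function name are assumptions, not the actual log schema.

```python
from collections import defaultdict

def users_per_term(log):
    """Map each normalised search term to the number of distinct users who issued it."""
    users = defaultdict(set)
    for user_id, query in log:
        # Light normalisation so trivial variants count as the same term.
        users[query.strip().lower()].add(user_id)
    return {term: len(ids) for term, ids in users.items()}

# Hypothetical sample log: (user_id, query) pairs.
sample_log = [
    ("u1", "Rembrandt"),
    ("u2", "rembrandt "),
    ("u2", "rembrandt"),        # a repeat by the same user counts once
    ("u3", "Oslo 1921"),
    ("u4", "cycles nautiques"),
]

counts = users_per_term(sample_log)
# Largest number of users sharing any one term
# (the deck reports no term shared by more than 6 users).
max_shared = max(counts.values())
```

A histogram of `counts.values()` over the full log would show the flatness directly: almost all of the mass sits at 1.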
It seems literally impossible to say ….
What are they like?
• Culture vultures
• engagement is extremely high
• mean rank of clicked items: 82 (!)
• session length once an item is clicked in the SERP can stretch into
hours
• Culture snackers
• bounce rate difficult to estimate, but high (> 85%)
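The two engagement figures above can be computed from session records roughly as follows. This is a sketch under stated assumptions: the session structure (`page_views`, `clicked_ranks`) is hypothetical, ranks are taken to be 1-based positions in the SERP, and a "bounce" is assumed to mean a session with a single page view.

```python
from statistics import mean

def mean_click_rank(sessions):
    """Average 1-based SERP rank over every clicked result in all sessions."""
    ranks = [rank for s in sessions for rank in s["clicked_ranks"]]
    return mean(ranks) if ranks else None

def bounce_rate(sessions):
    """Share of sessions that consist of a single page view (assumed definition)."""
    if not sessions:
        return None
    bounces = sum(1 for s in sessions if s["page_views"] == 1)
    return bounces / len(sessions)

# Hypothetical sessions: three "snackers" and one "vulture".
sessions = [
    {"page_views": 1, "clicked_ranks": []},
    {"page_views": 1, "clicked_ranks": []},
    {"page_views": 1, "clicked_ranks": []},
    {"page_views": 14, "clicked_ranks": [3, 120, 96]},
]

avg_rank = mean_click_rank(sessions)
rate = bounce_rate(sessions)
```

Because sessions that bounce contribute no clicks, the two user groups barely overlap in these metrics, which is one reason to report them separately.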
User engagement
What are they doing?
• school reports
• university essays
• presentations
• exhibitions
• research papers
• new artworks
Making new stuff!
United Kingdom, CC BY
The Wellcome Library
Luigi Garzi
The birth of Adonis and
the transformation of Myrrha
Where we’ve been, where
we’re going
Visions for cultural heritage and connected data, past
and present
Original Vision: as Linked Open Data
provider
Linking Open Data cloud diagram 2011, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
The original vision, today
• Ontological modelling
• Europeana Data Model (EDM)
• expressed in RDF for data-model mediation
• internationally shared (DPLA, BBC, etc.)
• Served on our SPARQL endpoint
• … but more frequently as JSON-LD over our APIs
• plug plug: received API World award this year for best Data API
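As a rough illustration of what EDM expressed as JSON-LD looks like to a client, the sketch below expands prefixed names against a `@context`. The record is invented for illustration: the title and creator are borrowed from a caption in this deck, the `example.org` identifier is a placeholder, and real API responses are considerably richer. The `expand` helper is a simplification of JSON-LD expansion, not a full implementation.

```python
import json

# Hypothetical, heavily simplified record; not an actual API response.
record_json = """
{
  "@context": {
    "edm": "http://www.europeana.eu/schemas/edm/",
    "dc": "http://purl.org/dc/elements/1.1/"
  },
  "@id": "http://example.org/item/1",
  "@type": "edm:ProvidedCHO",
  "dc:title": "Arrival of a Portuguese ship",
  "dc:creator": "Anonymous",
  "edm:type": "IMAGE"
}
"""

def expand(term, context):
    """Expand a compact IRI such as 'dc:title' using the prefixes in @context."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term  # keywords (@id, @type) and unknown prefixes pass through unchanged

record = json.loads(record_json)
context = record["@context"]

# Expand every property name, plus the value of @type, which is itself a compact IRI.
expanded = {
    expand(key, context): expand(value, context) if key == "@type" else value
    for key, value in record.items()
    if key != "@context"
}
```

The same expanded IRIs are what a SPARQL query against the RDF representation would use, which is why the two access routes can share one data model.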
Continued contributions
LOD: New Directions
• “Entity-fication”
• 70%-80% of our searches are for named entities
• People
• Places
• Concepts (subject headings)
• Information on these can be harvested from:
• DBpedia
• Wikidata
• GeoNames
• …
Structuring content through LOD harvesting (i)
LOD: New Directions
• “Workification” (FRBR data model)
• creating abstract artistic or intellectual entities from numerous
instantiations
• for example, the novel “Oliver Twist” from its many printed editions
and translations
• Harvested (or at least seeded) from OCLC and VIAF
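A toy version of "workification": cluster edition-level records under an abstract work using a normalised author and title key. This is a sketch only. The record fields (`author`, `uniform_title`) are hypothetical, and it assumes translations already carry a shared uniform title, which in practice is the kind of information that would be seeded from an external authority.

```python
from collections import defaultdict
import unicodedata

def work_key(author, title):
    """Crude clustering key: case-folded, accent-stripped, punctuation-free author and title."""
    def norm(text):
        decomposed = unicodedata.normalize("NFKD", text)
        kept = "".join(c for c in decomposed if c.isalnum() or c.isspace())
        return " ".join(kept.casefold().split())
    return (norm(author), norm(title))

def cluster_into_works(editions):
    """Group edition ids under the work key they share."""
    works = defaultdict(list)
    for ed in editions:
        works[work_key(ed["author"], ed["uniform_title"])].append(ed["id"])
    return dict(works)

# Hypothetical edition records.
editions = [
    {"id": "e1", "author": "Dickens, Charles", "uniform_title": "Oliver Twist", "lang": "en"},
    {"id": "e2", "author": "DICKENS, Charles", "uniform_title": "Oliver Twist.", "lang": "fr"},
    {"id": "e3", "author": "Dickens, Charles", "uniform_title": "Bleak House", "lang": "en"},
]

works = cluster_into_works(editions)
```

String keys like this are brittle. Matching on shared authority identifiers, where they exist, is the more robust route.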
Structuring content through LOD harvesting (ii)
LOD: New Directions
• Knowledge Graphs linking …
• authors to works
• artists to their paintings, and other artists
• concepts to concepts
• …
• Obvious applications
• educational
• research
• “serendipity”
• improved “snacker” engagement
Structuring content through LOD harvesting (iii)
Case Study
Linking Rembrandt to Jahangir
• https://www.thetimes.co.uk/article/from-rhinos-to-rembrandt-how-india-inspired-the-world-hdsr8kls5
“Self-portrait” (Rembrandt van Rijn), “The Great-Mughal Jahangir” (Rembrandt van Rijn),
and “Prince Salim, the future Jahangir, Enthroned” (Anonymous), all in the public domain.
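The kind of link this case study relies on can be sketched as a path search over a small knowledge graph. The graph below is a toy: the node names come from the caption above, while the relation names (`created`, `depicts`) and the edges themselves are assumptions made for illustration, not actual Europeana data.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relation, neighbour).
graph = {
    "Rembrandt van Rijn": [
        ("created", "Self-portrait"),
        ("created", "The Great-Mughal Jahangir"),
    ],
    "Self-portrait": [("depicts", "Rembrandt van Rijn")],
    "The Great-Mughal Jahangir": [("depicts", "Jahangir")],
    "Prince Salim, the future Jahangir, Enthroned": [("depicts", "Jahangir")],
}

def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest chain of (relation, node) hops, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [(relation, neighbour)]))
    return None

path = find_path(graph, "Rembrandt van Rijn", "Jahangir")
```

Surfacing short paths like this one next to a search result is one plausible way to support the "serendipity" use case.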
How we do it
Technical stack
France, Public Domain
1914, National Library of France
Agence de presse Meurisse
Concours de cycles nautiques sur le lac
d’Enghien : Berregent piloté par Austerling
The webapp stack
• Data ingestion: Java + XSLT behemoth
• Data enrichment: Java
• Source-of-truth datastore: MongoDB
• Information retrieval: Solr + Neo4j
• API: Swagger with Java
• UI: JS, variety of libraries
• SPARQL endpoint: Virtuoso
France, Public Domain
1588, Bibliothèque municipale de Lyon
Hendrik Goltzius
Le dragon dévorant les compagnons
de Cadmus
Reality check
Where we are and how fast we can go
Dirty Data (i)
• getting from strings to things is a non-trivial process
• Named Entity Recognition technology relatively unhelpful in this
domain
• exact-string matching only: precision good, but recall poor
• the data is strongly multilingual
• Limited number of tools to help with cleaning, enhancing, validating this
data
• OpenRefine potentially helpful
• ShEx, SHACL not yet fully mature
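The precision/recall trade-off of exact-string matching can be seen on a tiny example. The sketch below compares exact lookup with lookup after light normalisation (case-folding, accent and punctuation stripping). The vocabulary, identifiers and gold pairs are invented, and normalisation is shown only as one obvious mitigation, not as the approach actually used.

```python
import unicodedata

# Hypothetical mini-vocabulary: entity id -> known labels in several languages.
vocabulary = {
    "place:vienna": ["Vienna", "Wien", "Vienne"],
    "agent:rembrandt": ["Rembrandt van Rijn", "Rembrandt"],
}

def normalise(text):
    """Case-fold, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    kept = "".join(c for c in decomposed if c.isalnum() or c.isspace())
    return " ".join(kept.casefold().split())

def build_index(vocab, key=lambda s: s):
    """Map each (optionally normalised) label to its entity id."""
    return {key(label): entity for entity, labels in vocab.items() for label in labels}

def recall(index, gold, key=lambda s: s):
    """Fraction of gold (string, entity) pairs that the index resolves correctly."""
    hits = sum(1 for s, e in gold if index.get(key(s)) == e)
    return hits / len(gold)

# Hypothetical metadata strings with the entity each should resolve to.
gold = [
    ("Wien", "place:vienna"),
    ("WIEN.", "place:vienna"),
    ("rembrandt van rijn", "agent:rembrandt"),
    ("Rembrandt", "agent:rembrandt"),
]

exact_recall = recall(build_index(vocabulary), gold)
normalised_recall = recall(build_index(vocabulary, normalise), gold, normalise)
```

Normalisation raises recall here, but on real data it also merges labels that should stay distinct, so precision has to be measured alongside it.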
Source data
Dirty Data (ii)
• Irregular data models
• Large number of Wikidata, DBpedia properties applied irregularly
• “defensive querying”
• Incorrect data
• more often questions of structure than inaccurate field values
• e.g. GeoNames hierarchies
• Uncurated or aggregated data
• e.g., many variants provided by VIAF
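"Defensive querying" can be read as never assuming that a given property is present or well-formed. The sketch below tries a list of candidate properties in order and skips empty values. The fallback order and the sample entities are assumptions made for illustration. In SPARQL the equivalent would typically use `OPTIONAL` patterns with `COALESCE`.

```python
def first_value(entity, candidate_properties):
    """Return (property, value) for the first candidate that is present and non-empty."""
    for prop in candidate_properties:
        value = entity.get(prop)
        if isinstance(value, list):  # some sources wrap values in lists
            value = next((v for v in value if v), None)
        if value not in (None, ""):
            return prop, value
    return None, None

# Illustrative fallback order for a birth date; the ordering is an assumption.
BIRTH_DATE_PROPERTIES = ["wdt:P569", "dbo:birthDate", "dbp:birthDate"]

# Hypothetical harvested entities with irregular property usage.
entities = [
    {"label": "A", "wdt:P569": "1606-07-15"},
    {"label": "B", "dbo:birthDate": ["", "1660"]},
    {"label": "C", "dbp:birthDate": ""},
]

results = [first_value(e, BIRTH_DATE_PROPERTIES) for e in entities]
```

Returning the property name with the value keeps provenance, which helps when sources later need to be ranked or audited.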
Linked Data resources
Directions forward
• Manual or at least heavily-supervised curation a requirement for the
foreseeable future
• Tools to aid NER and entity-matching are the focus of two US efforts:
• Institute of Museum and Library Services (IMLS) Local Authority Files
project
• Linked Data for Libraries Reconciliation Service Group
• Work division
• devolution to partners
• crowdsourcing
Dealing with dirty data
16 November 2017
PANEL: LINKED OPEN DATA - IS IT FAILING
OR JUST GETTING OUT OF THE BLOCKS?
Tweet your questions via Direct Message to @Connected_Data or #ConnectedData
MODERATOR
James Phare
Connected Data London
PANELIST
Chris Taggart
CEO
OpenCorporates
@CountCulture
PANELIST
Chris Gutteridge
Linked Open Data
Architect
University of Southampton
PANELIST
Leigh Dodds
Data Infrastructure
Programme Lead
Open Data Institute
@ldodds
PANELIST
Sebastian Hellmann
Executive Director and
Board Member
DBpedia

Powering Question-Driven Problem Solving to Improve the Chances of Finding Ne...
 
Semantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scaleSemantic similarity for faster Knowledge Graph delivery at scale
Semantic similarity for faster Knowledge Graph delivery at scale
 
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
Knowledge Graphs and AI to Hyper-Personalise the Fashion Retail Experience at...
 
Schema, Google & The Future of the Web
Schema, Google & The Future of the WebSchema, Google & The Future of the Web
Schema, Google & The Future of the Web
 
RAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needsRAPIDS cuGraph – Accelerating all your Graph needs
RAPIDS cuGraph – Accelerating all your Graph needs
 
Elegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property GraphsElegant and Scalable Code Querying with Code Property Graphs
Elegant and Scalable Code Querying with Code Property Graphs
 
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
From Knowledge Graphs to AI-powered SEO: Using taxonomies, schemas and knowle...
 
Graph for Good: Empowering your NGO
Graph for Good: Empowering your NGOGraph for Good: Empowering your NGO
Graph for Good: Empowering your NGO
 

Recently uploaded

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 

Recently uploaded (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Tim Hill

  • 1. Transforming Access to Culture & History with Connected Data: The case of Europeana. Image: Arrival of a Portuguese ship, Anonymous, 1660 - 1625, Rijksmuseum (Netherlands, Public Domain)
  • 2. Who we are. Europeana: Transforming the World with Culture. Image: Elegant Party on a Terrace of a Venetian-inspired Setting, Anonymous, 1615, Rijksmuseum (Netherlands, Public Domain)
  • 3. Europeana: Cultural Heritage Metadata and Content from across Europe • We aggregate data from Cultural Heritage organisations across Europe • predominantly, but not only, EU member states • In most cases we only harvest metadata • increasingly, however, we are also hosting content as well as metadata • Make it available through our portal site: https://www.europeana.eu/portal/en • ultimately linking back to the originating institution Transforming Access to Culture & History with Connected Data CC BY-SA
  • 4. Europeana in numbers • 53 million+ items • 30+ languages • 4500+ GLAM (Galleries, Libraries, Archives, and Museums) institutions
  • 5. Europeana as ‘Big Data’. Analysed as the four ‘V’s … • Volume: relatively low, by Big Data standards (< 2TB metadata) • Velocity: continuous updating, flushed to datastore every 15 minutes • Veracity: significant issues of data quality • Variety: immense • multiple languages • multiple formats • different institutions • etc … extremely heterogeneous
  • 6. Who they are. Users: what they want and what they do. Image: Ernest Marini - dancer in a costume, Ernest Rude, 1921, Oslo Museum (Norway, CC BY-SA)
  • 7. Who are they? • "Culture Vultures" • Academic researchers • Teachers and students • Visual artists • Graphic designers • Amateurs (in the original sense of the word) • "Culture snackers" • casual browsers looking for entertainment
  • 8. What are they looking for? It seems literally impossible to say … • Query pattern is extremely flat • analysis of logs shows no search term shared by > 6 users • further analysis needed here • “serendipity search” is important: users are trying to surprise themselves
  • 9. What are they like? User engagement • Culture vultures • engagement is extremely high • mean rank of clicked items: 82 (!) • session length once an item is clicked in the SERP can stretch into hours • Culture snackers • bounce rate difficult to estimate, but high (> 85%)
  • 10. What are they doing? Making new stuff! • school reports • university essays • presentations • exhibitions • research papers • new artworks
  • 11. Where we’ve been, where we’re going. Visions for cultural heritage and connected data, past and present. Image: The birth of Adonis and the transformation of Myrrha, Luigi Garzi, The Wellcome Library (United Kingdom, CC BY)
  • 12. Original Vision: as Linked Open Data provider. Image: Linking Open Data cloud diagram 2011, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
  • 13. The original vision, today. Continued contributions • Ontological modelling • Europeana Data Model (EDM) • expressed in RDF for data-model mediation • internationally shared (DPLA, BBC, etc.) • Served on our SPARQL endpoint • … but more frequently as JSON-LD over our APIs • plug plug: received API World award this year for best Data API
  • 14. LOD: New Directions. Structuring content through LOD harvesting (i) • “Entity-fication” • 70%-80% of our searches are for named entities • People • Places • Concepts (subject headings) • Information on these can be harvested from: • DBpedia • Wikidata • Geonames • …
  • 15. LOD: New Directions. Structuring content through LOD harvesting (ii) • “Workification” (FRBR data model) • creating abstract artistic or intellectual entities from numerous instantiations • for example, the novel “Oliver Twist” from its many printed editions and translations • Harvested (or at least seeded) from OCLC and VIAF
  • 16. LOD: New Directions. Structuring content through LOD harvesting (iii) • Knowledge Graphs linking … • authors to works • artists to their paintings, and other artists • concepts to concepts • … • Obvious applications • educational • research • “serendipity” • improved “snacker” engagement
  • 17. Case Study. Linking Rembrandt to Jahangir • https://www.thetimes.co.uk/article/from-rhinos-to-rembrandt-how-india-inspired-the-world-hdsr8kls5 • Images: “Self-portrait” (Rembrandt van Rijn), “The Great-Mughal Jahangir” (Rembrandt van Rijn), and “Prince Salim, the future Jahangir, Enthroned” (Anonymous), all in the public domain.
  • 18. How we do it. Technical stack. Image: Concours de cycles nautiques sur le lac d’Enghien : Berregent piloté par Austerling, Agence de presse Meurisse, 1914, National Library of France (France, Public Domain)
  • 19. The webapp stack • Data ingestion: Java + XSLT behemoth • Data enrichment: Java • Source-of-truth datastore: MongoDB • Information retrieval: Solr + Neo4j • API: Swagger with Java • UI: JS, variety of libraries • SPARQL endpoint: Virtuoso
  • 20. Reality check. Where we are and how fast we can go. Image: Le dragon dévorant les compagnons de Cadmus, Hendrik Goltzius, 1588, Bibliothèque municipale de Lyon (France, Public Domain)
  • 21. Dirty Data (i). Source data • getting from things to strings is a non-trivial process • Named Entity Recognition technology relatively unhelpful in this domain • exact-string matching only: precision good, but recall poor • multilinguality strong • Limited number of tools to help with cleaning, enhancing, validating this data • OpenRefine potentially helpful • ShEx, SHACL not yet fully mature
  • 22. Dirty Data (ii). Linked Data resources • Irregular data models • Large number of Wikidata, DBpedia properties applied irregularly • “defensive querying” • Incorrect data • more often questions of structure than inaccurate field values • e.g. Geonames hierarchies • Uncurated or aggregated data • e.g., many variants provided by VIAF
  • 23. Directions forward. Dealing with dirty data • Manual or at least heavily-supervised curation a requirement for the foreseeable future • Tools to aid NER and entity-matching are the focus of two US efforts: • Institute of Museum and Library Services (IMLS) Local Authority Files project • Linked Data for Libraries Reconciliation Service Group • Work division • devolution to partners • crowdsourcing
  • 25. PANEL: LINKED OPEN DATA - IS IT FAILING OR JUST GETTING OUT OF THE BLOCKS? Tweet your questions via Direct Message to @Connected_Data or #ConnectedData MODERATOR James Phare Connected Data London PANELIST Chris Taggart CEO OpenCorporates @CountCulture PANELIST Chris Gutteridge Linked Open Data Architect University of Southampton PANELIST Leigh Dodds Data Infrastructure Programme Lead Open Data Institute @ldodds PANELIST Sebastian Hellmann Executive Director and Board Member DBpedia
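
The "Dirty Data (i)" slide notes that exact-string matching gives good precision but poor recall. The sketch below illustrates that trade-off on a tiny, invented example. The label table, the entity ids (`agent/...`), and the gold list are all hypothetical and are not Europeana data or Europeana's actual matching code; they only show why name variants and inverted forms hurt recall while exact hits remain reliable.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical authority labels (normalised label -> entity id); not Europeana data.
LABELS: Dict[str, str] = {
    "rembrandt van rijn": "agent/rembrandt",
    "jahangir": "agent/jahangir",
    "charles dickens": "agent/dickens",
}


def exact_match(mention: str) -> Optional[str]:
    """Return an entity id only when the normalised mention equals a known label."""
    return LABELS.get(mention.strip().lower())


def precision_recall(gold: List[Tuple[str, Optional[str]]]) -> Tuple[float, float]:
    """Score exact matching against gold (mention, true entity id or None) pairs."""
    predicted = [(exact_match(mention), truth) for mention, truth in gold]
    returned = [(p, t) for p, t in predicted if p is not None]
    correct = sum(1 for p, t in returned if p == t)
    relevant = sum(1 for _, t in predicted if t is not None)
    precision = correct / len(returned) if returned else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall


# Invented mentions: variants, an inverted name, and an earlier title of the same person.
GOLD: List[Tuple[str, Optional[str]]] = [
    ("Rembrandt van Rijn", "agent/rembrandt"),
    ("Rembrandt", "agent/rembrandt"),
    ("Rembrandt Harmensz. van Rijn", "agent/rembrandt"),
    ("Jahangir", "agent/jahangir"),
    ("Dickens, Charles", "agent/dickens"),
    ("Prince Salim", "agent/jahangir"),
]

precision, recall = precision_recall(GOLD)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On this invented sample the script prints `precision=1.00 recall=0.33`: every match returned is right, but four of the six mentions are missed.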