From Ambition to Go Live
The National Library Board of Singapore’s Journey
to an operational Linked Data Management
and Discovery System
Richard Wallis
Evangelist and Founder
Data Liberate
richard.wallis@dataliberate.com
@dataliberate
15th Semantic Web in Libraries Conference
SWIB23 - 12th September 2023 - Berlin
Independent Consultant, Evangelist & Founder
W3C Community Groups:
• Bibframe2Schema (Chair) – Standardised conversion path(s)
• Schema Bib Extend (Chair) - Bibliographic data
• Schema Architypes (Chair) - Archives
• Financial Industry Business Ontology – Financial schema.org
• Tourism Structured Web Data (Co-Chair)
• Schema Course Extension
• Schema IoT Community
• Educational & Occupational Credentials in Schema.org
richard.wallis@dataliberate.com — @dataliberate
40+ Years – Computing
30+ Years – Cultural Heritage technology
20+ Years – Semantic Web & Linked Data
Worked With:
• Google – Schema.org vocabulary, site, extensions, documentation and community
• OCLC – Global library cooperative
• FIBO – Financial Industry Business Ontology Group
• Various Clients – Implementing/understanding Linked Data, Schema.org:
National Library Board Singapore
British Library — Stanford University — Europeana
National Library Board Singapore
Public Libraries
• Network of 28 public libraries, including 2 partner libraries*
• Reading programmes and initiatives
• Programmes and exhibitions targeted at Singapore communities
*Partner libraries are libraries which are partner owned and funded but managed by NLB/NLB’s subsidiary Libraries and Archives Solutions Pte Ltd. Library@Chinatown and the Lifelong Learning Institute Library are partner libraries.

National Archives
• Transferred from NHB to NLB in Nov 2012
• Custodian of Singapore’s collective memory: responsible for the collection, preservation and management of Singapore’s public and private archival records
• Promotes public interest in our nation’s history and heritage

National Library
• Preserving Singapore’s print and literary heritage, and intellectual memory
• Reference collections
• Legal deposit (including electronic)
Reference Collection
• Over 560,000 Singapore & SEA items
• Over 147,000 Chinese, Malay & Tamil language items
• Over 62,000 Social Sciences & Humanities items
• Over 39,000 Science & Technology items
• Over 53,000 Arts items
• Over 19,000 Rare Materials items

Archival Materials
• Over 290,000 Government files & Parliament papers
• Over 190,000 Audiovisual & sound recordings
• Over 70,000 Maps & building plans
• Over 1.14m Photographs
• Over 35,000 Oral history interviews
• Over 55,000 Speeches & press releases
• Over 7,000 Posters
Lending Collection
• Over 5m print collection
• Over 2.4m music tracks
• 78 databases
• Over 7,400 e-newspaper and e-magazine titles
• Over 8,000 e-learning courses
• Over 1.7m e-books and audiobooks
National Library Board Online Services
The Ambition
• To enable the discovery & display of entities from different sources in a combined interface
• To bring together physical and digital resources
• To bring together diverse systems across the National Library, National Archives, and Public Libraries in a Linked Data environment
• To provide a staff interface to view and manage all entities, their descriptions and relationships
The Journey Starts (March 2021)
A Linked Data Management System (LDMS)
• Cloud-based system
• Management interface
• Source data curated at source
– ILS, Authorities, CMS, NAS
• Linked Data
• Public interface – Google friendly
– Schema.org
• Shared standards
– BIBFRAME, Schema.org
Contract Awarded
metaphactory platform
• Low-code knowledge graph platform
• Semantic knowledge modeling
• Semantic search & discovery
• AWS Partner
• Public sector partner
• Singapore based
Linked Data, structured data, Semantic Web, bibliographic metadata, Schema.org and management systems consultant
Initial Challenges
• Data – lots of it!
– Record-based formats: MARC-XML, DC-XML, CSV
• Entities described in the data – lots and lots!
• Regular updates – daily, weekly, monthly
• Automatic ingestion
• Not the single source of truth
• Manual update capability
• Entity-based search & discovery
• Web friendly – Schema.org output
Basic Data Model
• Linked Data
– BIBFRAME to capture detail of bibliographic records
– Schema.org to deliver structured data for search engines
– Schema.org representation of CMS, NAS, TTE data
– Schema.org enrichment of BIBFRAME
• Schema.org as the ‘lingua franca’ vocabulary of the Knowledge graph
– All entities described using Schema.org as a minimum.
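The "Schema.org as a minimum" idea can be sketched as a single entity carrying both vocabularies at once. A minimal JSON-LD illustration (the identifiers and title below are invented for the example, not NLB data):

```python
import json

# One entity typed with both vocabularies: "schema:Book" carries the
# search-engine-friendly description, "bf:Work" the bibliographic detail.
entity = {
    "@context": {
        "schema": "https://schema.org/",
        "bf": "http://id.loc.gov/ontologies/bibframe/",
    },
    "@id": "https://example.org/entity/work1",
    "@type": ["schema:Book", "bf:Work"],
    "schema:name": "An Example Title",
    "schema:author": {"@id": "https://example.org/entity/person1"},
}
doc = json.dumps(entity, indent=2)
print(doc)
```

Because every entity has at least the Schema.org layer, any consumer of the Knowledge graph can query one vocabulary without caring which source system the entity came from.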
Data Data Data!

Data Source | Source Records | Entity Count | Update Frequency
ILS         | 1.4m           | 56.8m        | Daily
CMS         | 82k            | 228k         | Weekly
NAS         | 1.6m           | 6.7m         | Monthly
TTE         | 3k             | 317k         | Monthly
Total       | 3.1m           | 70.4m        |
Data Ingest Pipelines – ILS
• Source data in MARC-XML files
• Step 1: marc2bibframe2 scripts
– Open source – shared by Library of Congress
– “standard” approach
– Output BIBFRAME as individual RDF-XML files
• Step 2: bibframe2schema.org script
– Open source – Bibframe2Schema.org Community Group
– SPARQL-based script
– Output Schema.org enriched individual BIBFRAME RDF files for loading into
Knowledge graph triplestore
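The shape of Step 2 – a pattern-match-and-insert over the BIBFRAME graph, which is what the SPARQL-based community script does – can be sketched in plain Python over a toy triple set. This is an illustration of the idea only, not the Bibframe2Schema.org script; the URIs and the two mapping rules are invented:

```python
BF = "http://id.loc.gov/ontologies/bibframe/"
SDO = "https://schema.org/"
RDF_TYPE = "rdf:type"

# A miniature BIBFRAME graph as (subject, predicate, object) triples.
graph = {
    ("ex:work1", RDF_TYPE, BF + "Work"),
    ("ex:work1", "rdfs:label", "An Example Title"),
}

def enrich_with_schema(triples):
    """Add schema.org triples alongside matching BIBFRAME patterns,
    mirroring a SPARQL INSERT ... WHERE update."""
    added = set()
    for s, p, o in triples:
        # Rule 1: every bf:Work also becomes a schema:CreativeWork.
        if p == RDF_TYPE and o == BF + "Work":
            added.add((s, RDF_TYPE, SDO + "CreativeWork"))
        # Rule 2: rdfs:label values also become schema:name values.
        if p == "rdfs:label":
            added.add((s, SDO + "name", o))
    return triples | added

enriched = enrich_with_schema(graph)
```

The original BIBFRAME triples are kept; the Schema.org triples sit alongside them in the same file that is loaded into the Knowledge graph triplestore.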
Data Ingestion Pipelines – TTE (Authorities)
• Source data in bespoke CSV format
• Exported in collection-based files
– People, places, organisations, time periods, etc.
• Interrelated references, so needed to be considered as a whole
• Bespoke Python script creating Schema.org entity descriptions
• Output RDF format file for loading into Knowledge graph triplestore
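The two-pass idea – create every entity first, then resolve the interrelated references once all collections are loaded – might look like this. The CSV columns (id, type, name, related) are invented for illustration, since the bespoke TTE layout is not shown here:

```python
import csv
import io

# A toy stand-in for the collection-based TTE export files.
tte_csv = """id,type,name,related
tte:p1,Person,Lee Kuan Yew,tte:place1
tte:place1,Place,Singapore,
"""

rows = list(csv.DictReader(io.StringIO(tte_csv)))

# Pass 1: one schema.org-style description per row.
entities = {
    r["id"]: {"@id": r["id"], "@type": r["type"], "name": r["name"]}
    for r in rows
}

# Pass 2: resolve cross-references, now that every entity exists.
# (The "related" property name is illustrative, not a schema.org term.)
for r in rows:
    if r["related"]:
        entities[r["id"]]["related"] = {"@id": r["related"]}
```

Processing the files as a whole is what makes pass 2 safe: a reference can point at an entity in a different collection file, so no single file can be converted in isolation.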
Data Ingestion Pipelines – NAS & CMS
• Source data in DC-XML files
• Step 1: DC-XML to DC-Terms RDF
– Bespoke XSLT script to translate from DC records to DCT-RDF
• Step 2: TTE lookup
– Look up identified values against TTE entities in Knowledge graph
• Use URI if matched
• Step 3: Schema.org entity creation
– SPARQL script creates Schema.org representation of DCT entities
• Output individual RDF format files for loading into Knowledge graph
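Steps 1 and 2 can be sketched as follows. The production Step 1 is an XSLT, but the same extraction is shown here with Python's stdlib XML parser; the record content and the TTE lookup table are invented for illustration:

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"

# A toy DC-XML record (real NAS/CMS records carry many more elements).
record = f"""<record xmlns:dc="{DC}">
  <dc:title>Oral history interview</dc:title>
  <dc:creator>Lee Kuan Yew</dc:creator>
</record>"""

# Step 1: extract the dc:* element names and values from the record.
root = ET.fromstring(record)
dct = {el.tag.split('}')[1]: el.text for el in root}

# Step 2: TTE lookup - replace a matched value with the entity URI,
# otherwise keep the literal value (the index here is a stand-in for a
# query against TTE entities already in the Knowledge graph).
tte_index = {"Lee Kuan Yew": "https://example.org/tte/p1"}
creator = tte_index.get(dct["creator"], dct["creator"])
```

Swapping literals for URIs at ingest time is what links NAS and CMS records into the same graph of entities as the authority data.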
Technical Architecture (simplified)
Hosted on Amazon Web Services:
• SOURCE DATA IMPORT – batch scripts, import control, etc.
• Pipeline processing
• GraphDB clusters holding the Knowledge graph
• Discovery Interface (DI) – PUBLIC ACCESS
• Data Management Interface (DMI) – CURATOR ACCESS
A need for entity reconciliation…
• Lots (and lots and lots) of source entities – 70.4 million
• Lots of duplication
– Lee, Kuan Yew – 1st Prime Minister of Singapore
• 160 individual entities in ILS source data
– Singapore Art Museum
• Entities from source data: 21 CMS, 1 NAS, 66 ILS, 1 TTE
• Users only want 1 of each!
70.4 (million) into 6.1 (million) entities does go!
• We know that now – but how did we get there?
• Requirements hurdles to be cleared:
– Not a single source of truth
– Regular automatic updates – add / update / delete
– Manual management of combined entities
• Suppression of incorrect or ‘private’ attributes from display
• Addition of attributes not in source data
• Creating / breaking relationships between entities
• Near real-time updates
Adaptive Data Model Concepts
• Source entities
– Individual representation of source data
• Aggregation entities
– Track relationships between source entities for the same thing
– No copying of attributes
• Primary entities
– Searchable by users
– Displayable to users
– Consolidation of aggregated source data & managed attributes
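The three layers might be sketched like this (all structures and names are invented): source entities keep their original attributes, the aggregation entity only records membership, and the primary entity is built on demand by consolidating the members plus curator-managed additions and suppressions:

```python
# Source entities: one per source record, attributes untouched.
sources = {
    "ils:123": {"name": "Lee, Kuan Yew", "birthDate": "1923"},
    "nas:9":   {"name": "Lee Kuan Yew"},
}

# Aggregation entity: tracks which sources describe the same thing -
# deliberately no copies of their attributes.
aggregation = {"@id": "agg:1", "members": ["ils:123", "nas:9"]}

def build_primary(agg, sources, managed=None, suppressed=()):
    """Consolidate member attributes into the searchable/displayable
    primary entity, applying manual additions and suppressions."""
    primary = {"@id": agg["@id"].replace("agg:", "primary:")}
    for member in agg["members"]:
        for key, value in sources[member].items():
            primary.setdefault(key, value)   # first source wins in this sketch
    primary.update(managed or {})            # curator-added attributes
    for key in suppressed:                   # curator-suppressed attributes
        primary.pop(key, None)
    return primary

p = build_primary(aggregation, sources, managed={"deathDate": "2015"})
```

Because the aggregation holds only links, a daily delta update can replace a source entity and the primary view can be rebuilt without losing the curators' merge decisions.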
Entity Matching
• Candidate matches based on schema:name
– Lucene index matching
– Refined by Levenshtein similarity
– Same entity type
– Entity-type-specific rules, e.g.:
• Work: name + creator / contributor / author / sameAs
• Person: name + birthDate / deathDate / sameAs
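A sketch of the refinement step: candidates come back from a name index (Lucene in production) and are then filtered by edit-distance similarity plus a type-specific corroborating rule. The threshold and the Person rule below are illustrative, not NLB's production values:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def similarity(a, b):
    """Normalise distance to a 0..1 similarity score."""
    return 1 - levenshtein(a, b) / max(len(a), len(b), 1)

def is_match(cand, other, threshold=0.85):
    """Same type, similar name, plus a type-specific rule."""
    if cand["type"] != other["type"]:
        return False
    if similarity(cand["name"], other["name"]) < threshold:
        return False
    if cand["type"] == "Person":  # e.g. Person: name + birthDate
        return cand.get("birthDate") == other.get("birthDate")
    return True

a = {"type": "Person", "name": "Lee Kuan Yew", "birthDate": "1923"}
b = {"type": "Person", "name": "Lee, Kuan Yew", "birthDate": "1923"}
```

Here `similarity(a["name"], b["name"])` is about 0.92, so the pair passes the name test and the shared birthDate confirms the match.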
The entity iceberg
• Discovery operates on Primary entities – the visible tip
• Aggregation and Source entities sit below the surface, fed by the ingestion pipelines
• Management spans all three layers
Data Management Interface
• Rapidly developed – easily modified
• Using metaphactory’s semantic visual interface
• A collaboration environment for data curators
• Pre-publish entity management
• Search & discovery – emulates DI
• Dashboard view
• Entity management / creation
Discovery Interface (DI)
LDMS System Live – December 2022
• Live – data curation team using DMI
– Entity curation
– Manually merging & splitting aggregations
• Live – data ingest pipelines
– Daily / Weekly / Monthly
– Delta update exports from source systems
• Live – Discovery Interface ready for deployment
• Live – Support
– Issue resolution & updates
Discovery Interface
• Live and ready for deployment
– Performance, load and user tested
• Designed & developed by the NLB Data Team to demonstrate the characteristics and benefits of a Linked Data service
• NLB services are rolled out by the National Library & Public Library divisions
– Recently identified a service for Linked Data integration
• Currently undergoing fine-tuning to meet the service’s UI requirements
Daily Delta Ingestion & Reconciliation
Some Challenges Addressed on the Way
• Inaccurate source data
– Not apparent in source systems
– Very obvious when aggregated in LDMS
• E.g. Subject and Person URIs duplicated
– Identified and analysed in DMI
– Reported to source curators and fixed
– Updated in next delta update
– Data issues identified in LDMS – source data quality enhanced
Some Challenges Addressed on the Way
• Asian name matching
– Often consisting of several short 3–4 letter names
– Lucene matches not of much help
– Introduction of Levenshtein similarity matching helped
– Tuning is challenging – trading off European & Asian names
From Ambition to Go Live – A Summary
• Ambition based on previous years of trials, experimentation, and understanding of Linked Data’s potential
• Ambition to deliver a production LDMS providing Linked Data curation and management, capable of delivering a public discovery service
• With experienced commercial partners & tools as part of a globally distributed team
– Benefiting from Open Source community developments where appropriate
• Unique and challenging requirements
– Distributed sources of truth in disparate data formats
– Auto-updating, consolidated entity view – plus individual entity management
– Web friendly – providing structured data for search engines
• Live, operational, and actively delivering benefits since December 2022

  • 3. National Library Board Singapore
    Public Libraries
    – Network of 28 Public Libraries, including 2 partner libraries*
    – Reading Programmes and Initiatives
    – Programmes and Exhibitions targeted at Singapore communities
    National Archives
    – Transferred from NHB to NLB in Nov 2012
    – Custodian of Singapore’s Collective Memory: responsible for the collection, preservation and management of Singapore’s public and private archival records
    – Promotes public interest in our nation’s history and heritage
    National Library
    – Preserving Singapore’s print and literary heritage, and intellectual memory
    – Reference Collections
    – Legal Deposit (including electronic)
    *Partner libraries are partner-owned and funded but managed by NLB or NLB’s subsidiary, Libraries and Archives Solutions Pte Ltd. Library@Chinatown and the Lifelong Learning Institute Library are partner libraries.
  • 4. National Library Board Singapore – Collections
    Reference Collection
    – Over 560,000 Singapore & SEA items
    – Over 147,000 Chinese, Malay & Tamil language items
    – Over 62,000 Social Sciences & Humanities items
    – Over 39,000 Science & Technology items
    – Over 53,000 Arts items
    – Over 19,000 Rare Materials items
    Archival Materials
    – Over 290,000 Government files & Parliament papers
    – Over 190,000 Audiovisual & sound recordings
    – Over 70,000 Maps & building plans
    – Over 1.14m Photographs
    – Over 35,000 Oral history interviews
    – Over 55,000 Speeches & press releases
    – Over 7,000 Posters
    Lending Collection
    – Over 5m print collection
    – Over 2.4m music tracks
    – 78 databases
    – Over 7,400 e-newspaper and e-magazine titles
    – Over 8,000 e-learning courses
    – Over 1.7m e-books and audiobooks
  • 5. National Library Board Online Services
  • 6. The Ambition
    • To enable the discovery & display of entities from different sources in a combined interface
    • To bring together resources, physical and digital
    • To bring together diverse systems across the National Library, National Archives, and Public Libraries in a Linked Data environment
    • To provide a staff interface to view and manage all entities, their descriptions and relationships
  • 10. The Journey Starts (March 2021)
    A Linked Data Management System (LDMS)
    • Cloud-based system
    • Management interface
    • Source data curated at source – ILS, Authorities, CMS, NAS
    • Linked Data
    • Public interface – Google friendly – Schema.org
    • Shared standards – BIBFRAME, Schema.org
  • 17. Contract Awarded
    • metaphactory platform
    – Low-code knowledge graph platform
    – Semantic knowledge modeling
    – Semantic search & discovery
    – AWS Partner
    • Public sector partner – Singapore based
    • Linked Data, structured data, Semantic Web, bibliographic metadata, Schema.org and management systems consultant
  • 18. Initial Challenges
    • Data – lots of it!
    – Record-based formats: MARC-XML, DC-XML, CSV
    • Entities described in the data – lots and lots!
    • Regular updates – daily, weekly, monthly
    • Automatic ingestion
    • Not the single source of truth
    • Manual update capability
    • Entity-based search & discovery
    • Web friendly – Schema.org output
  • 26. Basic Data Model
    • Linked Data
    – BIBFRAME to capture the detail of bibliographic records
    – Schema.org to deliver structured data for search engines
    – Schema.org representation of CMS, NAS, TTE data
    – Schema.org enrichment of BIBFRAME
    • Schema.org as the ‘lingua franca’ vocabulary of the Knowledge Graph
    – All entities described using Schema.org as a minimum
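A minimal sketch of what this dual-vocabulary model implies for a single entity: one resource carries both its BIBFRAME type and a Schema.org type, with the title exposed as `schema:name` for the web. The identifier and title here are invented for illustration; they are not NLB data.

```python
import json

# Illustrative JSON-LD for one resource, typed with both BIBFRAME and
# Schema.org so the same graph serves cataloguing detail and search engines.
# The @id and title are invented for this sketch.
entity = {
    "@context": {
        "bf": "http://id.loc.gov/ontologies/bibframe/",
        "schema": "http://schema.org/",
    },
    "@id": "http://example.org/entity/work1",
    "@type": ["bf:Work", "schema:Book"],      # Schema.org as the lingua franca
    "bf:title": {"bf:mainTitle": "A History of Singapore"},
    "schema:name": "A History of Singapore",  # enrichment for the web
}

doc = json.dumps(entity, indent=2)
print(doc)
```

Because every entity gets at least a Schema.org description, downstream consumers (including search engines) can ignore BIBFRAME entirely and still understand the graph.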
  • 31. Data Data Data!
    Data Source | Source Records | Entity Count | Update Frequency
    ILS         | 1.4m           | 56.8m        | Daily
    CMS         | 82k            | 228k         | Weekly
    NAS         | 1.6m           | 6.7m         | Monthly
    TTE         | 3k             | 317k         | Monthly
    Total       | 3.1m           | 70.4m        |
  • 32. Data Ingest Pipelines – ILS
    • Source data in MARC-XML files
    • Step 1: marc2bibframe2 scripts
    – Open source – shared by the Library of Congress – the “standard” approach
    – Output BIBFRAME as individual RDF-XML files
    • Step 2: bibframe2schema.org script
    – Open source – Bibframe2Schema.org Community Group
    – SPARQL-based script
    – Outputs Schema.org-enriched individual BIBFRAME RDF files for loading into the Knowledge Graph triplestore
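The two-step shape of the ILS pipeline can be mimicked in miniature. This is a toy stand-in, not the real marc2bibframe2 or bibframe2schema.org code: step 1 pulls the 245$a title out of an invented MARC-XML record (standing in for the BIBFRAME conversion), and step 2 copies it into `schema:name` (standing in for the SPARQL enrichment).

```python
import xml.etree.ElementTree as ET

# An invented single-field MARC-XML record for demonstration only.
MARC = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245"><subfield code="a">Singapore Stories</subfield></datafield>
</record>"""

NS = {"m": "http://www.loc.gov/MARC21/slim"}

def marc_title(xml_text):
    # Step 1 (stand-in): extract the 245$a main title, which the BIBFRAME
    # conversion would carry into bf:title/bf:mainTitle.
    root = ET.fromstring(xml_text)
    node = root.find(".//m:datafield[@tag='245']/m:subfield[@code='a']", NS)
    return node.text

def enrich(title):
    # Step 2 (stand-in): the SPARQL script adds Schema.org triples alongside
    # the BIBFRAME ones, so every work is at least describable as schema:name.
    return {"bf:mainTitle": title, "schema:name": title}

work = enrich(marc_title(MARC))
print(work["schema:name"])
```

At NLB's scale the same two steps run over millions of records per batch, producing individual RDF files for triplestore loading.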
  • 35. Data Ingestion Pipelines – TTE (Authorities)
    • Source data in a bespoke CSV format
    • Exported in collection-based files
    – People, places, organisations, time periods, etc.
    • Interrelated references, so the files needed to be considered as a whole
    • Bespoke Python script creating Schema.org entity descriptions
    • Outputs an RDF format file for loading into the Knowledge Graph triplestore
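Because TTE rows reference one another, a sketch of such a script naturally needs two passes: mint a URI for every row first, then resolve cross-references while emitting Schema.org-style descriptions. The column names, identifiers, and the `schema:relatedTo` property below are all invented for illustration; NLB's actual CSV layout and mapping will differ.

```python
import csv, io

# Invented two-row TTE export: an organisation row references a person row.
TTE_CSV = """id,type,label,related
p1,Person,Lee Kuan Yew,
o1,Organization,Singapore Art Museum,p1
"""

rows = list(csv.DictReader(io.StringIO(TTE_CSV)))

# Pass 1: mint a URI for every TTE identifier, so later references resolve.
uris = {r["id"]: f"http://example.org/tte/{r['id']}" for r in rows}

# Pass 2: emit Schema.org-style descriptions, resolving cross-references.
entities = []
for r in rows:
    e = {"@id": uris[r["id"]],
         "@type": "schema:" + r["type"],
         "schema:name": r["label"]}
    if r["related"]:
        # "schema:relatedTo" is a hypothetical property for this sketch.
        e["schema:relatedTo"] = {"@id": uris[r["related"]]}
    entities.append(e)

print(entities[1]["schema:relatedTo"]["@id"])
```

This is why the collection-based files had to be processed as a whole: a single-pass, file-at-a-time conversion would hit references to rows it had not yet seen.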
  • 38. 39 Data Ingestion Pipelines – NAS & CMS • Source data in DC-XML files • Step 1: DC-XML to DC-Terms RDF – Bespoke XSLT script to translate from DC record to DCT-RDF • Step 2: TTE lookup – Lookup identified values against TTE entities in Knowledge graph • use URI if matched • Step 3: Schema.org entity creation – SPARQL script create schema representation of DCT entities • Output individual RDF format files for loading into Knowledge graph
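Step 2 of this pipeline (the TTE lookup) can be sketched as below. The `{name → URI}` index, the field values, and the URIs are illustrative assumptions standing in for a SPARQL lookup against the TTE entities in the graph.

```python
def reconcile_values(dct_values, tte_index):
    """Replace literal strings (subjects, creators, ...) with the URI of a
    matching TTE entity from the knowledge graph; keep the literal otherwise."""
    resolved = []
    for value in dct_values:
        uri = tte_index.get(value.strip().lower())
        resolved.append(uri if uri is not None else value)
    return resolved

# Toy index standing in for the knowledge-graph lookup:
tte = {"singapore art museum": "https://example.org/entity/org42"}
print(reconcile_values(["Singapore Art Museum", "Unknown Body"], tte))
# → ['https://example.org/entity/org42', 'Unknown Body']
```

Unmatched literals are deliberately kept: they still carry information and can be reconciled later once a TTE entity exists.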
Technical Architecture (simplified) – hosted on Amazon Web Services
• Source data import – batch scripts, import control, etc.
• Pipeline processing
• GraphDB cluster
• Public access – Discovery Interface (DI)
• Curator access – Data Management Interface (DMI)
A need for entity reconciliation…
• Lots (and lots and lots) of source entities – 70.4 million
• Lots of duplication:
  – Lee, Kuan Yew – 1st Prime Minister of Singapore: 160 individual entities in ILS source data
  – Singapore Art Museum: entities from source data – 21 CMS, 1 NAS, 66 ILS, 1 TTE
• Users only want 1 of each!
70.4 (million) into 6.1 (million) entities does go!
• We know that now – but how did we get there?
• Requirement hurdles to be cleared:
  – Not a single source of truth
  – Regular automatic updates – add / update / delete
  – Manual management of combined entities:
    • Suppression of incorrect or 'private' attributes from display
    • Addition of attributes not in source data
    • Creating / breaking relationships between entities
  – Near real-time updates
Adaptive Data Model Concepts
• Source entities – individual representations of the source data
• Aggregation entities – track relationships between source entities for the same thing; no copying of attributes
• Primary entities – searchable by users and displayable to users; a consolidation of aggregated source data & managed attributes
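The three-layer model can be illustrated with a minimal Python sketch. The class and field names are invented for illustration; the real model is expressed in RDF, not Python objects.

```python
from dataclasses import dataclass

@dataclass
class SourceEntity:
    """One representation per source record (ILS, NAS, CMS, TTE)."""
    uri: str
    source: str
    attributes: dict

@dataclass
class Aggregation:
    """Tracks which source entities describe the same thing - no attribute copies."""
    members: list

def primary_view(agg, suppressed=(), managed=None):
    """Consolidate aggregated source attributes into a Primary Entity view:
    suppress unwanted attributes, then layer on curator-managed additions."""
    view = {}
    for src in agg.members:
        for key, value in src.attributes.items():
            if key not in suppressed:
                view.setdefault(key, value)  # first source wins in this toy merge
    view.update(managed or {})
    return view

ils = SourceEntity("s1", "ILS", {"name": "Lee, Kuan Yew", "birthDate": "1923-09-16"})
nas = SourceEntity("s2", "NAS", {"name": "Lee Kuan Yew", "privateNote": "internal"})
print(primary_view(Aggregation([ils, nas]), suppressed=("privateNote",)))
```

Because the aggregation layer only records membership, merging or splitting entities never touches the source descriptions, which is what allows automatic delta updates to coexist with manual curation.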
Entity Matching
• Candidate matches based on schema:name values
  – Lucene index matching
  – Refined with Levenshtein similarity
  – Same entity type
• Entity-type-specific rules, e.g.:
  – Work: name + creator / contributor / author / sameAs
  – Person: name + birthDate / deathDate / sameAs
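A minimal sketch of the refinement step: an edit-distance similarity over names plus one illustrative type-specific rule. The 0.85 threshold and the dictionary field names are assumptions, not NLB's tuned values.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def name_similarity(a, b):
    """Normalised Levenshtein similarity in [0, 1]."""
    if not a and not b:
        return 1.0
    return 1 - levenshtein(a.lower(), b.lower()) / max(len(a), len(b))

def person_match(p1, p2, threshold=0.85):
    """Person rule sketch: similar name plus a shared birthDate or sameAs link."""
    if name_similarity(p1["name"], p2["name"]) < threshold:
        return False
    same_birth = (p1.get("birthDate") is not None
                  and p1.get("birthDate") == p2.get("birthDate"))
    same_link = bool(set(p1.get("sameAs", [])) & set(p2.get("sameAs", [])))
    return same_birth or same_link

a = {"name": "Lee, Kuan Yew", "birthDate": "1923-09-16"}
b = {"name": "Lee Kuan Yew", "birthDate": "1923-09-16"}
print(person_match(a, b))  # → True
```

In production the Lucene index does the heavy lifting of producing a small candidate set; the pairwise similarity above only runs within that set.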
Data Management Interface
• Rapidly developed – easily modified
• Uses metaphactory's semantic visual interface
• A collaboration environment for data curators
• Pre-publish entity management
• Search & discovery – emulates the DI
• Dashboard view
• Entity management / creation
LDMS System Live – December 2022
• Live – data curation team using the DMI
  – Entity curation
  – Manually merging & splitting aggregations
• Live – data ingest pipelines
  – Daily / weekly / monthly
  – Delta update exports from source systems
• Live – Discovery Interface ready for deployment
• Live – support: issue resolution & updates
Discovery Interface
• Live and ready for deployment – performance, load and user tested
• Designed & developed by the NLB Data Team to demonstrate the characteristics and benefits of a Linked Data service
• NLB services are rolled out by the National Library & Public Library divisions
  – Recently identified a service for Linked Data integration
• Currently undergoing fine-tuning to meet that service's UI requirements
Daily Delta Ingestion & Reconciliation
Some Challenges Addressed on the Way
• Inaccurate source data
  – Not apparent in the source systems; very obvious when aggregated in the LDMS
  – E.g. duplicated Subject and Person URIs
  – Identified and analysed in the DMI
  – Reported to source curators and fixed; updated in the next delta update
  – Data issues identified in the LDMS – source data quality enhanced
Some Challenges Addressed on the Way
• Asian name matching
  – Names often consist of several short 3-4 letter parts
  – Lucene matches not of much help
  – Introducing Levenshtein similarity matching helped
  – Tuning is challenging – trading off European & Asian names
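The short-name trade-off is easy to demonstrate. Here `difflib`'s ratio is used as a stand-in for the Levenshtein measure, and both names are invented: a single differing letter leaves a short name pair much closer to a false match than the same edit in a longer name.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity in [0, 1]; difflib's ratio as a stand-in for Levenshtein."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Two *different* people whose short name parts differ by one letter...
short_pair = similarity("Tan Ah Kow", "Tan Ah Kew")
# ...versus a one-letter typo in a longer name:
long_pair = similarity("Richard Wallis", "Richard Walis")
print(f"{short_pair:.3f} vs {long_pair:.3f}")
```

Both scores land above a typical 0.85 threshold, so a single global cut-off either merges distinct short-named people or rejects harmless typos in longer names; this is the tuning trade-off the slide describes.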
From Ambition to Go Live – A Summary
• Ambition based on previous years of trials, experimentation, and understanding of Linked Data's potential
• Ambition to deliver a production LDMS providing Linked Data curation and management capable of delivering a public discovery service
• With experienced commercial partners & tools, as part of a globally distributed team
  – Benefiting from open-source community developments where appropriate
• Unique and challenging requirements:
  – Distributed sources of truth in disparate data formats
  – Auto-updating, consolidated entity view – plus individual entity management
  – Web friendly – providing structured data for search engines
• Live and operational, actively delivering benefits, since December 2022
From Ambition to Go Live – The National Library Board of Singapore's Journey to an operational Linked Data Management and Discovery System
Richard Wallis – Evangelist and Founder, Data Liberate
richard.wallis@dataliberate.com – @dataliberate
15th Semantic Web in Libraries Conference, SWIB23 – 12th September 2023 – Berlin