LINKED DATA FOR AFRICAN LIBRARIES:
METADATA ENRICHING AND DISCOVERY
NLA - 40TH ANNUAL CATALOGUING, CLASSIFICATION & INDEXING
SEMINAR/WORKSHOP
Getaneh Alemu
Solent University
October 28th 2021
Metadata that is ENRICHED, LINKED, OPEN
and FILTERED drives usage of resources.
(Alemu, 2014)
A THEORY OF METADATA ENRICHING AND FILTERING
(Alemu, 2014)
METADATA ENRICHING & FILTERING
• From the principle of metadata simplicity to the principle of
metadata enriching
• From human-readable metadata to structured, uniquely
identified and interlinked metadata (metadata linking)
• From metadata silos to metadata openness enabling
metadata sharing and re-use (metadata openness)
• From a single interface to a user-led, re-configurable interface
(metadata filtering) (Alemu, 2014)
“The convenience of the public is always to be set before the ease
of the cataloguer.” Cutter, 1904
SAVE THE TIME OF THE READER
Users expecting
• Instantaneous
• 24/7
• Seamless
• Triangulated/complete
• Full-text
• Convenient access
CATALOGUING PRINCIPLES
• The principle of sufficiency and necessity
• The principle of user convenience
• The principle of representation
• The principle of standardisation
(Svenonius, 2000; IFLA, 2009)
LIMITATION OF STANDARDS
Growing library collections
Ever changing technologies
Changing user expectations
Standards and their limitations
Books often lend themselves to various
interpretations and contexts
“The social space of documents is missing”
RDA – RESOURCE DESCRIPTION & ACCESS
AACR2:
• AACR2 - born at a time when space on the 3x5 inch catalogue card, and storage
space generally, was a major issue
• Adheres to the principle of metadata simplicity
• Abbreviated bibliographic description (such as ed., rev., vol., s.l., s.n., n.d. and
et al.)
RDA:
• AACR2’s rule of three expanded – more access points/index terms
• RDA better empowers the cataloguer
• RDA is designed for linking and collocating multi-part and related
works
• Richer metadata description (the principle of metadata enriching)
OPENNESS
https://discover.libraryhub.jisc.ac.uk/
WHAT IS LINKED DATA?
A data model that identifies, describes, links and relates
structured data elements, analogous to the way relational
database systems function, although Linked Data
is intended to operate at web scale.
(Allemang & Hendler, 2008)
WHAT IS LINKED DATA?
• Linked Data is a data model
• Identifies data
• Describes data
• Links and relates data elements
• Works with structured data elements
• Analogous to the way relational database systems function
• But intended to operate at web scale
• Web-scale data linking
DESIGN PRINCIPLES
https://www.w3.org/DesignIssues/LinkedData.html
WHAT IS LINKED DATA?
Adopting Linked Data principles in library standards has
many benefits. Five key ones are:
• Metadata openness and sharing
• Serendipitous discovery of information resources
• Identification of resource usage patterns, zeitgeist
and emergent metadata
• Facet-based navigation
• Metadata enriched with links
WHY LINKED DATA?
https://www.emeraldinsight.com/doi/full/10.1108/03074801211282920
IFLA’s LRM AND LINKED DATA
In June 2020, IFLA began publishing its standards as Linked Data using IFLA namespaces.
The bibliographic standards currently represented as Linked Data include:
The FRBR Vocabularies: https://www.iflastandards.info/fr
The ISBD Vocabularies: https://www.iflastandards.info/isbd
The LRM Vocabularies: https://www.iflastandards.info/lrm
The UNIMARC Vocabularies: https://www.iflastandards.info/unimarc
MulDiCat: https://www.iflastandards.info/muldicat/
WHY LINKED DATA?
• Making sense of data / annotating data
• Re‐usability
• Cross‐linking
• Integration and sharing of data (Berners‐Lee,
2009; Shadbolt, 2010; W3C, 2011).
“Adding a page provides content, but adding a link provides the organization,
structure and endorsement to information on the Web which turn the content as a
whole into something of great value” (Berners‐Lee, 2007).
Linked Data is built on several core technologies, including URIs, RDF, RDFS,
OWL and SPARQL.
CHALLENGES TO ADOPTING LINKED DATA TECHNOLOGIES
• The MARC format, although dominant, is considered a record- and document-centric metadata
structure rather than an actionable, data-centric format (Coyle, 2010; Coyle & Hillmann, 2007; Styles,
2009; Styles et al., 2008).
• A second challenge, singled out by the W3C Library Linked Data Incubator Group (W3C, 2011), is
the terminological disparity between library and web-based standards.
• A third important challenge confronting potential adopters is the complexity of Linked
Data technologies such as RDF/XML, RDFS, OWL and SPARQL, together with an apparent lack of tools
and applications for creating Linked Data in libraries. By contrast, Berners-Lee has remarked that “the
[current] web has grown because it's easy to write a web page and easy to link to other pages”
(Berners-Lee, 2007).
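The contrast between a record-centric and a data-centric format can be illustrated with a minimal Python sketch. It assumes a simplified MARC-like record held as a dictionary (tags 100 $a and 245 $a are the usual MARC personal-name and title fields); the base URI, the control number and the field-to-property mapping are hypothetical and chosen only for illustration.

```python
# A MARC-style record: every statement lives inside one self-contained record.
marc_like_record = {
    "001": "rec-0001",                  # control number (hypothetical value)
    "100": {"a": "Achebe, Chinua"},     # main entry, personal name
    "245": {"a": "Things fall apart"},  # title statement
}

# Hypothetical mapping from (field, subfield) to Dublin Core property URIs.
FIELD_TO_PROPERTY = {
    ("100", "a"): "http://purl.org/dc/terms/creator",
    ("245", "a"): "http://purl.org/dc/terms/title",
}

BASE = "http://example.org/record/"  # hypothetical base URI for record identifiers


def record_to_triples(record):
    """Split one record into independent (subject, predicate, object) statements."""
    subject = BASE + record["001"]
    triples = []
    for (field, subfield), prop in FIELD_TO_PROPERTY.items():
        value = record.get(field, {}).get(subfield)
        if value is not None:
            triples.append((subject, prop, value))
    return triples


for triple in record_to_triples(marc_like_record):
    print(triple)
```

Each resulting statement can be shared, linked or queried on its own, which is what the data-centric view asks for. A real conversion would need a far richer mapping than this one.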
HOW LINKED DATA?
Linked Data is built on several core technologies, including URIs, RDF, RDFS,
OWL and SPARQL.
Resource Description Framework (RDF)
RDF is a data model for describing any concept or object (physical or abstract) using simple
subject‐predicate‐object statements, also called triples (Allemang and Hendler, 2008).
It describes an object through a set of self‐describing attributes (properties) and relations.
Unlike contemporary metadata schemas, RDF properties and relations are uniquely identified and
explicitly described in a machine-processable manner. It is a simple but robust and scalable
data model aimed at web scale rather than limited to a specific domain or application.
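The triple model described above can be sketched in a few lines of Python. This is an illustration only, not a real RDF library: the example.org URIs are hypothetical, while the Dublin Core and FOAF property URIs are existing vocabularies used here as an assumption about how such a description might look.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


DCT = "http://purl.org/dc/terms/"       # Dublin Core terms namespace
FOAF = "http://xmlns.com/foaf/0.1/"     # FOAF namespace
BOOK = "http://example.org/book/everything-is-miscellaneous"  # hypothetical URI
AUTHOR = "http://example.org/person/david-weinberger"         # hypothetical URI

graph = [
    Triple(BOOK, DCT + "title", "Everything is Miscellaneous"),
    Triple(BOOK, DCT + "creator", AUTHOR),   # the object is another resource, i.e. a link
    Triple(AUTHOR, FOAF + "name", "David Weinberger"),
]


def describe(subject, triples):
    """Collect every property and value stated about one subject."""
    description = {}
    for triple in triples:
        if triple.subject == subject:
            description.setdefault(triple.predicate, []).append(triple.obj)
    return description


print(describe(BOOK, graph))
```

Because the creator is identified by a URI rather than a text string, following that URI leads to further statements about the author. That is the linking the model is designed for.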
https://www.w3.org/TR/rdf-schema/
<RDF>
  <Description about="http://www.yourdomainname.com/RDF">
    <book>Everything is miscellaneous</book>
    <homepage>http://www.w3schools.com</homepage>
  </Description>
</RDF>
RDF
https://www.w3schools.com/xml/xml_rdf.asp
What is RDF?
“RDF stands for Resource Description Framework
RDF is a framework for describing resources on the web
RDF is designed to be read and understood by computers
RDF is not designed for being displayed to people
RDF is written in XML”
Example
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:si="https://www.w3schools.com/rdf/">
<rdf:Description rdf:about="https://www.w3schools.com">
<si:title>W3Schools</si:title>
<si:author>Jan Egil Refsnes</si:author>
</rdf:Description>
</rdf:RDF>
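To show how the XML above maps onto triples, the following sketch parses the W3Schools example with Python's standard xml.etree.ElementTree module. It is a simplification under stated assumptions: it handles only rdf:Description elements with an rdf:about attribute and plain literal properties, and the function name extract_triples is hypothetical. A full RDF/XML parser covers many more constructs.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:si="https://www.w3schools.com/rdf/">
<rdf:Description rdf:about="https://www.w3schools.com">
<si:title>W3Schools</si:title>
<si:author>Jan Egil Refsnes</si:author>
</rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for simple literal properties."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree stores a namespaced tag as "{namespace}local";
            # joining the two parts gives the predicate URI.
            predicate = prop.tag[1:].replace("}", "", 1)
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


for triple in extract_triples(RDF_XML):
    print(triple)
```

Each child element of rdf:Description becomes one statement about the resource named in rdf:about.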
BIBFRAME http://bibframe.org/bfe/index.html#
BIBFRAME - RDF
LINKED OPEN DATA
Five-Star Linked Open Data
https://5stardata.info/en/
LINKED DATA: ARE WE THERE YET?
• Most, if not all, libraries still use MARC, a 1960s standard
• RDA (since 2010) and the Library Reference Model (LRM) (since 2017)
are being implemented
• BIBFRAME has been under development and testing since 2012
• No library is fully Linked Data yet
THANK YOU

Editor's Notes

  • #4 As part of my PhD, completed in June 2014 using a constructivist grounded theory method, I developed a theory of metadata enriching and filtering. The theory comprises four overarching principles: metadata enriching, linking, openness and filtering. My PhD in two words: enriching and filtering. The theory holds that metadata should be enriched through both standardised and socially constructed metadata approaches. ... In theory, metadata creation and enhancement (metadata enriching) is a continuous process that involves authors, publishers, suppliers, librarians and users.
  • #7 Cutter, C. A. (1904). Rules for a dictionary catalog. Washington: Government Printing Office. Ranganathan, S. R. (1931). The five laws of library science. Madras: Madras Library Association. http://aims.fao.org/activity/blog/five-laws-library-science-detailing-principles-operating-library-system
  • #8 IFLA (2009). Statement of International Cataloguing Principles. https://www.ifla.org/files/assets/cataloguing/icp/icp_2009-en.pdf Svenonius, E. (2000). The intellectual foundation of information organization. Cambridge, Mass.; London: MIT Press.
  • #13 Linked Data: As the name indicates, Linked Data is a data model that identifies, describes, links and relates structured data elements, analogous to the way relational database systems function, although Linked Data is intended to operate at web scale. Linked Data is a meta-model: it provides a framework for defining, designing, developing and maintaining schemas and vocabularies of any kind and size in a given domain. In effect, institutions such as libraries need not abandon existing metadata standards (RDA, FRBR), controlled vocabularies (Library of Congress Subject Headings), authority lists (Library of Congress Authorities) or legacy metadata; what is required is to adopt Linked Data principles, which are also described in this paper. Adopting Linked Data can provide an open, interactive system with external links that makes information easily accessible and re-usable and allows serendipitous discovery of other resources. The overall purpose of Linked Data is to facilitate the re-usability, cross-linking, integration and sharing of data (Berners-Lee, 2009; Shadbolt, 2010; W3C, 2011). Berners-Lee (2007) notes that “adding a page provides content, but adding a link provides the organization, structure and endorsement to information on the Web which turn the content as a whole into something of great value”. Linked Data is built on several core technologies, including URIs, RDF, RDFS, OWL and SPARQL.
    Resource Description Framework (RDF): RDF is a data model for describing any concept or object (physical or abstract) using simple subject-predicate-object statements, also called triples (Allemang & Hendler, 2008). It describes an object through a set of self-describing attributes (properties) and relations. Unlike contemporary metadata schemas, RDF properties and relations are uniquely identified and explicitly described in a machine-processable manner. It is a simple but robust and scalable data model aimed at web scale rather than limited to a specific domain or application.
    RDF Schema (RDFS): For the RDF model to function, it requires a defined language with vocabularies and constructs. RDF Schema specifies basic vocabulary terms such as Class, subClassOf, domain, range, label and comment (W3C, 2004c).
    Web Ontology Language (OWL): OWL extends RDFS with additional vocabulary for equivalence and difference (e.g., equivalentClass, equivalentProperty, sameAs and differentFrom), inverse relations (inverseOf), cardinality restrictions and data value constraints (Allemang & Hendler, 2008; Berners-Lee, 1997; W3C, 2004a, 2004b).
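As a rough illustration of what the RDFS vocabulary adds on top of plain triples, the sketch below follows subClassOf statements to infer additional types for a resource. It is a toy, not a reasoner: the ex: names are hypothetical, prefixed names stand in for full URIs, and only the subClassOf rule is applied.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical mini-vocabulary plus one described resource.
graph = {
    ("ex:Novel", SUBCLASS_OF, "ex:Book"),
    ("ex:Book", SUBCLASS_OF, "ex:Work"),
    ("ex:thingsFallApart", RDF_TYPE, "ex:Novel"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def inferred_types(resource, triples):
    """Return the stated types of a resource plus all their superclasses."""
    direct = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(inferred_types("ex:thingsFallApart", graph)))
```

A catalogue that states only that the item is a novel can therefore still answer a query for all books or all works, because the schema carries the class hierarchy.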
  • #17 (Coyle, 2010; Coyle & Hillmann, 2007; Lagoze, 2010; Mathes, 2004; Shirky, 2005; Veltman, 2001; Weinberger, 2005, 2007; Wright, 2007; Lehmann, 2010; Andersen & Skouvig, 2006; Floridi, 2000; Hjørland, 2000)
  • #27 Whilst an increase in the degree of openness is beneficial, the ideals of metadata openness are best realised when metadata is not only represented in open formats and freely available, but also linked. Thus, a five-star rating should be the goal if a library aspires to fully embrace openness. A five-star (*****) rating indicates that metadata already rated four-star has been enriched with links both within the database and to external metadata. To fulfil the five-star criterion, metadata values should be globally and uniquely identified, represented in open and scalable formats, enriched with links to internal and external data sources, and amenable to re-use, mixing and matching with other data sources. Metadata with a five-star rating encourages re-use and aggregation between data sources, so data owners should state explicitly what licensing scheme and star rating apply when making their data available. Licensing ensures that users know beforehand the nature of the data and the rights issues associated with it. A metadata database made available with a five-star rating has achieved the highest degree of openness.
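The cumulative nature of the five-star scheme can be sketched as a small Python function. The criteria follow the scheme published at 5stardata.info (open licence, structured data, non-proprietary format, URIs as identifiers, links to other data); the Dataset class and its field names are hypothetical, introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical self-assessment of a metadata set against the five criteria."""
    open_licence: bool          # 1 star: on the web under an open licence
    structured: bool            # 2 stars: machine-readable structured data
    open_format: bool           # 3 stars: non-proprietary format
    uses_uris: bool             # 4 stars: things are identified with URIs
    linked_to_other_data: bool  # 5 stars: links out to other data sources


def star_rating(dataset):
    """Count stars in order; a star only counts if all earlier ones are met."""
    criteria = [
        dataset.open_licence,
        dataset.structured,
        dataset.open_format,
        dataset.uses_uris,
        dataset.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


catalogue = Dataset(True, True, True, True, False)
print(star_rating(catalogue))
```

The example catalogue already uses URIs but has no outgoing links, so it stops at four stars. Adding links to internal and external data sources is what lifts it to five.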