Semantic web technologies and digital library search

Semantic web and search

Richard Nurse
Open University
Library Services

Outline
• Background
• Basics of semantic web technologies
• Relevance to libraries and search
• STELLAR search project

Open University
• UK distance learning university with more than 200,000 students
• Undergraduate/Postgraduate/Research
• Online learning supported by course materials & local tutors
• Milton Keynes campus and regional/national offices
• BUT… most students never visit the main campus

Library Services
• 24/7 helpdesk
• Online library resources
• Online help sessions
• Links to library resources and skills activities embedded in VLE
• Discovery platform, website resource lists
• Librarians work with academics to build new courses

Library Services
• Cross-university Information Management services
• Institutional Repository ORO http://oro.open.ac.uk/
• Research Data Management Project
• Data retention and records management
• University Archive
• Metadata expertise

Library Services
• Innovation projects

http://www.open.ac.uk/blogs/macon/
http://www.open.ac.uk/blogs/RISE/
http://www.open.ac.uk/blogs/telstar/

Library Services
• Innovation and development
• OU Knowledge Media Institute and others
• Semantic web
• Video search

http://www.open.ac.uk/blogs/AVA/
http://projects.kmi.open.ac.uk/reflex/index.xml
http://kmi.open.ac.uk/projects/name/lucero

Search
“It’s always so hit-and-miss… I used to sit
there for hours and just not find anything.
There were thousands and thousands of bits
of material but no way of drilling down to find
what I really needed. My manager needed
to know, by tomorrow, whether there was
something we could use or not and I didn’t
know the answer, so had to say no”.

Search

• Terms
• Boolean logic – AND, OR, NOT
• Operators: - (exclude a term), site: (restrict to one site), “ ” (exact phrase)
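
The Boolean operators on this slide map directly onto set operations over an index of terms. The following is a minimal sketch, assuming a tiny invented collection of four documents; the document texts and the names DOCS, build_index and docs_with are hypothetical and are not taken from any OU system.

```python
# Hypothetical mini-collection: document id -> text (invented for illustration).
DOCS = {
    1: "introduction to the humanities resource book",
    2: "semantic web and linked data for libraries",
    3: "library search and discovery platform",
    4: "linked data in the digital library",
}


def build_index(docs):
    """Map each term to the set of document ids whose text contains it."""
    index = {}
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def docs_with(term):
    """Return the ids of documents containing the exact term (no stemming)."""
    return INDEX.get(term.lower(), set())


# AND: documents containing both terms (set intersection).
both = docs_with("linked") & docs_with("library")
# OR: documents containing either term (set union).
either = docs_with("semantic") | docs_with("search")
# NOT: documents containing the first term but not the second (set difference).
excluding = docs_with("library") - docs_with("linked")

print(both, either, excluding)
```

Because the match is on exact strings, document 2 ("libraries") is not returned for "library". Purely string-based search misses this kind of match.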

Search

http://www.flickr.com/photos/niallkennedy/

Search

http://www.flickr.com/photos/dullhunk/

Search
• ‘things not strings’
http://googleblog.blogspot.co.uk/2012/05/introducing-knowledge-graph-things-not.html

Search

Google’s Knowledge Graph

Semantic web
Definition: "The Semantic Web is not a separate Web
but an extension of the current one, in which information
is given well-defined meaning, better enabling computers
and people to work in cooperation."
The Semantic Web
Tim Berners-Lee, James Hendler, and Ora Lassila
Scientific American, 2001
http://www.sciam.com/article.cfm?id=the-semantic-web

http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf

Semantic web basics
• ‘web of meaning’
• ‘web of data’
http://www.w3.org/2001/sw/
http://semanticweb.org/wiki/Main_Page
http://www.slideshare.net/fadirra/semanticweb-intro-040411

Semantic web basics
• URIs
• Linked data
• Ontologies
• but also…

Semantic web basics
• URIs – Uniform Resource Identifier
• http://en.wikipedia.org/wiki/Uniform_resource_identifier

http://www.slideshare.net/mdaquin/sssw13-ldtut
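
A URI has a regular structure that software can take apart. As a small illustration, the sketch below splits a VIAF URI that appears later in this deck into its components using Python's standard library; the variable names are chosen here for illustration.

```python
from urllib.parse import urlsplit

# A VIAF URI quoted later in this deck, used here only as sample input.
uri = "http://viaf.org/viaf/102333412/#foaf:Person"

parts = urlsplit(uri)

scheme = parts.scheme      # how the resource is accessed
authority = parts.netloc   # who is responsible for the identifier
path = parts.path          # the identifier within that authority
fragment = parts.fragment  # a sub-part of the identified resource

print(scheme, authority, path, fragment)
```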

Linked data
• “Linked Data is about using the Web to connect related
data that wasn't previously linked, or using the Web to
lower the barriers to linking data currently linked using
other methods.”
• Wikipedia defines Linked Data as "a term used to
describe a recommended best practice for exposing,
sharing, and connecting pieces of data, information,
and knowledge on the Semantic Web using URIs and
RDF."

http://linkeddata.org/home

Subject > Predicate < Object
Jane Austen
‘is the author of’
Pride and Prejudice

http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf
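
The subject, predicate, object pattern can be sketched as a list of three-part statements plus a pattern match. This is a minimal illustration and not an RDF library; the `ex:` identifiers, the Triple type and the match function are hypothetical names invented for this example.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical identifiers with an invented "ex:" prefix.
triples = [
    Triple("ex:JaneAusten", "ex:isAuthorOf", "ex:PrideAndPrejudice"),
    Triple("ex:JaneAusten", "ex:isAuthorOf", "ex:Emma"),
    Triple("ex:PrideAndPrejudice", "ex:hasTitle", "Pride and Prejudice"),
]


def match(store, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "What did Jane Austen write?" is a pattern with the object left open.
works = [t.obj for t in match(triples, subject="ex:JaneAusten",
                              predicate="ex:isAuthorOf")]
print(works)
```

Leaving a different position open answers a different question: fixing the object and leaving the subject open asks "who wrote Pride and Prejudice?".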

Ontologies
“An ontology is a formal specification of a shared conceptualization”
Tom Gruber
http://en.wikipedia.org/wiki/Tom_Gruber
http://viaf.org/viaf/72955884/

http://www.slideshare.net/mdaquin/sssw13-ldtut

Ontologies
e.g.
Virtual International Authority File – VIAF – maintained by OCLC
Friend of a Friend – FOAF http://www.foaf-project.org/

http://oclc.org/developer/documentation/virtual-international-authority-file-viaf/viaf-rdf-example

Ontologies
http://viaf.org/viaf/102333412/#foaf:Person

Ontologies

http://dbpedia.org/About

http://lov.okfn.org/dataset/lov/

Linked data ‘cloud’

Richard Cyganiak and Anja Jentzsch

http://lod-cloud.net/

Why is this of interest?

Lorcan Dempsey, OCLC
http://www.slideshare.net/lisld/the-inside-out-library

Quoted by Lorcan Dempsey, “Inside Out library: Scale, Learning and Engagement”
http://www.slideshare.net/lisld/the-inside-out-library

“The change that libraries will need to make …
must include the transformation of the library’s
public catalog from a stand-alone database of
bibliographic records to a highly hyperlinked data
set that can interact with information resources on
the World Wide Web.”
Karen Coyle
Understanding the semantic web
http://www.alatechsource.org/library-technologyreports/understanding-the-semantic-webbibliographic-data-and-metadata

Search is a major “pain point” for students and staff
Students
‘The library is very expansive which is great but
you can never find what you need. They need
to redo the system make it easier.’
NSS comment

Staff
‘I would be more likely to explore existing non-current learning materials if there were a
better way of finding them.’
STELLAR survey comment

What are libraries doing?

http://lodlam.net/

http://datahub.io/group/lld

http://www.w3.org/2005/Incubator/lld/

at the OU Library
• Library catalogue
• Archival material
• Old course materials in the University Archive

University Archive
• OU study materials – print and audio-visual
• Historical materials – photographs, oral history
• Papers of OU people
http://www.open.ac.uk/library/library-resources/the-openuniversity-archive

Range of learning resource types

The OU Digital Library (OUDL)
• FEDORA: Flexible Extensible Digital Object Repository Architecture
• Open source, created and supported by the digital preservation community
• Purpose-designed
• Supports international metadata standards: PREMIS – METS – MODS – EAD – DC – OAI
• Supports Linked Data natively (Mulgara triplestore)

The STELLAR project
• Semantic Technologies Enhancing the Lifecycle of Learning
Resources
• OU Library Services/OU Knowledge Media Institute
• Experiment with semantic technologies in a digital library
environment … and to consider the sustainability implications
of using semantic technologies.
• Jisc-funded 2012-2013
• Jisc Digital Infrastructure programme – Sustainability of digital
content

STELLAR project aims
Taking collections preserved in the OUDL, the STELLAR project was
established to:
• Develop a detailed understanding of the value of legacy learning materials
as perceived by academic staff and other key stakeholders
• Experiment with the use of semantic technologies in a digital library
environment to ascertain the extent to which the perceived value of these
materials might be enhanced and to consider the sustainability
implications of using semantic technologies.
• Inform the development of digital libraries of learning resources by
contributing to the evidence base for their effectiveness
• Increase the return on investment of learning materials by
developing an evidence based model for lifecycle
management

The STELLAR project
• Project approach:
• Create a baseline of perceptions of the value of the collection
• Carry out an enhancement of the collection
• Assess the impact of that enhancement on perceptions of value

Initial survey into value
• 89.2% of respondents (501) agreed or strongly agreed
with the statement that maintaining an archive of
non-current OU learning materials is important to the
reputation of the OU.
• 75.9% of respondents thought that this should be
maintained in perpetuity.
• 90.16% of respondents (504) agreed or strongly
agreed that non-current learning materials are
important to the context of the history of higher
education.
• 91.75% of those respondents who were involved in
module production (356) agreed or strongly agreed
that when producing new OU learning material, I am
likely to look to previous material, whether for
inspiration or for potential reuse.
“We are the world leaders in distance learning, so our curriculum designs
are much admired and so are our materials. It would be remiss of us not to
treat them as potential objects of scholarship themselves”.

Capturing perceptions
Using a balanced scorecard approach we conducted a benchmarking survey of academic staff
and stakeholders to investigate the value they place on non-current learning materials
Personal and professional perspectives of value

I would be disappointed if the OU learning
materials that I helped to produce were not kept
I keep my own copies of the OU learning
materials that I am involved in producing
I would be pleased if others chose to reuse or
reversion the OU learning materials that I have
helped to produce

Value to internal processes and cultures
I keep my own copies of the OU learning
materials that I am involved in producing
When producing new OU learning material, I am
likely to look to previous material, whether for
inspiration or for potential reuse
I would be more likely to explore existing non-current learning materials if there were a better
way of finding them.

Value to HE and academic communities

Maintaining an archive of non-current OU learning
materials is important to the reputation of the OU
I think the non-current OU learning materials are
important in the context of the history of higher
education
I think the non-current OU learning materials are
important in showing how the OU taught at
particular times in history

Financial / bottom line perspectives of value
I think that there is a monetary value to non-current
OU learning materials
The OU could make savings if more learning material
were reused

http://www.gla.ac.uk/services/library/espida/

Module Information
A metadata module record was created which connects the complicated web of content and metadata associated with each module

STELLAR allowed us to link the metadata for all this
module content, making it more discoverable & reusable

Basic linked data model
(for data.open.ac.uk and to comply with current module descriptions)

[Diagram: module doau:a103 and its linked resources]

doau:a103
  dc:title | rdfs:label | courseware:has-title: “An Introduction to the Humanities”
  rdf:type: courseware:Course | mlo:LearningOpportunitySpecification | aiiso:Module | xcri:course
  courseware:istaught-present: “false”
  aiiso#code: “A103”
  dc:subject: jacs:V900 | doau-topic:artsand-humanities
  dc:isVersionOf: doau:a102
  courseware:has-courseware: daou-library:339347

doau:a102
  dc:isVersionOf: doau:a101

daou-library:339347
  dc:title: “An introduction to the humanities : resource book 2”
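
One use of a model like this is to follow dc:isVersionOf links from a module back through its predecessors. The sketch below assumes one reading of the diagram, in which doau:a103 is a version of doau:a102, which is in turn a version of doau:a101. The direction of those links is an assumption, and the function earlier_versions is a hypothetical helper, not part of data.open.ac.uk.

```python
# Statements taken from the diagram under the assumed reading described above.
MODEL = [
    ("doau:a103", "dc:title", "An Introduction to the Humanities"),
    ("doau:a103", "aiiso#code", "A103"),
    ("doau:a103", "dc:isVersionOf", "doau:a102"),
    ("doau:a102", "dc:isVersionOf", "doau:a101"),
]


def earlier_versions(triples, start):
    """Follow dc:isVersionOf links from start and list each predecessor in order."""
    version_of = {s: o for s, p, o in triples if p == "dc:isVersionOf"}
    chain = []
    current = start
    # Stop when there is no further link, or when a link would revisit a node.
    while current in version_of and version_of[current] not in chain:
        current = version_of[current]
        chain.append(current)
    return chain


history = earlier_versions(MODEL, "doau:a103")
print(history)
```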

Application of Linked Data
• Text entered into the tool is passed through a semantic meaning
engine, and its concepts are matched against the concepts
contained within the digital library dataset.

• A selection of the closest matches is then displayed. These link
through to the object in the Fedora digital library.
• The semantic web tool analyses
the meaning of the words entered and
finds related material.
• The tool can also show related
material from other
datasets on data.open.ac.uk.
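
The matching step described above can be sketched in a few lines. The slides do not describe the engine itself, so this example substitutes a hand-made keyword table and Jaccard overlap for it. CONCEPTS, LIBRARY, the concept and object identifiers, and all function names are hypothetical.

```python
# Hypothetical keyword -> concept table standing in for the semantic meaning engine.
CONCEPTS = {
    "austen": "concept:english-literature",
    "novel": "concept:english-literature",
    "painting": "concept:art-history",
    "renaissance": "concept:art-history",
    "philosophy": "concept:philosophy",
}

# Hypothetical digital library objects, each tagged with a set of concepts.
LIBRARY = {
    "object:1": {"concept:english-literature", "concept:philosophy"},
    "object:2": {"concept:art-history"},
    "object:3": {"concept:art-history", "concept:english-literature"},
}


def extract_concepts(text):
    """Map the words of the entered text onto known concepts."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {CONCEPTS[w] for w in words if w in CONCEPTS}


def jaccard(a, b):
    """Size of the overlap divided by size of the union; 0.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_matches(text, library, limit=2):
    """Return up to limit object ids, best concept overlap first."""
    query = extract_concepts(text)
    scored = [(jaccard(query, concepts), obj_id)
              for obj_id, concepts in library.items()]
    scored = [(score, obj_id) for score, obj_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [obj_id for _, obj_id in scored[:limit]]


matches = closest_matches("Renaissance painting and the novel", LIBRARY)
print(matches)
```

The query shares no words with the object identifiers. The match is made on shared concepts, which is how such a tool can surface material that a keyword search would not find.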

http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/stellar2.mp4

Directly access digitised
content stored in the OUDL
Materials include those
originally in print, audio and
video formats

Links to the extensive metadata about the
course or element of the course, held on a
data.open.ac.uk page

Architecture of the STELLAR tool

Try the technology
• http://discou.info/alfa/

Headline findings
• A consistently positive reaction to the enhanced collections. In every area
the majority of respondents agreed or strongly agreed that the enhanced
materials had value
• There were two dimensions where the evaluation indicates that the transformation of
the materials has increased their perceived value:
• value to internal processes & culture
• financial/bottom line value
• Participants also made several comments regarding which materials should
be preserved & enhanced
• Read the full report on the blog:
http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf

Value to internal processes & culture
• 89% of respondents agreed or strongly agreed that they would be more likely
to explore existing materials if they knew they had been enhanced

• 94% agreed or strongly agreed that such enhancement makes content easier to
reuse or refer to for inspiration during module production
• When thinking about existing systems, 94% also agreed or strongly agreed that
the semantic analysis they had seen suggested material which they would not
have found using a traditional search
• 78% of respondents agreed or strongly agreed that enhanced materials are
more likely to be referred to during module production than those preserved
in existing OU systems

Financial / bottom line value
• Improving the discoverability and reusability of the materials appears to
have increased the perceived financial value of the materials
• In the pre-enhancement survey 75.9% of respondents agreed that the OU
could make savings if more learning material were reused
• Following the enhancement, an increased 83% agreed or strongly agreed
that the OU could make cost savings if existing materials were enhanced
to make them more discoverable
“It will be helpful to know what kind of support
and budget is available to make more old course
resources available. This will help reducing costly
budgets for new modules in production.”

Value of semantic search
Stakeholder views of semantic search
• ‘More likely to use material’ – 89% agreed/strongly agreed
• ‘Content easier to reuse’ – 94% agreed/strongly agreed
• ‘Found material that traditional search wouldn’t’ – 94% agreed/strongly agreed
[Bar chart: ‘Cost-savings could be made if material re-used’, before and after enhancement; horizontal axis from 72% to 84%]

http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf

Key findings
• Significant effort required to improve the metadata
• To make best use of the Linked Data, it was beneficial to digitise and
preserve all course materials for the selected courses
• Trade-off between value of extra content digitised and the
cost of cataloguing
• Once you’ve built it into your system you can automatically generate linked
data for new content of that type
• Stakeholders can see the value of this type of search

Follow-up work to STELLAR
• Linked Data embedded into OU Digital Library
• Used to link to related iTunesU and OpenLearn material

STELLAR
• STELLAR blog www.open.ac.uk/blogs/stellar
• Final report http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/09/STELLAR-JISC-Final-Report.pdf
• Final report in Jorum http://hdl.handle.net/10949/18379
