Semantic web technologies and digital library search

Presentation on semantic web technologies and digital library search, covering linked data basics and the STELLAR project work.

  1. 1. Semantic web and search Richard Nurse Open University Library Services
  2. 2. Outline • Background • Basics of semantic web technologies • Relevance to libraries and search • STELLAR search project
  3. 3. Open University • UK distance learning university • 200,000+ students • Undergraduate/Postgraduate/Research • Online learning supported by course materials & local tutors • Milton Keynes campus and regional/national offices BUT… most students never visit the main campus
  4. 4. Library Services • 24/7 helpdesk • Online library resources • Online help sessions • Links to library resources and skills activities embedded in VLE • Discovery platform, website resource lists • Librarians work with academics to build new courses
  5. 5. Library Services • Cross-university Information Management services • Institutional Repository ORO http://oro.open.ac.uk/ • Research Data Management Project • Data retention and records management • University Archive • Metadata expertise
  6. 6. Library Services • Innovation projects http://www.open.ac.uk/blogs/macon/ http://www.open.ac.uk/blogs/RISE/ http://www.open.ac.uk/blogs/telstar/
  7. 7. Library Services • Innovation and development • OU Knowledge Media Institute and others • Semantic web • Video search http://www.open.ac.uk/blogs/AVA/ http://projects.kmi.open.ac.uk/reflex/index.xml http://kmi.open.ac.uk/projects/name/lucero
  8. 8. Search
  9. 9. Search “It’s always so hit-and-miss… I used to sit there for hours and just not find anything. There were thousands and thousands of bits of material but no way of drilling down to find what I really needed. My manager needed to know, by tomorrow, whether there was something we could use or not and I didn’t know the answer, so had to say no.”
  10. 10. Search • Terms • Boolean logic – AND, OR, NOT • Operators: - (exclude a term), site: (restrict to one site), “ ” (exact phrase)
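
To make the Boolean logic on this slide concrete, here is a minimal sketch of how AND, OR and NOT map onto set operations over a tiny inverted index. The document titles and the helper names (`build_index`, `term`) are invented for illustration and do not come from the presentation or from any real search engine.

```python
# Hypothetical mini-collection; the titles are invented for illustration.
DOCS = {
    1: "introduction to the humanities",
    2: "semantic web and linked data",
    3: "linked data in the humanities",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def term(word):
    """Return the ids of documents containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())


# AND: documents that contain both terms (set intersection)
both = term("linked") & term("humanities")
# OR: documents that contain either term (set union)
either = term("semantic") | term("humanities")
# NOT: documents with the first term but without the second (set difference)
without = term("linked") - term("semantic")
```

This kind of matching works purely on strings, which is the limitation the "things not strings" slides go on to address.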
  11. 11. Search http://www.flickr.com/photos/niallkennedy/
  12. 12. Search http://www.flickr.com/photos/dullhunk/
  13. 13. Search • ‘things not strings’ http://googleblog.blogspot.co.uk/2012/05/introducing-knowledge-graph-things-not.html
  14. 14. Search Google’s Knowledge Graph
  15. 15. Semantic web Definition: "The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation." Tim Berners-Lee, James Hendler, and Ora Lassila, “The Semantic Web”, Scientific American, 2001 http://www.sciam.com/article.cfm?id=the-semantic-web http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf
  16. 16. Semantic web basics • ‘web of meaning’ • ‘web of data’ http://www.w3.org/2001/sw/ http://semanticweb.org/wiki/Main_Page http://www.slideshare.net/fadirra/semanticweb-intro-040411
  17. 17. Semantic web basics • URIs • Linked data • Ontologies • but also…
  18. 18. Semantic web basics • URIs – Uniform Resource Identifiers http://en.wikipedia.org/wiki/Uniform_resource_identifier http://www.slideshare.net/mdaquin/sssw13-ldtut
  19. 19. Linked data • “Linked Data is about using the Web to connect related data that wasn't previously linked, or using the Web to lower the barriers to linking data currently linked using other methods.” • Wikipedia defines Linked Data as "a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF." http://linkeddata.org/home
  20. 20. Subject → Predicate → Object: Jane Austen ‘is the author of’ Pride and Prejudice http://www.nature.com/scientificamerican/journal/v284/n5/pdf/scientificamerican0501-34.pdf
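
The subject, predicate, object statement on this slide can be sketched as a small data structure. This is only an illustration in plain Python: the `example.org` URIs and the `isAuthorOf` predicate are placeholders made up for the example, not identifiers from any real vocabulary, and the serialiser covers only the simple case where all three parts are URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Placeholder namespace (assumption); real data would use published URIs.
EX = "http://example.org/"

austen = Triple(
    subject=EX + "Jane_Austen",
    predicate=EX + "isAuthorOf",
    obj=EX + "Pride_and_Prejudice",
)


def to_ntriples(triple):
    """Serialise a URI-only triple as one N-Triples-style line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


line = to_ntriples(austen)
```

Because each part is a URI rather than a text string, another dataset that uses the same URI for Jane Austen can be joined to this statement without any string matching.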
  21. 21. Ontologies “An ontology is a formal specification of a shared conceptualization” Tom Gruber http://en.wikipedia.org/wiki/Tom_Gruber http://viaf.org/viaf/72955884/ http://www.slideshare.net/mdaquin/sssw13-ldtut
  22. 22. Ontologies e.g. • Virtual International Authority File (VIAF), maintained by OCLC • Friend of a Friend (FOAF) http://www.foaf-project.org/ http://oclc.org/developer/documentation/virtual-international-authority-file-viaf/viaf-rdf-example
  23. 23. Ontologies http://viaf.org/viaf/102333412/#foaf:Person
  24. 24. Ontologies http://dbpedia.org/About http://lov.okfn.org/dataset/lov/
  25. 25. Linked data ‘cloud’ Richard Cyganiak and Anja Jentzsch http://lod-cloud.net/
  26. 26. Why is this of interest? Lorcan Dempsey OCLC http://www.slideshare.net/lisld/the-inside-out-library
  27. 27. Why is this of interest? Quoted by Lorcan Dempsey “Inside Out library: Scale, Learning and Engagement” http://www.slideshare.net/lisld/the-inside-out-library
  28. 28. Why is this of interest? “The change that libraries will need to make … must include the transformation of the library’s public catalog from a stand-alone database of bibliographic records to a highly hyperlinked data set that can interact with information resources on the World Wide Web.” Karen Coyle Understanding the semantic web http://www.alatechsource.org/library-technologyreports/understanding-the-semantic-webbibliographic-data-and-metadata
  29. 29. Why is this of interest? Search is a major “pain point” for students and staff • Students: ‘The library is very expansive which is great but you can never find what you need. They need to redo the system make it easier.’ (NSS comment) • Staff: ‘I would be more likely to explore existing non-current learning materials if there were a better way of finding them.’ (STELLAR survey comment)
  30. 30. What are libraries doing? http://lodlam.net/ http://datahub.io/group/lld http://www.w3.org/2005/Incubator/lld/
  31. 31. at the OU Library • Library catalogue • Archival material • Old course materials in the University Archive
  32. 32. University Archive • OU study materials – print and audio-visual • Historical materials – photographs, oral history • Papers of OU people http://www.open.ac.uk/library/library-resources/the-openuniversity-archive
  33. 33. Range of learning resource types
  34. 34. Range of learning resource types
  35. 35. The OU Digital Library (OUDL) FEDORA: Flexible Extensible Digital Object Repository Architecture • Open source, created by and supported by the digital preservation community • Purpose-designed • Supports international metadata standards: PREMIS – METS – MODS – EAD – DC – OAI • Supports Linked Data natively • Mulgara triplestore
  36. 36. The STELLAR project • Semantic Technologies Enhancing the Lifecycle of Learning Resources • OU Library Services/OU Knowledge Media Institute • Experiment with semantic technologies in a digital library environment … and to consider the sustainability implications of using semantic technologies. • Jisc-funded 2012-2013 • Jisc Digital Infrastructure programme – Sustainability of digital content
  37. 37. STELLAR project aims Taking collections preserved in the OUDL, the STELLAR project was established to: • Develop a detailed understanding of the value of legacy learning materials as perceived by academic staff and other key stakeholders • Experiment with the use of semantic technologies in a digital library environment to ascertain the extent to which the perceived value of these materials might be enhanced and to consider the sustainability implications of using semantic technologies • Inform the development of digital libraries of learning resources by contributing to the evidence base for their effectiveness • Increase the return on investment of learning materials by developing an evidence-based model for lifecycle management
  38. 38. The STELLAR project • Project approach • Create a baseline of perceptions of the value of the collection • Carry out an enhancement of the collection • Assess the impact of that enhancement on perceptions of value
  39. 39. Initial survey into value • 89.2% of respondents (501) agreed or strongly agreed with the statement that maintaining an archive of non-current OU learning materials is important to the reputation of the OU. • 75.9% of respondents thought that this should be maintained in perpetuity. • 90.16% of respondents (504) agreed or strongly agreed that non-current learning materials are important to the context of the history of higher education. • 91.75% of those respondents who were involved in module production (356) agreed or strongly agreed that when producing new OU learning material, I am likely to look to previous material, whether for inspiration or for potential reuse. “We are the world leaders in distance learning, so our curriculum designs are much admired and so are our materials. It would be remiss of us not to treat them as potential objects of scholarship themselves”.
  40. 40. Capturing perceptions Using a balanced scorecard approach, we conducted a benchmarking survey of academic staff and stakeholders to investigate the value they place on non-current learning materials. Personal and professional perspectives of value: • I would be disappointed if the OU learning materials that I helped to produce were not kept • I keep my own copies of the OU learning materials that I am involved in producing • I would be pleased if others chose to reuse or reversion the OU learning materials that I have helped to produce. Value to internal processes and cultures: • I keep my own copies of the OU learning materials that I am involved in producing • When producing new OU learning material, I am likely to look to previous material, whether for inspiration or for potential reuse • I would be more likely to explore existing non-current learning materials if there were a better way of finding them. Value to HE and academic communities: • Maintaining an archive of non-current OU learning materials is important to the reputation of the OU • I think the non-current OU learning materials are important in the context of the history of higher education • I think the non-current OU learning materials are important in showing how the OU taught at particular times in history. Financial / bottom line perspectives of value: • I think that there is a monetary value to non-current OU learning materials • The OU could make savings if more learning material were reused. http://www.gla.ac.uk/services/library/espida/
  41. 41. Module Information • A metadata module record was created which connects the complicated web of content and metadata associated with each module • STELLAR allowed us to link the metadata for all this module content, making it more discoverable & reusable
  42. 42. Basic linked data model (for data.open.ac.uk and to comply with current module descriptions) [Diagram; node and edge labels as extracted:] doau:a103 dc:title | rdfs:label | courseware:has-title rdf:type courseware:istaught-present courseware:has-courseware daou-library:339347 dc:title dc:isVersionOf aiiso#code “An Introduction to the Humanities” courseware:Course | mlo:LearningOpportunitySpecification | aiiso:Module | xcri:course “false” dc:subject “A103” doau:a102 dc:isVersionOf doau:a101 “An introduction to the humanities : resource book 2” jacs:V900 | doau-topic:artsand-humanities
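
One way to read the model on this slide is as a small set of triples that can be traversed. The sketch below is an interpretation, not the project's data: the slide's diagram was flattened during extraction, so which label attaches to which node is an assumption, and only a subset of the labels is used. Identifiers are copied as they appear on the slide (including the `daou-library` spelling); the helper names are hypothetical.

```python
# Assumed reading of the flattened diagram: (subject, predicate, object).
TRIPLES = [
    ("doau:a103", "dc:title", "An Introduction to the Humanities"),
    ("doau:a103", "aiiso#code", "A103"),
    ("doau:a103", "dc:isVersionOf", "doau:a102"),
    ("doau:a102", "dc:isVersionOf", "doau:a101"),
    ("doau:a103", "courseware:has-courseware", "daou-library:339347"),
    ("daou-library:339347", "dc:title",
     "An introduction to the humanities : resource book 2"),
]


def objects(subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def version_chain(module):
    """Follow dc:isVersionOf links from a module back to earlier versions."""
    chain = [module]
    while True:
        earlier = objects(chain[-1], "dc:isVersionOf")
        if not earlier or earlier[0] in chain:  # stop at the end or on a cycle
            return chain
        chain.append(earlier[0])


def courseware_titles(module):
    """Titles of the courseware items linked to a module."""
    titles = []
    for item in objects(module, "courseware:has-courseware"):
        titles.extend(objects(item, "dc:title"))
    return titles


history = version_chain("doau:a103")
titles = courseware_titles("doau:a103")
```

Under this reading, a single module record leads both to its predecessors and to the digitised items held for it, which is the kind of linking the Module Information slide describes.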
  43. 43. Relationship model
  44. 44. Fedora record course
  45. 45. Fedora record
  46. 46. Application of Linked Data • Text entered into the tool is passed through a semantic meaning engine and concepts are matched against the concepts contained within the digital library dataset • A selection of the closest matches is then displayed; these link through to the object in the Fedora digital library • The semantic web tool analyses the meaning of those words and finds related material • The tool can also show related material from other datasets from data.open.ac.uk
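
The matching steps on this slide can be sketched roughly as follows. The slides do not describe how the semantic meaning engine works, so this stand-in uses a hand-made keyword-to-concept lookup and Jaccard overlap to rank objects; the lexicon, the object ids and all function names are hypothetical.

```python
# Hypothetical keyword -> concept lookup standing in for the semantic engine.
CONCEPT_LEXICON = {
    "novel": "concept:literature",
    "poetry": "concept:literature",
    "painting": "concept:art",
    "sculpture": "concept:art",
    "symphony": "concept:music",
}

# Hypothetical digital library objects and the concepts attached to them.
LIBRARY = {
    "obj:1": {"concept:literature", "concept:art"},
    "obj:2": {"concept:music"},
    "obj:3": {"concept:art"},
}


def extract_concepts(text):
    """Map the words of the entered text to concepts via the lexicon."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return {CONCEPT_LEXICON[w] for w in words if w in CONCEPT_LEXICON}


def jaccard(a, b):
    """Overlap between two concept sets, from 0.0 (none) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def closest_matches(text, limit=2):
    """Rank library objects by concept overlap with the entered text."""
    query = extract_concepts(text)
    scored = [(jaccard(query, concepts), oid) for oid, concepts in LIBRARY.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [oid for _, oid in ranked[:limit]]


matches = closest_matches("A painting and a novel")
```

The point of matching on concepts rather than words is that "painting" can surface an object catalogued under art even if the word itself never appears in its metadata.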
  47. 47. http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/stellar2.mp4
  48. 48. • Directly access digitised content stored in the OUDL • Materials include those originally in print, audio and video formats • Links to the extensive metadata about the course or element of the course, held on a data.open.ac.uk page
  49. 49. data.open.ac.uk
  50. 50. Architecture of the STELLAR tool
  51. 51. Try the technology • http://discou.info/alfa/
  52. 52. Headline findings
  53. 53. Headline findings • A consistently positive reaction to the enhanced collections. In every area the majority of respondents agreed or strongly agreed that the enhanced materials had value • There were two dimensions where the evaluation indicates that the transformation of the materials has increased their perceived value: • value to internal processes & culture • financial/bottom line value • Participants also made several comments regarding which materials should be preserved & enhanced • Read the full report on the blog: http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf
  54. 54. Value to internal processes & culture • 89% of respondents agreed or strongly agreed that they would be more likely to explore existing materials if they knew they had been enhanced • 94% agreed or strongly agreed that such enhancement makes content easier to reuse or refer to for inspiration during module production • When thinking about existing systems, 94% also agreed or strongly agreed that the semantic analysis they had seen suggested material which they would not have found using a traditional search • 78% of respondents agreed or strongly agreed that enhanced materials are more likely to be referred to during module production than those preserved in existing OU systems
  55. 55. Financial / bottom line value • Improving the discoverability and reusability of the materials appears to have increased the perceived financial value of the materials • In the pre-enhancement survey 75.9% of respondents agreed that the OU could make savings if more learning material were reused • Following the enhancement, an increased 83% agreed or strongly agreed that the OU could make cost savings if existing materials were enhanced to make them more discoverable “It will be helpful to know what kind of support and budget is available to make more old course resources available. This will help reducing costly budgets for new modules in production.”
  56. 56. Value of semantic search Stakeholder views of semantic search • ‘More likely to use material’ – 89% agreed/strongly agreed • ‘Content easier to reuse’ – 94% agreed/strongly agreed • ‘Found material that traditional search wouldn’t’ – 94% agreed/strongly agreed [Bar chart: ‘Cost-savings could be made if material re-used’, Before vs After; axis from 72% to 84%] http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/07/STELLAR-Post-Enhancement-Survey-Report.pdf
  57. 57. Key findings • Significant effort required to improve the metadata • To make best use of the Linked Data, it was beneficial to digitise and preserve all course materials for the selected courses • Trade-off between value of extra content digitised and the cost of cataloguing • Once you’ve built it into your system you can automatically generate linked data for new content of that type • Stakeholders can see the value of this type of search
  58. 58. Follow-up work to STELLAR • Linked Data embedded into OU Digital Library • Used to link to related iTunesU and OpenLearn material
  59. 59. STELLAR • STELLAR blog www.open.ac.uk/blogs/stellar • Final report http://www.open.ac.uk/blogs/stellar/wp-content/uploads/2013/09/STELLAR-JISC-Final-Report.pdf • Final report in Jorum http://hdl.handle.net/10949/18379
  60. 60. Questions?
