Linked Data at The Open University: From Technical Challenges to Organizational Innovation
Mathieu d’Aquin (@mdaquin), Knowledge Media Institute
Stuart Brown (@stuartbrown), Communication Services
The Open University
What are we doing in the industry track?
Knowledge Media Institute: a leading research center on semantic web technologies:
– Ontology engineering, ontology discovery
– Knowledge representation, reasoning, problem solving
– Interoperability, services, ontology matching, data linking
– 80 researchers/research assistants/PhD students/academic-related staff
– 100s of publications
And so?
KMi is a department of the Open University.
The Open University:
– The largest university in the UK: 250K students per year, 8000 associate lecturers, a big campus in Milton Keynes
– Created in 1969
– Almost entirely open and distance learning
– 13 regional centers, more national centers, courses available in a large number of countries
Big organization = crazy information infrastructure
data.open.ac.uk
The first linked data platform providing open information from across a whole university
Lots of (types of) data
– Course information: 580 modules / description of the course, information about the levels and number of credits associated with it, topics, and conditions of enrolment.
– Research publications: 16,000 academic articles / information about authors, dates, abstract and venue of the publication.
– Podcasts: 2220 video podcasts and 1500 audio podcasts / short description, topics, link to a representative image and to a transcript if available, information about the course the podcast might relate to, and license information regarding the content of the podcast.
– Open Educational Resources: 640 OpenLearn units / short description, topics, tags used to annotate the resource, its language, the course it might relate to, and the license that applies to the content.
– YouTube videos: 900 videos / short description of the video, tags that were used to annotate the video, collection it might be part of, and link to the related course if relevant.
– University buildings: 100 buildings / address, a picture of the building, and the sub-divisions of the building into floors and spaces.
– Library catalogue: 12,000 books / topics, authors, publisher and ISBN, as well as the related course.
– Others…
[Architecture diagram: the data.open.ac.uk processing pipeline]
Stages: Collect, Extract, Link, Store, Expose.
Labels shown: Scheduler; Planning + Logging; RSS feed; RSS Updater (new items, obsolete items); XML Updater; Extractor; Ontologies; URI creation rules; RDF file (add); RDF file (delete); RDF Cleaner; Cleaning rules; Entity Name System; URL redirection rules; Delete (1); Add (2); SPARQL endpoint; Web Server; sources “ORO, podcast” and “Lib, courses, loc”.
Legend: Generic process / Dataset specific process.
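The updater in the diagram separates new items from obsolete ones, which then lead to RDF being added or deleted. A minimal sketch of that comparison step, assuming each feed item carries a stable identifier; the function name and the item ids are hypothetical and are not taken from the actual platform:

```python
def diff_feed(previous_ids, current_ids):
    """Compare two snapshots of a feed and report what changed.

    Returns (new_items, obsolete_items): ids whose data should be
    extracted and added, and ids whose data should be deleted.
    """
    previous, current = set(previous_ids), set(current_ids)
    new_items = sorted(current - previous)
    obsolete_items = sorted(previous - current)
    return new_items, obsolete_items


# Hypothetical item ids from two successive scheduled runs.
yesterday = ["podcast/1", "podcast/2", "podcast/3"]
today = ["podcast/2", "podcast/3", "podcast/4"]

new_items, obsolete_items = diff_feed(yesterday, today)
print("delete:", obsolete_items)
print("add:", new_items)
```

The diagram numbers the delete step (1) before the add step (2), so the sketch prints them in that order.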
First issue: convincing people
Not the technical bit… that’s easy.
data.open.ac.uk is now a core infrastructure element of the Open University. But it took a lot of talking…
[Workflow diagram involving the LD Team and the Data Owner; one callout reads “Where most of the work is done”]
Initial meeting with Data Owner:
– Identify data
– Get sample data
– Identify copyright issues
– Identify possible links
– Identify users and usage
Data modeling sessions:
– Find reusable ontologies
– Map onto the data
– Identify uncovered parts
Data modeling validation
URI creation rules definition:
– Define URI scheme
Development of extractor
Deployment
What works
Not changing the way people work:
– Pull data from feeds exported by the original systems, without changing those systems or introducing any additional difficulty.
– Data taken “as is”, with an effort to understand its original modeling.
Bring the user along, add value:
– Data re-modeling as linked data creates positive side effects on the original data
– Talk about possible links and usages first
– Improve the usability of the data produced by the institution = make the work of people more visible and useful
So, what is it that we can do?
Oh no! We can’t find a killer app!
“Small things” that either were impossible before, or are now trivial to do.
Ben, in the corridor: Hey! Your linked data thing, can it tell me which podcasts are attached to courses that we no longer offer?
Time difference in answering this question:
– Before: x weeks, not really feasible
– Now: 5 minutes, easy
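Ben's question boils down to a short join between podcasts and courses. The sketch below evaluates it over a handful of made-up triples in plain Python; the property names (relatesToCourse, isOffered) and all identifiers are hypothetical and do not reflect the vocabulary actually used on data.open.ac.uk, where this would be written as a SPARQL query.

```python
# Toy triples (subject, predicate, object); all names are hypothetical.
triples = [
    ("podcast/p1", "relatesToCourse", "course/A100"),
    ("podcast/p2", "relatesToCourse", "course/B200"),
    ("podcast/p3", "relatesToCourse", "course/A100"),
    ("course/A100", "isOffered", False),
    ("course/B200", "isOffered", True),
]


def podcasts_for_discontinued_courses(triples):
    """Return (podcast, course) pairs where the course is no longer offered."""
    discontinued = {
        s for s, p, o in triples if p == "isOffered" and o is False
    }
    return sorted(
        (s, o)
        for s, p, o in triples
        if p == "relatesToCourse" and o in discontinued
    )


for podcast, course in podcasts_for_discontinued_courses(triples):
    print(podcast, "->", course)
```

The point of the slide holds in the sketch: once podcasts and courses live in one graph, the question is a two-pattern join rather than a cross-system investigation.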
Simple works… A lot of “simple” works! Social Resource Discovery Research Exploration
Example: map of buildings
– Interactive map of Open University buildings in the UK
– Built in 1 hour
– Connected to Ordnance Survey for location based on postcodes
– Allowed us to find out about issues in the data
Example: Connecting our resources
Show the courses and podcasts that connect to a piece of open educational resources.
Trivial with data.open.ac.uk
Impossible to do before (!)
Simple things as examples: Inspire
The simple apps above do not demonstrate particularly impressive technical achievements: they are here to show what can be done (easily).
Study at the OU mobile application (Communication and student services)
Supporting Research Evaluation (Research School)
Discovery of open educational resources (Open Media Unit and KMi)
So the technology is mature enough after all?
No!
Providing an open SPARQL endpoint is a very bad idea: 1 query can kill everything (and it does… often).
Our approach: leave things open, fix when it breaks.
– Example: mirror triple stores updated in parallel
– Example: simple cache based on serving static files for the most popular URIs/queries. The cache is updated with the data.
Keep the standard/open/application-independent interface: free and easy reuse helps innovation, an API is an obstacle.
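The static-file cache mentioned above can be sketched in a few lines. This is an illustration under assumptions, not the platform's implementation: the class name, the file layout and the stand-in store are all hypothetical, the sketch caches every query rather than only the most popular ones, and it simply clears the cache when the data changes, which is a simplification of "updated with the data".

```python
import hashlib
import tempfile
from pathlib import Path


class StaticFileCache:
    """Cache query results as static files, keyed by a hash of the query."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, query):
        digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
        return self.directory / (digest + ".txt")

    def get(self, query, run_query):
        """Return the cached result, running the query only on a miss."""
        path = self._path(query)
        if path.exists():
            return path.read_text(encoding="utf-8")
        result = run_query(query)
        path.write_text(result, encoding="utf-8")
        return result

    def clear(self):
        """Drop all cached files; call this whenever the data is updated."""
        for path in self.directory.glob("*.txt"):
            path.unlink()


# Stand-in for the triple store (hypothetical); records each real hit.
calls = []


def fake_store(query):
    calls.append(query)
    return "result of " + query


cache = StaticFileCache(tempfile.mkdtemp())
cache.get("SELECT * WHERE { ?s ?p ?o } LIMIT 1", fake_store)
cache.get("SELECT * WHERE { ?s ?p ?o } LIMIT 1", fake_store)
print("store hits:", len(calls))
```

Because the cached results are plain files, a web server could serve them directly without touching the triple store, which is what keeps a popular query from becoming the one that kills everything.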
Conclusion
Starting point: showing off our technology, information integration issues, access to open information
Where we got: (open) innovation, competitive advantage, Linked Data as part of the backbone of the University’s information infrastructure, new systems built doing linked data by design
BTW, does anybody have a better word than backbone for linked data based information infrastructure?
Going further…
We are not the only university (anymore): data.southampton.ac.uk, data.ox.ac.uk, data.aalto.fi, data.uni-muenster.de…
Ultimately, the university does not count: moving to “education à la carte”
[Diagram: linked education datasets. Labels: mEducator; Orgs., Buildings, Locations; Research outputs; The Open University; Data.gov.uk education; Learning resources; OrganicEduNet; University of Muenster, DE; University of Bristol; University of Southampton]
What we need: Community effort. LinkedUniversities.org
What we need: A reusable toolset. Marimba4lib.com
What we need: Compelling, global use cases. LinkedUp-Project.eu
Thanks! More info: http://data.open.ac.uk http://lucero-project.org http://linkeduniversities.org http://linkedup-project.eu
What it provides
• Linked Data: URIs resolve with redirects to RDF and HTML, content negotiation
• CC-By license on everything
• SPARQL endpoint (SPARQL 1.1)
• That’s all…
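Resolving a URI with a redirect to either RDF or HTML can be sketched as a function that inspects the Accept header. This is a hedged illustration only: the /data/ and /page/ URL patterns and the example.org URI are hypothetical, and 303 See Other is the status commonly used for this linked data pattern, not something the slide confirms for this platform.

```python
def negotiate(resource_uri, accept_header):
    """Pick a redirect target for a resource URI from the Accept header.

    Returns (status, location). The /data/ and /page/ URL patterns are
    hypothetical, not the platform's real layout.
    """
    # Parse "type;q=0.8" entries into (media_type, quality) pairs.
    preferences = []
    for entry in accept_header.split(","):
        parts = [p.strip() for p in entry.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        preferences.append((media_type, quality))

    def score(media_type):
        return max((q for t, q in preferences if t == media_type), default=0.0)

    name = resource_uri.rstrip("/").rsplit("/", 1)[-1]
    # RDF wins only when the client prefers it; wildcards such as */*
    # are ignored here, so HTML is the default.
    if score("application/rdf+xml") > score("text/html"):
        return 303, "/data/" + name + ".rdf"
    return 303, "/page/" + name + ".html"


# Hypothetical resource URI, requested by an RDF client and by a browser.
print(negotiate("http://example.org/course/a100", "application/rdf+xml"))
print(negotiate("http://example.org/course/a100", "text/html,application/xhtml+xml;q=0.9"))
```

The same URI thus serves both software and people, which is what lets the platform stay "that's all" without a separate custom API.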
Integration
• Big organization = crazy information infrastructure
• Special focus on open/public information:
– Course information
– Open educational resources (OpenLearn)
– Multimedia material: podcast repository, iTunes U, openly licensed YouTube videos
– Open access repository of research publications
Example: Charting our offering Showing basic charts generated from the answers to SPARQL queries The only effort required is coming up with the
Lean-back OU podcast channel on GoogleTV (IT services)
More on using the users…
Obviously: they won’t be convinced by technological blabla.
Show examples of what it can do! (BTW we have a problem here…)
More importantly: ask them what they would like to do. We asked: communication services, library people, student services, marketing services, faculties… (they are very creative)
A typical email from my inbox (inspired by a true story):
Hi Mathieu, Stuart told me that your linked data thing might help with the problem Laura had with Guy’s system. Can we have a chat? Tx, Ben.
The OU’s presence in the media (Media relations)
[Charts: academics in “Arts and Humanities” most often involved with the media (in number of news items); topics most commonly mentioned by news outlets owned by the BBC (in number of news items)]