LUCERO - Building the Open University Web of Linked Data
1. Building the Open University’s Web of Linked Data Mathieu d’Aquin and the LUCERO team @mdaquin Knowledge Media Institute, the Open University LUCERO project lucero-project.info – data.open.ac.uk
2. People Salman Elahi ((Ex)-Dev) Carlo Allocca (Dev) Jane Whild (Admin) Fouad Zablith (Dev) KMi Andriy Nikolov (linking) Enrico Motta (SGP) Mathieu d’Aquin (PD) Arts Suzanne Duncanson-Hunter John Wolfe Paul Lawrence Richard Nurse ((ex-)PM) Owen Stephens (PM) Stuart Brown Com./ Student Comp. Services Data Owners Non Scantlebury Library Specialists Arts Specialists OU Library
3. Linked Data: a set of principles and technologies for a Web of Data. Put the “raw” data online in a standard, web-enabled representation (RDF); make the data Web addressable (URIs); link with other data.
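The three principles on this slide can be illustrated with a minimal sketch that uses only the Python standard library. The course URI and title below are hypothetical and are not taken from data.open.ac.uk; the sketch only shows the shape of the idea: each thing gets a URI, facts about it are RDF triples, and one of those triples links to an external dataset.

```python
# Illustrative only: BASE and the course identifier are made-up examples,
# not actual URIs published at data.open.ac.uk.
BASE = "http://data.open.ac.uk/example/"
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
RDFS_SEEALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"


def uri(term):
    """Wrap an absolute URI in angle brackets, as N-Triples expects."""
    return "<" + term + ">"


def literal(text):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one triple per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


course = uri(BASE + "course/x123")  # principle 2: a Web-addressable identifier
triples = [
    # principle 1: raw data in a standard representation (RDF)
    (course, uri(DCTERMS_TITLE), literal("An example course")),
    # principle 3: a link to another dataset (here, a DBpedia resource)
    (course, uri(RDFS_SEEALSO), uri("http://dbpedia.org/resource/Open_University")),
]

print(to_ntriples(triples))
```

In practice an RDF library would handle serialization; plain tuples are used here only to keep the sketch dependency-free.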
5. So Linked Data for the OU? Currently: OU public data sit in different systems, hard to discover, obtain, integrate by users. Exposed as linked data, our data interlink with each other and the external world: become part of the “global data space” on the Web. Datasets shown: RAE, DBPedia, Data from Research Outputs, OpenLearn Content, ORO, Archive of Course Material, Library’s Catalogue of Digital Content, geonames, data.gov.uk, A/V Material, Podcasts, iTunesU, BBC, DBLP.
6. Why is it important? The OU was the first university to expose its data as linked data (http://data.open.ac.uk). Now widely recognized as a critical step forward for the HE sector in the UK (and worldwide). Favors transparency and reuse of data, both externally and internally. Reduces the cost of dealing with our own public data: integration and reuse by design. Enables new kinds of applications and makes those that are already feasible more cost-effective. At least 3 other UK universities have now followed our example: http://data.online.lincoln.ac.uk/, http://data.ox.ac.uk/, http://data.southampton.ac.uk/ and others in other countries are setting up similar initiatives.
7. The data.open.ac.uk Stack Applications Institutional repository data Research Data (Arts) Organizational infrastructure Technical infrastructure
9. Pipeline stages: Collect, Extract, Link, Store, Expose. Component labels: Scheduler; RSS Updater; XML Updater; ORO, podcast RSS feed; Lib, courses, loc; New items; Obsolete items; RSS Extractor; RDF Extractor; URI creation rules; RDF file (add); RDF file (delete); RDF Cleaner; Cleaning rules; Ontologies; Entity Name System; Delete (1), Add (2); Triple Store; SPARQL endpoint; Index; Search; Web Server; URL redirection rules; Planning + Logging. Legend: generic process; dataset-specific process (each dataset).
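The labels on this slide (new items, obsolete items, RDF file (add), RDF file (delete), Delete (1), Add (2)) suggest a delta-based update of the triple store. The following is a minimal sketch of that idea under stated assumptions: feeds are modelled as hypothetical dictionaries mapping an item id to a title, the store is an in-memory set of triples, and the `http://example.org/item/` namespace is made up. It is not the LUCERO implementation.

```python
TITLE = "http://purl.org/dc/terms/title"


def item_triples(item_id, title):
    """Extract the triples describing one feed item (hypothetical URI scheme)."""
    subject = "http://example.org/item/" + item_id
    return {(subject, TITLE, title)}


def compute_delta(previous, current):
    """Compare two feed snapshots and return (triples to delete, triples to add)."""
    new_ids = current.keys() - previous.keys()
    obsolete_ids = previous.keys() - current.keys()
    to_delete = set()
    for item_id in obsolete_ids:
        to_delete |= item_triples(item_id, previous[item_id])
    to_add = set()
    for item_id in new_ids:
        to_add |= item_triples(item_id, current[item_id])
    return to_delete, to_add


def apply_update(store, to_delete, to_add):
    """Apply the delete set first, then the add set, as numbered on the slide."""
    return (store - to_delete) | to_add


# Two snapshots of a hypothetical feed: item "a" disappears, item "c" appears.
previous = {"a": "First item", "b": "Second item"}
current = {"b": "Second item", "c": "Third item"}

store = set()
for item_id, title in previous.items():
    store |= item_triples(item_id, title)

to_delete, to_add = compute_delta(previous, current)
store = apply_update(store, to_delete, to_add)
```

Only items that appeared or disappeared are touched; an item whose content changed under the same id would need an extra comparison that this sketch leaves out.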
17. Workflow steps: Define URI Scheme; Data Modeling; Validation; Development of Extractor; URI Creation Rules Definition; Deployment. Roles: Lucero Core Team; Lucero members; Data Owner; Lucero KMi Team.
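A URI creation rule, as named in this workflow, can be thought of as a per-dataset template that turns a local identifier into a stable URI. The sketch below is an assumption about what such a rule might look like; the templates, the `mint_uri` helper, and the normalization steps are all hypothetical and are not the rules used by data.open.ac.uk.

```python
from urllib.parse import quote

# Hypothetical templates, one per kind of entity.
URI_RULES = {
    "course": "http://data.open.ac.uk/example/course/{id}",
    "podcast": "http://data.open.ac.uk/example/podcast/{id}",
}


def mint_uri(kind, local_id):
    """Build a URI from a rule: trim, lowercase and percent-encode the local id."""
    template = URI_RULES[kind]
    normalized = local_id.strip().lower()
    return template.format(id=quote(normalized, safe=""))


course_uri = mint_uri("course", " AA100 ")
podcast_uri = mint_uri("podcast", "Intro Talk")
```

Normalizing before minting keeps the same source record from producing two different URIs across extraction runs.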
18. Datasets. Already “officially” in place: ORO: more than 18,000 publications from OU researchers; Podcasts: 2,500 audio and video tracks from podcast.open.ac.uk, linked to the related courses; Study at the OU: more than 600 live module descriptions; OpenLearn: more than 550 units of course material; KMi Staff and Planet newsletter. Currently being processed: OU Buildings in MK and regional centers; Library Catalogue; YouTube channel; Old Courses; “Reading Experience Database” project; People Profiles.
20. Applications. For education: Mobile podcast explorer, podcast explorer on TV; OU Building Map, OU location tracker (cf. foursquare); OU Expert Search; Connecting courses/OpenLearn to relevant podcasts; OU Course Profile Facebook app using list of courses, “Study Buddy” app connecting Facebook users to relevant courses. For research: Display connections in a research community; Research Data/Impact Analysis; Connecting research datasets to external data.
25. Example application: Expert Search using publication information and connecting to contact information within the OU
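The idea behind this application, joining publication data with contact data through shared person identifiers, can be sketched as follows. Everything here is hypothetical: the record layout, the identifiers, the email addresses, and the ranking by publication count are assumptions made for illustration, not a description of how the OU Expert Search works.

```python
# Hypothetical publication records, keyed on shared person identifiers.
publications = [
    {"title": "Paper A", "authors": ["person/1"], "keywords": {"linked data"}},
    {"title": "Paper B", "authors": ["person/1", "person/2"], "keywords": {"linked data", "ontologies"}},
    {"title": "Paper C", "authors": ["person/3"], "keywords": {"data mining"}},
]

# Hypothetical contact dataset using the same person identifiers.
contacts = {
    "person/1": "one@example.org",
    "person/2": "two@example.org",
}


def find_experts(topic, publications, contacts):
    """Rank authors by number of publications on a topic and attach contact info."""
    wanted = topic.lower()
    counts = {}
    for pub in publications:
        if wanted in pub["keywords"]:
            for author in pub["authors"]:
                counts[author] = counts.get(author, 0) + 1
    # Most publications first; ties broken by identifier for a stable order.
    ranked = sorted(counts, key=lambda author: (-counts[author], author))
    return [(author, counts[author], contacts.get(author)) for author in ranked]


experts = find_experts("Linked Data", publications, contacts)
```

The join works only because both datasets identify a person the same way, which is the point of giving each entity a single URI.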
26. Example application: Explore Information about a person in the “Reading Experience Database” based on data provided by DBPedia (Linked Data version of Wikipedia) New ways to look at humanities research data
27. The future (practically). More data… always more data. More links, especially to external entities: BBC, Government agencies, Other universities. More applications: Integration into main OU websites (e.g., study at the OU); Integration into common OU applications (people profile, Facebook course profile, etc.); Support for common OU processes (REF audit, course recommendation, providing resources to AL and lecturers). Sustainability: LUCERO is finishing soon and… data.open.ac.uk is becoming a core component of the OU information infrastructure…
28. The future (more generally) From nice demonstrators to real semantic web applications Use of reasoning and data mining for data consolidation and analysis Need proper frameworks for application developers! Linked data and the Semantic Web to support research Not only research communities Identifying new research questions and collecting evidence through connected datasets It is not about individual Universities! Universities sharing data to benefit students and researchers: the higher education’s web of linked data Needs collective vocabularies, recipes, approaches, classifications… the GoodRelations of higher education?
29. The future (research) Linked data analytics/Linked data mining Interfaces to linked data/Making sense of linked data (with ontologies) Semantic web for activity data/personal data
Usual pitch: - data on the web = every piece of data is web addressable, so data across different places/stores/systems become linkable: the Web = 1 data space