Working with data.open.ac.uk, the Linked Data Platform of the Open University
Presentation of the Linked Data work realised at the Open University to the IT developer's forum - 10/05/2011

  • Usual pitch: data on the web means every piece of data is web addressable, so data across different places, stores and systems becomes linkable: the Web as one data space

Presentation Transcript

  • Working with data.open.ac.uk,
    the linked data platform of the OU
    Mathieu d’Aquin and the LUCERO team
    @mdaquin
    Knowledge Media Institute, the Open University
    LUCERO project
    lucero-project.info – data.open.ac.uk
  • Linked Data
    A set of principles and technologies for a Web of Data
    Putting the “raw” data online in a standard, web enabled representation (RDF)
    Make the data Web addressable (URIs)
    Link with other data
  • Graph (up to date)
  • So Linked Data for the OU?
    Currently: OU public data sit in different systems – hard to discover, obtain, integrate by users.
    Exposed as linked data, our data interlink with each other and the external world: become part of the “global data space” on the Web
    OU data (diagram): ORO (data from research outputs), OpenLearn content, archive of course material, Library’s catalogue of digital content, A/V material, podcasts, iTunesU
    External datasets (diagram): RAE, DBPedia, geonames, data.gov.uk, BBC, DBLP
  • Why is it important?
    The OU has been the first University to expose its data as linked data: http://data.open.ac.uk
    Now widely recognized as a critical step forward for the HE sector in the UK (and worldwide)
    Favors transparency and reuse of data, both externally and internally
    Reduces the cost of dealing with our own public data: integration and reuse by design
    Enables new kinds of applications, and makes the ones that are already feasible more cost effective
    At least 3 other UK universities have now followed our example:
    http://data.online.lincoln.ac.uk/, http://data.ox.ac.uk/, http://data.southampton.ac.uk/
    And others in other countries are setting up similar initiatives
  • “if you are working in an IT department within a University you better read this report, as soon your department will need to be making these same decisions.”
    David Flanders,
    JISCExpo Programme Manager,
    http://code.google.com/p/jiscexpo/wiki/luceroproject#Site_Visit_Report
  • The data.open.ac.uk Stack
    Applications
    Institutional repository data
    Research Data (Arts)
    Organizational infrastructure
    Technical infrastructure
  • data.open.ac.uk
  • Technological principle: Everything has a URI
    Example:
    http://data.open.ac.uk/course/m366 – the course M366
    http://data.open.ac.uk/oro/21166 – an article in ORO
    http://data.open.ac.uk/page/person/ext-911ee9dfa3db572830b00bd8a9983e39 – a Person, who authored the article above
    http://xmlns.com/foaf/0.1/Person – the type person
    http://purl.org/dc/terms/creator – the property that links an author to an article
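The course example above suggests a simple URI pattern. A minimal sketch, assuming (from that single example only) that a course URI is the base address plus `/course/` plus the lower-cased course code; `course_uri` is a hypothetical helper, not part of the platform:

```python
BASE = "http://data.open.ac.uk"


def course_uri(course_code: str) -> str:
    """Hypothetical helper: build a course URI from a course code.

    Assumption: the pattern is inferred from the slide's example
    (course M366 -> http://data.open.ac.uk/course/m366).
    """
    return f"{BASE}/course/{course_code.strip().lower()}"


print(course_uri("M366"))
```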
  • Technological principle: Content negotiation
    Accept: text/html
    Accept: application/rdf+xml
    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
    <label xmlns="http://www.w3.org/2000/01/rdf-schema#" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</label>
    <authorList xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://data.open.ac.uk/oro/9719#authors"/>
    <title xmlns="http://purl.org/dc/terms/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers directed to MUC1</title>
    <abstract xmlns="http://purl.org/ontology/bibo/" rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Aptamers against the glycosylated form of MUC1 are described, along with their use in treatment and diagnosis of conditions associated with elevated production of MUC1.</abstract>
    <isPartOf xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/oro/repository"/>
    <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/peerReviewed"/>
    <status xmlns="http://purl.org/ontology/bibo/" rdf:resource="http://purl.org/ontology/bibo/status/published"/>
    <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-07bcb3718cb0de7883dc7b8fde7e283d"/>
    <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
    <creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/ext-7c8b5252e28115f91640559c2fe64ca3"/>
    <date xmlns="http://purl.org/dc/terms/">2007-11-15</date>
    <rdf:type rdf:resource="http://purl.org/ontology/bibo/Article"/>
    <rdf:type rdf:resource="http://purl.org/ontology/bibo/Patent"/>
    </rdf:Description></rdf:RDF>
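Content negotiation can be tried from any HTTP client. The sketch below uses only the Python standard library: it prepares a request whose `Accept` header asks for RDF/XML, and reads the `creator` links out of an RDF/XML document. The helper names and the shortened `SAMPLE` document (cut down from the slide's output) are illustrative assumptions; nothing is sent over the network until the request is passed to `urllib.request.urlopen`.

```python
import urllib.request
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCTERMS_NS = "http://purl.org/dc/terms/"


def negotiated_request(uri: str, mime_type: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request asking for one representation."""
    return urllib.request.Request(uri, headers={"Accept": mime_type})


def creators(rdf_xml: str) -> list:
    """Return the URIs referenced by dcterms:creator elements."""
    root = ET.fromstring(rdf_xml)
    return [
        element.attrib["{%s}resource" % RDF_NS]
        for element in root.iter("{%s}creator" % DCTERMS_NS)
    ]


# Shortened copy of the RDF/XML shown on the slide (illustration only).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://data.open.ac.uk/oro/9719">
<title xmlns="http://purl.org/dc/terms/">Aptamers directed to MUC1</title>
<creator xmlns="http://purl.org/dc/terms/" rdf:resource="http://data.open.ac.uk/person/b7fc322e6386517c5ebef3c09d13bd9e"/>
</rdf:Description></rdf:RDF>"""

request = negotiated_request("http://data.open.ac.uk/oro/9719", "application/rdf+xml")
print(request.get_header("Accept"))
print(creators(SAMPLE))
```

To fetch live data, the prepared request would be passed to `urllib.request.urlopen(request)`; whether the address still answers today is not something this sketch checks.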
  • RDF
  • By the way…
    On Study at the OU:
    http://data.open.ac.uk/course/m366 – if HTML requested, goes to http://www3.open.ac.uk/study/undergraduate/course/m366.htm
    Try http://www3.open.ac.uk/study/undergraduate/course/m366.rdf
  • Technological principle: link… also to external datasets
    Using URIs makes pieces of data directly addressable and linkable on the Web, independently of where the data is:
    http://data.open.ac.uk/course/m366 isAvailableIn http://sws.geonames.org/458258/ (Republic of Latvia)
    http://data.open.ac.uk/organization/the_open_university sameAs http://education.data.gov.uk/doc/school/133849
    http://data.open.ac.uk/location/building/mbbn (Berrill Building North) postcode http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA
    And others can link to our data…
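Each of the links above is a triple: subject, property, object. As a rough sketch, the three examples can be written as Python tuples and serialised as N-Triples. The `isAvailableIn` and postcode property URIs are the ones used elsewhere in this deck; expanding `sameAs` to `owl:sameAs` is an assumption, since the slide gives only the short name.

```python
# (subject, property, object) triples taken from the slide's examples.
LINKS = [
    ("http://data.open.ac.uk/course/m366",
     "http://data.open.ac.uk/saou/ontology#isAvailableIn",
     "http://sws.geonames.org/458258/"),
    ("http://data.open.ac.uk/organization/the_open_university",
     "http://www.w3.org/2002/07/owl#sameAs",  # assumed expansion of "sameAs"
     "http://education.data.gov.uk/doc/school/133849"),
    ("http://data.open.ac.uk/location/building/mbbn",
     "http://data.ordnancesurvey.co.uk/ontology/postcode/postcode",
     "http://data.ordnancesurvey.co.uk/id/postcodeunit/MK76AA"),
]


def to_ntriples(triples) -> str:
    """Serialise URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def external_links(triples, own_prefix="http://data.open.ac.uk/"):
    """Keep the triples whose object lives outside our own URI space."""
    return [t for t in triples if not t[2].startswith(own_prefix)]


print(to_ntriples(LINKS))
print(len(external_links(LINKS)), "links point to external datasets")
```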
  • SPARQL
    The “SQL” of RDF and linked data
    Fits the graph data model of RDF
    Select [variables: ?x ?name, etc.]
    From [graph, or all graphs if nothing]
    Where [triple patterns and filters]
    Order by, limit, offset, etc.
    SPARQL protocol: simply based on HTTP
    A SPARQL endpoint is a URL that takes a “query” parameter
    And returns results in the SPARQL XML format
    See http://data.open.ac.uk
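The protocol described above fits in a few lines of standard-library Python. This is a sketch, not the platform's own client: `query_url` places the query text in the `query` parameter of the endpoint URL, and `parse_results` reads variable bindings from a document in the SPARQL XML results format. The `SAMPLE_RESULTS` document is hand-written for illustration, not real endpoint output.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ENDPOINT = "http://data.open.ac.uk/query"
SR = "{http://www.w3.org/2005/sparql-results#}"  # results-format namespace


def query_url(sparql: str, endpoint: str = ENDPOINT) -> str:
    """Build the GET URL: the endpoint plus a percent-encoded query parameter."""
    encoded = urllib.parse.urlencode({"query": sparql}, quote_via=urllib.parse.quote)
    return endpoint + "?" + encoded


def parse_results(xml_text: str) -> list:
    """Turn a SPARQL XML results document into a list of {variable: value} rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.iter(SR + "result"):
        row = {}
        for binding in result.findall(SR + "binding"):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.attrib["name"]] = value.text
        rows.append(row)
    return rows


# Hand-written example document (illustration only).
SAMPLE_RESULTS = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="course"/></head>
  <results>
    <result><binding name="course"><uri>http://data.open.ac.uk/course/m366</uri></binding></result>
  </results>
</sparql>"""

print(query_url("select ?x where {?x ?p ?o} limit 5"))
print(parse_results(SAMPLE_RESULTS))
```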
  • SPARQL: example queries
    Courses available in Nigeria
    select distinct ?course
    where {?course
    <http://data.open.ac.uk/saou/ontology#isAvailableIn>
    <http://sws.geonames.org/2328926/>.
    ?course a <http://purl.org/vocab/aiiso/schema#Module>}
    http://data.open.ac.uk/query?query=select%20distinct%20%3Fcourse%20where%20{%3Fcourse%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23isAvailableIn%3E%20%3Chttp%3A%2F%2Fsws.geonames.org%2F2328926%2F%3E.%20%3Fcourse%20a%20%3Chttp%3A%2F%2Fpurl.org%2Fvocab%2Faiiso%2Fschema%23Module%3E}
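The long URL above is simply the query text, percent-encoded and passed as the `query` parameter. A small sketch that decodes it back with the standard library (the helper name `decode_query` is an assumption for illustration):

```python
from urllib.parse import parse_qs, urlparse

# The request URL from the slide, split across lines for readability.
SLIDE_URL = (
    "http://data.open.ac.uk/query?query=select%20distinct%20%3Fcourse%20where%20{"
    "%3Fcourse%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23isAvailableIn%3E"
    "%20%3Chttp%3A%2F%2Fsws.geonames.org%2F2328926%2F%3E."
    "%20%3Fcourse%20a%20%3Chttp%3A%2F%2Fpurl.org%2Fvocab%2Faiiso%2Fschema%23Module%3E}"
)


def decode_query(url: str) -> str:
    """Extract and percent-decode the 'query' parameter of an endpoint URL."""
    return parse_qs(urlparse(url).query)["query"][0]


print(decode_query(SLIDE_URL))
```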
  • SPARQL: example queries
    Video podcasts related to postgraduate courses in computing
    select ?x ?t where {
    ?c <http://purl.org/dc/terms/subject> <http://data.open.ac.uk/topic/computing>.
    ?c <http://data.open.ac.uk/saou/ontology#courseLevel> <http://data.open.ac.uk/saou/ontology#postgraduate>.
    ?x <http://data.open.ac.uk/podcast/ontology/relatesToCourse> ?c.
    ?x <http://purl.org/dc/terms/title> ?t.
    ?x <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast>}
    http://data.open.ac.uk/query?query=select%20%3Fx%20%3Ft%0Awhere%20{%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fsubject%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Ftopic%2Fcomputing%3E.%0A%20%20%20%3Fc%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23courseLevel%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fsaou%2Fontology%23postgraduate%3E.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FrelatesToCourse%3E%20%3Fc.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Ftitle%3E%20%3Ft.%0A%20%20%20%3Fx%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%20%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%0A}&limit=0
  • SPARQL: example queries
    Things related to “earthquake”
    select ?c ?desc where {
    ?c <http://purl.org/dc/terms/description> ?desc .
    { {?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/openlearn/ontology/OpenLearnUnit>}
    UNION
    {?c <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://data.open.ac.uk/podcast/ontology/VideoPodcast>} }
    FILTER regex(str(?desc), "earthquake", "i" )}
    http://data.open.ac.uk/query?query=select%20%3Fc%20%3Fdesc%20where%7B%0A%3Fc%20%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fdescription%3E%20%3Fdesc%20.%0A%7B%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fopenlearn%2Fontology%2FOpenLearnUnit%3E%7D%0AUNION%0A%7B%3Fc%20%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23type%3E%0A%3Chttp%3A%2F%2Fdata.open.ac.uk%2Fpodcast%2Fontology%2FVideoPodcast%3E%7D%7D%0AFILTER%20regex(str(%3Fdesc)%2C%20%22earthquake%22%2C%20%22i%22%20)%0A%7D&limit=0
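The `FILTER regex(str(?desc), "earthquake", "i")` clause keeps only rows whose description matches the pattern, ignoring case. A rough local analogue in Python, assuming a simple pattern (SPARQL follows XPath regular-expression syntax, which differs from Python's in some details); the sample descriptions are made up for illustration:

```python
import re


def sparql_regex(text: str, pattern: str, flags: str = "") -> bool:
    """Approximate SPARQL regex(text, pattern, flags) for simple patterns."""
    re_flags = re.IGNORECASE if "i" in flags else 0
    return re.search(pattern, text, re_flags) is not None


# Hypothetical descriptions, not real data from the platform.
descriptions = [
    "Earthquakes and plate tectonics",
    "An introduction to databases",
    "Measuring an earthquake with a seismometer",
]

matching = [d for d in descriptions if sparql_regex(d, "earthquake", "i")]
print(matching)
```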
  • Expose
    Store
    Collect
    Extract
    Link
    Ontologies
    Scheduler
    Cleaning rules
    RDF file (add) RDF file (delete)
    URL redirection rules
    RSS Extractor
    Delete (1)
    Add (2)
    RDF Cleaner
    Web Server
    ORO, podcast
    RSS feed
    RDF file (add) RDF file (delete)
    Triple Store
    RSS Updater
    SPARQL
    endpoint
    RDF Extractor
    New items
    Obsolete items
    Each dataset
    Index
    Entity Name System
    Search
    XML Updater
    URI creation rules
    Lib, courses, loc
    Planning + Logging
    Generic process
    Dataset specific process
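The diagram labels the store update as "Delete (1)" then "Add (2)": obsolete items are removed first, then new items are added. A minimal sketch of that ordering, assuming the triple store can be modelled as an in-memory set of triples (the real platform uses a triple store; `apply_update` and the sample triples are hypothetical):

```python
def apply_update(store: set, to_delete, to_add) -> set:
    """Return a new store after applying an RDF delete file, then an RDF add file."""
    updated = set(store)
    updated.difference_update(to_delete)  # step 1: remove obsolete triples
    updated.update(to_add)                # step 2: add new triples
    return updated


# Hypothetical triples, written as (subject, property, object) tuples.
store = {("ex:item1", "dc:title", "Old title")}
obsolete = [("ex:item1", "dc:title", "Old title")]
new = [("ex:item1", "dc:title", "New title")]

print(apply_update(store, obsolete, new))
```

Deleting before adding means a triple that appears in both files survives the update, which is the safer outcome when an item is re-extracted unchanged.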
  • Method for exposing a dataset
    Initial Meeting with Data Owner (Lucero Core Team, Data Owner)
    • Identify data
    • Get sample data
    • Identify Copyright Issues
    • Identify possible links
    • Identify users and usage
    Data Modeling sessions (Lucero KMi Team)
    • Find reusable ontologies
    • Map onto the data
    • Identify uncovered parts
    • Define URI Scheme
    Data Modeling Validation (Lucero Core Team, Lucero members, Data Owner)
    Development of Extractor, URI Creation Rules Definition, Deployment (Lucero KMi Team)
  • Datasets
    Already “officially” in place:
    ORO: more than 18,000 publications from OU researchers
    Podcasts: 2,500 audio and video tracks from podcast.open.ac.uk, linked to the related courses
    Study at the OU: more than 600 live module descriptions
    OpenLearn: more than 550 Units of course material
    KMi Staff and Planet newsletter
    Currently being processed:
    OU Buildings in MK and regional centers
    Library Catalogue
    YouTube channel
    Old Courses
    “Reading Experience Database” project
    People Profiles
  • Screenshot of the dataset page
  • Building applications with Linked Data
    Everything is based on HTTP/XML
    In principle, just need a Web connection…
    Libraries available in many languages to manipulate RDF data
    Java: Jena (http://openjena.org/)
    PHP: ARC2 (https://github.com/semsol/arc2)
    Python: RDFLib (http://www.rdflib.net/)

  • Example: Accessing data.open.ac.uk with PHP/Arc2
    include_once("arc2/ARC2.php");
    // declare the SPARQL endpoint
    $config = array('remote_store_endpoint' => 'http://data.open.ac.uk/query');
    $store = ARC2::getRemoteStore($config);
    // Execute a SPARQL query
    $postcodesq = 'select distinct ?p where {[] <http://data.ordnancesurvey.co.uk/ontology/postcode/postcode> ?p.}';
    $rows = $store->query($postcodesq, 'rows');
    // Display the results, one postcode per line
    foreach($rows as $row) {
        echo $row['p']."<br/>";
    }
  • Applications
    For education
    Mobile podcast explorer, podcast explorer on TV
    OU Building Map, OU location tracker (cf. foursquare)
    OU Expert Search
    Connecting courses/OpenLearn to relevant podcasts
    OU Course Profile Facebook app using list of courses, “Study Buddy” app connecting Facebook users to relevant courses
    For Research
    Display connections in a research community
    Research Data/Impact Analysis
    Connecting research datasets to external data
  • Example application: Link OpenLearn to relevant course/podcasts
  • Example Application: keep track of location, meetings, tutorials, at the OU
  • Example application:
    Expert Search using publication information and connecting to contact information within the OU
  • Example application: Explore Information about a person in the “Reading Experience Database” based on data provided by DBPedia (Linked Data version of Wikipedia): new ways to look at humanities research data
  • Example application: exploring research communities
  • The future
    More data… always more data
    More links, especially to external entities
    BBC
    Government agencies
    Other universities
    More applications:
    Integration into main OU websites (e.g., study at the OU)
    Integration into common OU applications (people profile, Facebook course profile, etc.)
    Support for common OU processes (REF audit, course recommendation, providing resources to AL and lecturers)
    Connecting to other Universities
    Many other universities in the UK and abroad are making the move to linked data (see linkeduniversities.org)
    Linked data has the potential to create connections across institutions: a data-based network of higher education course providers
  • Conclusion
    Linked data is more than an emerging, academic trend.
    data.open.ac.uk and linked data in general are fast becoming very valuable resources for developers, internally and externally
    We are very proud to have been the first university to really deploy a linked data platform
    It needs to be sustained and to evolve as a core service at the OU…
    … and as a key component of the Web of University Linked Data
  • Thank You
    Salman Elahi ((Ex)-Dev)
    Carlo Allocca (Dev)
    Jane Whild (Admin)
    Fouad Zablith (Dev)
    KMi
    Andriy Nikolov (linking)
    Enrico Motta (SGP)
    Mathieu d’Aquin (PD)
    Arts
    Suzanne Duncanson-Hunter
    John Wolfe
    Paul Lawrence
    Richard Nurse ((ex-)PM)
    Owen Stephens (PM)
    Stuart Brown
    Com./
    Student
    Comp.
    Services
    Data Owners
    Non Scantlebury
    Library
    Specialists
    Arts Specialists
    OU Library