Putting Linked Data to Use in a
Large Higher-Education
Organisation
                             Mathieu d’Aquin
                      Knowledge Media Institute (KMi),
                             The Open University, UK
                                       @mdaquin
Motivation
• Many works focus on the publication of
  linked data
• But what do we do once it's published?

• We have built a full linked data platform
  for our university (the Open
  University, data.open.ac.uk)
• And built a lot of applications to
  demonstrate what we could do with it
• What do we learn from getting people
  to, unknowingly, use linked data?
• What experience can we reuse for the
  development of interactive tools relying
  on linked data?
Linked Data at the Open University
                            data.open.ac.uk
Linked Data at the Open University
•   Course information:
     –   580 modules/ description of the course, information about the levels and number of
         credits associated with it, topics, and conditions of enrolment.
•   Research publications:
     –   16,000 academic articles / information about authors, dates, abstract and venue of the
         publication.
•   Podcasts:
     –   2220 video podcasts and 1500 audio podcasts / short description, topics, link to a
         representative image and to a transcript if available, information about the course the
         podcast might relate to and license information regarding the content of the podcast.
•   Open Educational Resources:
     –   640 OpenLearn Units / short description, topics, tags used to annotate the resource, its
         language, the course it might relate to, and the license that applies to the content.
•   YouTube videos:
     –   900 videos / short description of the video, tags that were used to annotate the
         video, collection it might be part of and link to the related course if relevant.
•   University buildings:
     –   100 buildings / address, a picture of the building and the sub-divisions of the building
         into floors and spaces.
•   Library catalogue:
     –   12,000 books/ topics, authors, publisher and ISBN, as well as the course related.
•   Others…
Applications
• Mobile and Personal Semantics
• Social
• Resource Discovery
• Research Exploration
Application 1: Study at the OU




        That’s where linked data is
Application 1: What we learned
• From the users’ perspective
   – Useful functionality can be very simple
   – Combining information from different
     sources
   – Transparent/Seamless
• From the developers’ perspective
   – Development time: from months to
     minutes
   – Interacting directly with the data, rather
     than multiple different systems
   – Lack of awareness of Semantic Web
     technologies
   – Correspondence with other, more common
     technologies (e.g., SQL and relational
     DBs) misleading
   – Performance: large number of SPARQL
     queries not easy to handle. Requires
     caching of pre-canned queries. Contradicts
     the idea of open and unexpected reuse
“Looked at it in rage for hours… just didn’t think it wouldn’t give me an error if I misspelled the name of a property”
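The silent failure described in the quote on this slide can be illustrated with a toy in-memory triple matcher. This is a minimal sketch, not the platform's code: the sample data, the `ex:` property names and the helpers `match` and `match_checked` are all hypothetical. A pattern with a misspelled property simply matches nothing, whereas a SQL query on an unknown column would fail.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Toy data; identifiers and property names are made up for illustration.
TRIPLES: List[Triple] = [
    ("course:a100", "ex:title", "An example module"),
    ("course:a100", "ex:credits", "60"),
    ("course:b200", "ex:title", "Another example module"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def match_checked(triples: List[Triple], p: str) -> List[Triple]:
    """Like match(), but reject properties that never occur in the data."""
    known: Set[str] = {pred for _, pred, _ in triples}
    if p not in known:
        raise KeyError(f"unknown property: {p}")
    return match(triples, p=p)


titles = match(TRIPLES, p="ex:title")   # matches both title triples
typo = match(TRIPLES, p="ex:tittle")    # matches nothing and raises no error
```

A guard such as `match_checked` only helps when the vocabulary in use is known in advance, which is itself an application-specific assumption.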
Application 2: Supporting the REF




  * Combining public and semi-private data
  * Read/write
Application 2: What we learned
• From the users’ perspective
   – No additional or duplicated output required from users:
     reusing what was collected in multiple systems
   – Again transparent/seamless technology
   – Still some confusion related to consistency across
     systems/representations
   – Assumptions hard to conform with when data is drawn from
     multiple systems with “unwritten conventions”
• From the developers’ perspective
   – Again, rapid development
   – Extensibility and flexibility
   – SPARQL Query / SPARQL Update duo very powerful for
     lightweight interfaces (even client side)
   – Dealing with incomplete data is tricky (we don’t know when it
     is incomplete)
   – No “meta-properties” of the data (e.g., all IDs are unique and
     non-redundant)
   – Assumptions made are specific to the application, not generic
   – Where is the problem? In the application, the linked data, or the
     original data?
“Really? This uses linked data? I thought we bought it from some company…”
“Can you add a new field?”
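The assumptions an application makes about the data can at least be checked explicitly before an interface relies on them. The following is a minimal sketch, assuming the records have already been fetched and merged as dictionaries; the field names, sample records and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical records merged from two source systems.
RECORDS: List[Record] = [
    {"id": "pub-1", "title": "Paper A", "year": "2011"},
    {"id": "pub-2", "title": "Paper B", "year": None},
    {"id": "pub-1", "title": "Paper A (preprint)", "year": "2011"},
]


def duplicated_ids(records: List[Record]) -> List[str]:
    """IDs that occur more than once, i.e. where uniqueness does not hold."""
    counts = Counter(r["id"] for r in records)
    return sorted(i for i, n in counts.items() if n > 1)


def missing_fields(records: List[Record], required: List[str]) -> Dict[str, int]:
    """For each required field, count the records where it is absent or empty."""
    return {f: sum(1 for r in records if not r.get(f)) for f in required}


dupes = duplicated_ids(RECORDS)
gaps = missing_fields(RECORDS, ["title", "year"])
```

A missing value here only shows that the value is not in the data, not that it does not exist, which is why incompleteness is hard to detect.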
Application 3:
Research communities




       Generic vs Specific
           Interface
Application 3: What we learned
• From the users’ perspective
  – Generic: more knowledge = more
    functionalities
  – Generic: homogeneous interface to
    heterogeneous data
  – Generic: more demanding for users
  – Application-driven vs data-driven navigation
  – Specific interface allows for more complexity
• From the developers’ perspective
  – Generic is harder: can’t make assumptions
    related to the specific data/application
  – Specific is less customisable/extensible:
    adding new features requires custom code
“Shouldn’t that be here in that case?”
Application 4: The OU in the media




Academics in “Arts and Humanities” most often involved with the media (in number of news items)
Topics most commonly mentioned by news outlets owned by the BBC (in number of news items)
Application 4: What we learned
• From the users’ perspective
   – Easily understandable outputs: embeddable
     charts
   – Customisable: build a dynamic dashboard in
     minutes
   – Benefits of linked data: bring external data
     that can be jointly queried with your own
• From the developers’ perspective
   – Requires a good understanding of the data
     and the technology (especially SPARQL)
   – Generic components to build specific interfaces
     (best of both worlds?)
   – But again cannot rely on application/data
     specific assumptions (meta-properties
     regarding redundancy, completeness, etc.)
“I would like this chart for my blog…”
“What do you mean by ‘give me 3 minutes’?”
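The benefit of jointly querying external and local data can be sketched as a join on shared identifiers followed by a count, the kind of aggregate an embeddable chart would display. The sample data, identifiers and function name below are hypothetical, and in the application itself this would presumably be expressed as a SPARQL query rather than in Python.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Local data (hypothetical): person -> faculty.
FACULTY: Dict[str, str] = {
    "person:1": "Arts and Humanities",
    "person:2": "Science",
    "person:3": "Arts and Humanities",
}

# External data (hypothetical): (news item, person mentioned).
NEWS_ITEMS: List[Tuple[str, str]] = [
    ("news:a", "person:1"),
    ("news:b", "person:1"),
    ("news:c", "person:2"),
    ("news:d", "person:3"),
    ("news:e", "person:9"),  # person unknown locally, so dropped by the join
]


def news_items_per_faculty(news: List[Tuple[str, str]],
                           faculty: Dict[str, str]) -> List[Tuple[str, int]]:
    """Join news items to faculties via the person, then count per faculty."""
    counts = Counter(faculty[p] for _, p in news if p in faculty)
    return counts.most_common()


chart_rows = news_items_per_faculty(NEWS_ITEMS, FACULTY)
```

The dropped `news:e` row is an instance of the completeness problem: the chart cannot show what the join silently excludes.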
Discussion
• Linked data should be hidden from the users
   –   Obvious? Yes… but is it really happening?
   –   Requires some aspects of the data to be present, e.g. human-readable labels
   –   Many applications of linked data are still linked data applications
   –   Higher-level concepts, such as data integration from multiple sources, are harder to
       hide

• Generic vs Specific
   – Reuse of software components is good
   – But forces us to adopt a specific form of interaction which is driven by the technicalities
     and the data
   – Trade-off to be found: generic + customisable

• Openness and flexibility
   –   … are not always easy to deal with
   –   Building interfaces for the unknown
   –   No assumption can be made on the data, regarding redundancy and completeness
   –   Need for meta-properties that can guide the building of applications (see what is
       applicable)
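Hiding linked data from users depends on human-readable labels being available. A minimal sketch of a display-name helper that falls back to the last segment of the URI when no label is present; the URIs, the label table and the function name are hypothetical.

```python
from typing import Dict
from urllib.parse import unquote, urlparse

# Hypothetical label table, e.g. collected from label properties in the data.
LABELS: Dict[str, str] = {
    "http://example.org/building/library": "Library",
}


def display_name(uri: str, labels: Dict[str, str]) -> str:
    """Prefer an explicit label; otherwise derive a name from the URI."""
    if uri in labels:
        return labels[uri]
    parsed = urlparse(uri)
    # Use the fragment if there is one, else the last path segment.
    local = parsed.fragment or parsed.path.rstrip("/").rsplit("/", 1)[-1]
    # Fall back to the full URI when nothing usable remains.
    return unquote(local).replace("_", " ") or uri


name_with_label = display_name("http://example.org/building/library", LABELS)
name_from_uri = display_name("http://example.org/topic#open_learning", LABELS)
```

The fallback keeps the interface usable, but a name derived from a URI is exactly the kind of technical detail that leaks through when labels are missing.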
Conclusion
• Applications in a large
  organisation used to more
  common technologies raise
  challenges that help us understand
  the common pitfalls of interactions
  with linked data
• Important to share experiences in
  addition to techniques/tools
• To build better systems and
  approaches for interaction

Thank you!
Any questions?
Images (others are mine)
• Broadcast: http://commons.wikimedia.org/wiki/File:Ibaraki_Broadcast_System_headquater01.jpg
• Don’t know: http://commons.wikimedia.org/wiki/File:I_Don%27t_Know_ANY_of_This!.jpg
• Development: http://commons.wikimedia.org/wiki/File:Applications-development.svg
• Learning: http://www.flickr.com/photos/vivacomopuder/3122401239/
• Course / degree: http://commons.wikimedia.org/wiki/File:Degree.svg
• Article: http://commons.wikimedia.org/wiki/File:Articles.JPG
• Open Learning: http://commons.wikimedia.org/wiki/File:Colearn_-_learning_together.jpg
• Youtube: http://commons.wikimedia.org/wiki/File:Logo_YouTube_por_Hernando.svg
• Open University building: http://www.flickr.com/photos/rattyfied/3011643690/
• Library: http://commons.wikimedia.org/wiki/File:SteacieLibrary.jpg

