Digital Enterprise Research Institute                                                             www.deri.ie




                                  Rethinking Microblogging:
                                  Open, Distributed, Semantic
                               Alexandre Passant, John G. Breslin, Stefan Decker

                                              Digital Enterprise Research Institute, NUI Galway
                                                                 http://deri.ie




ICWE2010
Thursday, 8th July 2010
Vienna, Austria
© Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Agenda

           Microblogging: current state and issues
           Requirements to enhance microblogging
           SMOB – Semantic MicrOBlogging
                  An ontology stack for Social Semantic Web applications
                  Distributed hubs and communication protocols
                  Integration with the Linking Open Data (LOD) cloud
           Browsing, discovering and querying
                  Integrated capabilities (end-user interface, SPARQL, maps)
                  External indexes and components
           Future Work
           Conclusion
Microblogging

           Short status update, generally < 140 chars
                  Real-time information management / Citizen-sensing
                  Popular on the Web (Twitter) and in the enterprise (Yammer)
Limits of current systems

           And of Web 2.0 systems in general
                  Walled-garden systems / Lack of portability
                  Data belongs to the service used to publish it
                  Lack of structure / semantics
                     –  Recent work on OpenGraph and Twitter Annotations
                  « A Bill of Rights for Users of the Social Web »


           Most research on microblogging focuses on
            communication patterns and social behaviours
                  But what about engineering issues?
Requirements

           Machine-readable metadata (R1)
                  Making microblogging systems more interoperable
                  Focus on microblog posts, content and authors


           Distributed architecture and open data (R2)
                  Solving the walled garden and data portability issues
                  Letting users own and control their data


           Data reuse and interlinking (R3)
                  Interlinking microblog posts with other initiatives
                  Reusing existing data to make more sense of microblogging
SMOB – Semantic MicrOBlogging

           Our proposal
                  A framework for open, distributed and semantic
                   microblogging
                  Based on state-of-the-art Semantic Web technologies (RDF(S)/
                   OWL, RDFa, SPARQL) and Linked Data principles to meet
                   requirements R1 to R3


           SMOB - http://smob.me:
                  Open-source framework (GPL)
                  Started mid-2008, completely re-designed end-2009
                     –  Distributed architecture, LOD-integration, etc.
The Semantic Web and Linked Data

           Semantic Web
                  From documents to structured data
                  Annotations (RDF), ontologies (RDFS/OWL), queries (SPARQL)
           Linked Data
                  A set of principles for publishing data on the Web
                  Linking Open Data project – interlinking datasets on the Web
                   using the LD principles
SMOB and our initial requirements

           Machine-readable metadata (R1)
                  An ontology stack for microblogging
                  Representation of posts in RDFa, SPARQL endpoint


           Distributed architecture and open data (R2)
                  Distributed hubs spread over the Web
                  Interacting via HTTP + SPARQL/Update


           Data reuse and interlinking (R3)
                  Interlinking microblogs (and their posts) with other systems
                  Reusing existing data when available
Ontologies for microblogging (R1)

           Different needs
                  Users and Profiles (Personal information and Social Networks)
                  Presence (Geolocation, current activity, etc.)
                  Data (microblog containers and microblog posts)
                  Topics (#tags)


           Our approach
                  Integrating and extending existing lightweight ontologies to
                   focus on modularity and reusability of components
                  Providing a complete ontology stack for Semantic
                   Microblogging, and more broadly for any Social Semantic
                   Web application
FOAF – People and Social Networks

           FOAF – Friend Of A Friend
                http://foaf-project.org
                An ontology to describe people and their relationships
                Can be integrated with any other SW vocabularies
           FOAF on the Web
                  hi5, LiveJournal, Drupal 7, etc. and exporters for popular
                   services
FOAF – Distributed user-profiles

           Ability to reuse existing profiles
                  Such as RDFa-enabled documents (e.g. Drupal 7)
                  No need to duplicate personal data




           [Figure: foaf:depiction expressed in RDFa; screenshots labelled
            http://apassant.net and http://example.org]
SIOC – Profiles and data

           SIOC – Semantically-Interlinked Online Communities
                  http://sioc-project.org
                  Representing online communities and their content
                  W3C Member Submission
                  A types module for finer-grained content-types
SIOC – Profiles and data

           SIOC extensions
                  The current state of SIOC cannot capture all the properties of
                   microblogging


           New Classes
                  sioct:Microblog : Microblog container
                  sioct:MicroblogPost : Microblog post


           New properties
                  sioc:follows : following / followers (directed graph model)
                  sioc:addressed_to : @reply patterns
OPO – Presence Information

           OPO – Online Presence Ontology
                  http://online-presence.net
                  Representing rich presence information using semantics
                  Geolocation, current activity (project, etc.) …
                  Integration with SIOC to map content to one’s presence
MOAT – Semantic Tagging

           MOAT – Meaning Of A Tag
                  http://moat-project.org
                  A model to provide semantic tagging capabilities
                  Linking #tags to their meanings (defined as URIs)
                  Provides integration with the Linking Open Data cloud

           [Figure: example RDF graph for the tagged post http://example.org/post/1
            (rdf:type sioct:BlogPost, dct:title "Nouvel iPhone disponible", i.e.
            "New iPhone available", foaf:maker http://apassant.net/alex/).
            The tagging http://example.org/tagging1 (rdf:type tag:RestrictedTagging)
            carries tag:taggedBy, tag:taggedResource, tag:associatedTag
            http://example.org/tag/apple and moat:tagMeaning
            http://dbpedia.org/resource/Apple_Inc.; the post is linked to the
            same URI via moat:taggedWith.
            Vocabulary groups: FOAF, SIOC + DC, Tag Ontology, MOAT + DBpedia.]
The SMOB Ontology Stack

           Integration of the previous components
                  Plus a smob:Hub class to represent user’s hubs
                  Can be reused in various Social Semantic Web contexts
                  Makes Social Web applications part of the LOD cloud
Representation

           Each microblog post is represented in RDF using the
            aforementioned ontology stack
                  Dereferenceable URI for each post
                  Subset directly in XHTML pages using RDFa (/page)
                  Complete representation also available using Turtle (/data)
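
A minimal sketch of what the Turtle view of a single post might contain, using the classes and properties named on the earlier slides (sioct:MicroblogPost, foaf:maker, sioc:addressed_to). The URIs, the helper names and the choice of sioc:content and dct:created are illustrative assumptions, not SMOB's actual output.

```python
# Namespace URIs of the vocabularies reused in the ontology stack.
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",
    "sioct": "http://rdfs.org/sioc/types#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dct": "http://purl.org/dc/terms/",
}


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for a Turtle string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def post_to_turtle(post_uri, author_uri, content, created, addressed_to=()):
    """Serialise one microblog post as Turtle (hypothetical helper)."""
    lines = [f"@prefix {prefix}: <{ns}> ." for prefix, ns in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{post_uri}> a sioct:MicroblogPost ;")
    lines.append(f"    foaf:maker <{author_uri}> ;")
    # One sioc:addressed_to statement per @reply target.
    for target in addressed_to:
        lines.append(f"    sioc:addressed_to <{target}> ;")
    lines.append(f'    dct:created "{escaped_or_plain(created)}" ;')
    lines.append(f'    sioc:content "{escape_literal(content)}" .')
    return "\n".join(lines)


def escaped_or_plain(value):
    """Timestamps are passed through the same escaping as any other literal."""
    return escape_literal(value)


if __name__ == "__main__":
    print(post_to_turtle(
        "http://example.org/post/1",
        "http://apassant.net/alex/",
        "Presenting SMOB at #ICWE2010",
        "2010-07-08T10:00:00Z",
    ))
```

Keeping the serialisation in one function mirrors the slide's point that the /page (RDFa) and /data (Turtle) views describe the same post under one dereferenceable URI.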
Distributed architecture (R2)

           Personal SMOB hubs spread all over the Web
                  No central server / no dependency on third-party services
                  Ensure data ownership and privacy
                  Each hub only requires a LAMP environment (it is based on ARC2)
                   and provides a SPARQL endpoint
                  Can be used as read-write Twitter clients


           Following / follower registration
                  Ability to get « remote followers », represented internally
                   (in both the follower and followee hub) in RDF
                  :user_a sioc:follows :user_b .
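
The follow registration can be sketched with a toy in-memory triple store standing in for each hub's RDF store. The class and function names are hypothetical; the only element taken from the slides is that the same sioc:follows triple is recorded on both the follower's and the followee's hub.

```python
SIOC_FOLLOWS = "http://rdfs.org/sioc/ns#follows"


class HubStore:
    """Toy stand-in for a hub's RDF store (a set of triples)."""

    def __init__(self):
        self.triples = set()

    def add_follow(self, follower, followee):
        self.triples.add((follower, SIOC_FOLLOWS, followee))

    def followers_of(self, user):
        # sioc:follows is directed: subjects pointing at `user`.
        return sorted(s for s, p, o in self.triples
                      if p == SIOC_FOLLOWS and o == user)

    def followees_of(self, user):
        return sorted(o for s, p, o in self.triples
                      if p == SIOC_FOLLOWS and s == user)


def register_follow(follower_hub, followee_hub, follower, followee):
    """Record the same triple on both hubs involved in the relationship."""
    for hub in (follower_hub, followee_hub):
        hub.add_follow(follower, followee)


if __name__ == "__main__":
    hub_a, hub_b = HubStore(), HubStore()
    register_follow(hub_a, hub_b,
                    "http://hub-a.example/user_a", "http://hub-b.example/user_b")
    print(hub_b.followers_of("http://hub-b.example/user_b"))
```

Storing the triple on both sides lets the followee's hub know whom to notify and lets the follower's hub list its subscriptions without contacting any central server.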
Communication between hubs

           Replication / notification between peers
                  Broadcasting data to followers when new content is created
                  Using SPARQL/Update via HTTP POST (Checking access rights)
                  Simple HTTP POST to Twitter API
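
A rough sketch of the broadcast step: build one update statement and prepare an HTTP POST for each follower's hub. The endpoint URLs, the form parameter names ("query", "key") and the shared-key access check are assumptions made for illustration; an ARC2-based hub may expect a different update syntax or parameters. The requests are only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def build_insert(post_uri, author_uri, content):
    """Build an INSERT DATA statement describing a new post."""
    escaped = content.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX sioc: <http://rdfs.org/sioc/ns#>\n"
        "PREFIX sioct: <http://rdfs.org/sioc/types#>\n"
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "INSERT DATA {\n"
        f"  <{post_uri}> a sioct:MicroblogPost ;\n"
        f"    foaf:maker <{author_uri}> ;\n"
        f'    sioc:content "{escaped}" .\n'
        "}"
    )


def build_notifications(update, follower_endpoints, api_key):
    """Prepare one POST request per follower hub (nothing is sent)."""
    requests = []
    for endpoint in follower_endpoints:
        # Hypothetical parameter names; the key stands in for the
        # access-rights check mentioned on the slide.
        body = urllib.parse.urlencode({"query": update, "key": api_key})
        requests.append(
            urllib.request.Request(endpoint, data=body.encode("utf-8"), method="POST")
        )
    return requests


if __name__ == "__main__":
    update = build_insert("http://example.org/post/1",
                          "http://apassant.net/alex/", "Hello #ICWE2010")
    for req in build_notifications(update, ["http://hub-b.example/sparql"], "secret"):
        print(req.get_method(), req.full_url)
```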
Integration with the LOD cloud (R3)

           Semantic tagging
                  URIs are suggested at runtime when typing #tags
                  Integration of microblogging within the LOD cloud
                   (DBpedia lookup, Sindice)
                  Plug-in system to add new services (e.g. enterprise KB)
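
The tagging step can be illustrated as follows: extract the #tags from an update and ask each pluggable lookup service for candidate URIs. The local dictionary is a hypothetical stand-in for a remote service such as a DBpedia lookup or an enterprise knowledge base, and its entries are illustrative only.

```python
import re

HASHTAG = re.compile(r"#(\w+)")

# Hypothetical local knowledge base standing in for a remote lookup service.
LOCAL_KB = {
    "apple": [
        "http://dbpedia.org/resource/Apple_Inc.",
        "http://dbpedia.org/resource/Apple",
    ],
    "torino": ["http://dbpedia.org/resource/Turin"],
}


def lookup_local(tag):
    """Return candidate URIs for a tag, ignoring case."""
    return LOCAL_KB.get(tag.lower(), [])


def suggest_meanings(text, services=(lookup_local,)):
    """Map each #tag in `text` to the URIs suggested by the given services."""
    suggestions = {}
    for tag in HASHTAG.findall(text):
        candidates = []
        for service in services:
            for uri in service(tag):
                if uri not in candidates:  # keep order, drop duplicates
                    candidates.append(uri)
        suggestions[tag] = candidates
    return suggestions


if __name__ == "__main__":
    print(suggest_meanings("Nouvel #iPhone disponible chez #Apple"))
```

Passing the services as plain callables is one simple way to model the plug-in system: adding a new source means adding one more function to the tuple.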
Geolocation

           Geolocation features
                  Run-time geolocation lookup using GeoNames.org
                  Modelled as part of the user’s presence
Interlinking benefits

           Benefits of LOD interlinking
                  Can reuse background knowledge when querying data
                     –  E.g. Microblog posts about any city in Italy (will retrieve #Torino)
                  Microblog content becomes more discoverable
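
The "any city in Italy" example amounts to a join between the hub's own triples and background triples from the LOD cloud. The sketch below runs that join over toy data: the post URIs and the background facts are made up for illustration, and the property names are written as prefixed strings rather than full URIs.

```python
# Triples held by the hub: posts linked to the URI of their tag's meaning.
POSTS = {
    ("http://example.org/post/1", "moat:taggedWith", "dbr:Turin"),
    ("http://example.org/post/2", "moat:taggedWith", "dbr:Galway"),
}

# Toy background knowledge, as it might come from a LOD dataset.
BACKGROUND = {
    ("dbr:Turin", "dbo:country", "dbr:Italy"),
    ("dbr:Rome", "dbo:country", "dbr:Italy"),
    ("dbr:Galway", "dbo:country", "dbr:Ireland"),
}


def posts_about_places_in(country, posts=POSTS, background=BACKGROUND):
    """Return posts tagged with any place located in `country`.

    Roughly the SPARQL pattern:
        ?post moat:taggedWith ?place . ?place dbo:country <country> .
    """
    places = {s for s, p, o in background
              if p == "dbo:country" and o == country}
    return sorted(s for s, p, o in posts
                  if p == "moat:taggedWith" and o in places)


if __name__ == "__main__":
    print(posts_about_places_in("dbr:Italy"))
```

The post tagged #Torino is found even though the word "Italy" never appears in it, because the tag points at a URI that the background data places in Italy.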
Browsing, discovering and querying

           End-user interface
                  Genuine microblogging interface, on top of RDF data
           Map view
                  Geolocation capabilities provided thanks to GeoNames
           Integration with Sindice
                  Third-party service for identifying SMOB hubs and content
           SPARQL endpoint
                  Direct queries and pluggable components via HTTP
End-user interface

           Genuine microblogging interface
                  Generated using SPARQL queries
                  Integration of Twitter posts (also stored in RDF)
                  RDFa markup for each post and user
Map view

           Geolocation features
                  Benefits of the GeoNames lookup integration
                  Reusing coordinates provided by the GeoNames KB (in RDF)
Integration with Sindice

           Sindice – the Semantic Web index
                  http://sindice.com
                  SMOB hubs can ping Sindice when new content is created
                  Retrieving distributed SMOB data from a single entry point
                  Transversal SPARQL querying to discover microblog posts
SPARQL endpoint

           Each hub provides its own endpoint
                  Using SPARQL, no need to learn a new API
                  Direct queries sent via HTTP / answers as JSON/XML
                  Ability to plug in external components (e.g. Explorator)
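
As an illustration of querying a hub over plain HTTP, the sketch below prepares a GET request carrying a SPARQL query and parses a response in the standard SPARQL JSON results layout. The endpoint URL and the query are hypothetical, and asking for JSON through the Accept header is an assumption; a given hub may select the output format differently. No request is actually sent.

```python
import json
import urllib.parse
import urllib.request

QUERY = """PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX sioct: <http://rdfs.org/sioc/types#>
SELECT ?post ?content WHERE {
  ?post a sioct:MicroblogPost ;
        sioc:content ?content .
} LIMIT 10"""


def build_request(endpoint, query):
    """Prepare a GET request with the query in the `query` parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(payload):
    """Flatten a SPARQL JSON result set into a list of plain dicts."""
    data = json.loads(payload)
    variables = data["head"]["vars"]
    return [
        {var: row[var]["value"] for var in variables if var in row}
        for row in data["results"]["bindings"]
    ]


if __name__ == "__main__":
    request = build_request("http://example.org/sparql", QUERY)
    print(request.full_url)
```

The same two helpers work against any hub's endpoint, which is the practical meaning of "no need to learn a new API".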
Future Work

           Scalability
                  PubSubHubbub integration (cf. recent work on sparqlPuSH)
                  Cache system for rendering SPARQL-based pages
           Modelling
                  Full RDFa / content negotiation
           Interlinking and data generation
                  Nanosyntaxes (generating RDF data about updates’ content)


           Uptake
                  Spread the word to increase the number of SMOB hubs
                   deployed on the Web!
Conclusion

           Contributions
                  An Ontology stack for Social Semantic Web applications
                  A distributed architecture for microblogging
                  Integration of microblogging with the LOD cloud
                  Deployed in SMOB – http://smob.me


           Take-home message
                  There are opportunities for a distributed Social Semantic Web
                  SMOB can be just one part of this ecosystem; it’s up to you!
                  Semantic Web and Linked Data provide straightforward
                   integration of other components following the same approach
Thank you !

           http://smob.me
                  GNU/GPL, runs on any LAMP environment
                  New features and bug fixes land regularly; consider using the SVN version




           Main contact
                  http://apassant.net
                  alexandre.passant@deri.org
                  @terraces
