Digital Enterprise Research Institute                                                             www.deri.ie




                     SemWebbers, LODers:
               What PubSubHubbub can do for you
                                                                   Alexandre Passant
                                                                               DERI, NUI Galway




Seminar @ Talis
23 Sept 2010, Birmingham, U.K.
© Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
A *real-time* Web




           Information is no longer static
                  The Web becomes an information stream
                  New trends in ubiquitous computing, plus IoT
                     –  Even @towerbridge is tweeting!
                  Twitter, Foursquare, Gowalla, Qik, etc.


           Citizen Sensing
                  Earthquake detection on Twitter (WWW2010 paper)
                  Emergency management, reporting and monitoring
                   (Mumbai attacks on Flickr, Twitter, etc.)
                  Opinion and trend mining (box-office prediction on
                   Twitter, HP Labs)


Pull versus Push approaches




           What if …
                  I want to know when a friend of mine checks in at a
                   pub in Ireland
                  I want to get an alert when a Wikipedia page about a
                   punk-rock band from the 80s is edited


           Pull
                  Polling websites every minute to see what’s new
                  Wasted HTTP calls (API/RSS), risk of being banned (ToS)
           Push
                  Websites let me know when they have something relevant
                  Wait. Receive. Consume
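
To put a rough number on the wasted calls, here is a back-of-the-envelope sketch in Python. The figures (one poll per minute, five real updates per day) are illustrative assumptions, not measurements from the talk.

```python
# Illustrative only: compares request counts for polling against real updates.
MINUTES_PER_DAY = 24 * 60


def polling_requests(poll_interval_minutes: int) -> int:
    """HTTP requests a poller sends per day, whether or not anything changed."""
    return MINUTES_PER_DAY // poll_interval_minutes


def wasted_requests(poll_interval_minutes: int, updates_per_day: int) -> int:
    """Polls that bring back nothing new (assumes one useful poll per update)."""
    return max(polling_requests(poll_interval_minutes) - updates_per_day, 0)


if __name__ == "__main__":
    print(polling_requests(1))    # 1440 polls per day at one-minute intervals
    print(wasted_requests(1, 5))  # 1435 of them return nothing new
```

With push, the same day would cost roughly one notification per update.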

PubSubHubbub (PuSH) at a glance




           Google’s Push approach
                  http://code.google.com/p/pubsubhubbub
                  Using Atom / RSS
                     –  A link rel="hub" element in the feed header identifies the hub
                  Simple registration / notification approach
                     –  API available in various languages
                  Broadcasting data through public hubs
                     –  Use Google’s, or host your own
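
Hub discovery can be sketched in a few lines of Python. This is a minimal example using only the standard library; the sample feed and its URLs are made up for illustration, and a real client would fetch the feed over HTTP first.

```python
import xml.etree.ElementTree as ET
from typing import List

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical feed, used only to exercise the function below.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <link rel="self" href="http://example.org/feed.atom"/>
  <link rel="hub" href="http://hub.example.org/"/>
</feed>"""


def find_hubs(atom_xml: str) -> List[str]:
    """Return the href of every feed-level <link rel="hub"> element."""
    root = ET.fromstring(atom_xml)
    return [
        link.get("href")
        for link in root.findall(f"{{{ATOM_NS}}}link")
        if link.get("rel") == "hub" and link.get("href")
    ]


if __name__ == "__main__":
    print(find_hubs(SAMPLE_FEED))
```

A subscriber would then send its subscription request to one of the returned hub URLs.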




Semantics and the real-time Web




           Semantics can help make sense of online content,
            by combining raw information and structured data
                  What’s happening right now, sport-wise, within 25 km of
                   here?
                     –  Geonames, Twitter, DBpedia, etc.
                  Which of my social network contacts, whatever the website,
                   are visiting Europe next month?
                     –  FOAF, Geonames, etc.

           New architectures are required
                  Enabling proactive notification (PuSH) based on triggers
                  Triggers can be semantically defined with SPARQL!



PubSubHubbub, SemWeb and LOD ?




           Using PubSubHubbub to register / get notifications
            about structured data
                  Dynamic triggers identifying relevant data changes
                  Defined as SPARQL queries
                  Combined with other data sources from the LOD cloud


           A friend of mine checking in at a pub in Ireland
              <http://apassant.net/alex> foaf:knows ?friend .
              ?friend ex:checksIn ?pub .
              ?pub foaf:based_near ?location .
              ?location skos:subject
                dbpedia:Cities_in_the_Republic_of_Ireland .
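
A minimal Python sketch of how the trigger pattern above could be wrapped into a complete SELECT query. The slide does not say what the `dbpedia:` and `ex:` prefixes expand to, so the IRIs chosen for them here are assumptions; `build_select` is a hypothetical helper, not part of any library.

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
    # Assumptions: the slide does not define these two prefixes.
    "dbpedia": "http://dbpedia.org/resource/",
    "ex": "http://example.org/ns#",
}

PATTERN = """\
  <http://apassant.net/alex> foaf:knows ?friend .
  ?friend ex:checksIn ?pub .
  ?pub foaf:based_near ?location .
  ?location skos:subject dbpedia:Cities_in_the_Republic_of_Ireland ."""


def build_select(variables, pattern, prefixes=PREFIXES):
    """Assemble a SPARQL SELECT query string from a basic graph pattern."""
    header = "\n".join(f"PREFIX {p}: <{iri}>" for p, iri in prefixes.items())
    projection = " ".join(f"?{v}" for v in variables)
    return f"{header}\nSELECT {projection}\nWHERE {{\n{pattern}\n}}"


if __name__ == "__main__":
    print(build_select(["friend", "pub"], PATTERN))
```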

Our approach: sparqlPuSH




           sparqlPuSH
                  Combining SPARQL, SPARQL Update and PubSubHubbub
                   for proactive notifications of changes in RDF stores
                  An interface that can be plugged on top of any RDF
                   store: http://code.google.com/p/sparqlpush/


           Based on
                  SPARQL and SPARQL Update to register feeds and launch
                   actions when content of the RDF store changes
                  Atom and RSS to get feeds of related changes
                  PubSubHubbub for broadcasting changes (benefiting from
                   public hubs such as Google’s)


sparqlPuSH




           A two-step approach
                  Query registration
                  Change notification


           Both steps are independent of the RDF store
            implementation
                  Registration can be done remotely, with an HTTP request
                   sent to a sparqlPuSH interface
                  Notification is triggered as soon as relevant data appears in
                   the store, loaded with SPARQL Update through sparqlPuSH
                  Clients must understand the rel="hub" link in the feed
                   header, and interpret notifications from PuSH hubs
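
On the client side, "interpreting notifications" involves two things in the PubSubHubbub protocol: echoing the hub's challenge when it verifies a subscription, and reading the Atom entries the hub later POSTs to the callback. The sketch below shows both as plain Python functions with no web framework; the topic URL is hypothetical, and a real subscriber would wire these into its HTTP handler.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical feed this client asked to subscribe to.
EXPECTED_TOPICS = {"http://example.org/feeds/pubs-in-ireland.atom"}


def handle_verification(query_string: str):
    """Answer the hub's GET verification request with (status, body)."""
    params = {k: v[0] for k, v in parse_qs(query_string).items()}
    mode = params.get("hub.mode")
    topic = params.get("hub.topic")
    challenge = params.get("hub.challenge")
    if mode in ("subscribe", "unsubscribe") and topic in EXPECTED_TOPICS and challenge:
        return 200, challenge  # echo the challenge to confirm the request
    return 404, ""             # refuse anything we did not ask for


def handle_notification(body: str):
    """Return the ids of the Atom entries contained in a hub notification."""
    root = ET.fromstring(body)
    return [e.findtext(f"{{{ATOM_NS}}}id") for e in root.iter(f"{{{ATOM_NS}}}entry")]
```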


Query registration




Example of query registration




           Identifying changes on a particular object
                  Using the Changeset vocabulary
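
A sketch of what such a query might look like (this is not the query from the talk). It uses the Changeset vocabulary terms `cs:ChangeSet`, `cs:subjectOfChange`, `cs:createdDate` and `cs:creatorName`; the helper name, the choice of result variables and the example resource are assumptions made for illustration.

```python
CS = "http://purl.org/vocab/changeset/schema#"


def changeset_query(resource_uri: str) -> str:
    """Build a SPARQL query listing the change sets about one resource."""
    # Reject characters that are not allowed inside a SPARQL IRI reference.
    if any(c in resource_uri for c in '<>" {}|\\^`'):
        raise ValueError("not a safe IRI: %r" % resource_uri)
    return (
        f"PREFIX cs: <{CS}>\n"
        "SELECT ?uri ?date ?author\n"
        "WHERE {\n"
        "  ?uri a cs:ChangeSet ;\n"
        f"       cs:subjectOfChange <{resource_uri}> ;\n"
        "       cs:createdDate ?date .\n"
        "  OPTIONAL { ?uri cs:creatorName ?author }\n"
        "}\n"
        "ORDER BY DESC(?date)"
    )


if __name__ == "__main__":
    print(changeset_query("http://dbpedia.org/resource/The_Clash"))
```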




Query registration




Conventions in query registration




           Using conventions to get a well-formatted Atom /
            RSS feed
                  Easier to read in standard aggregators

           Mandatory elements
                  ?uri - the URI of the element(s) to be retrieved
                  ?date - their creation / modification date
                  Can be used to retrieve named graphs if content itself is
                   not dated
           Optional elements
                  ?label - their label
                  ?author - their author
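
To see why fixed variable names help, here is a hedged Python sketch that turns one result row into an Atom entry. The mapping used (?uri to id, ?date to updated, ?label to title, ?author to author/name) is an assumption for illustration; the slides do not spell out how sparqlPuSH builds its feeds.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def row_to_entry(row: dict) -> ET.Element:
    """Turn one result row into an Atom <entry>; ?uri and ?date are required."""
    for required in ("uri", "date"):
        if required not in row:
            raise KeyError(f"missing mandatory variable ?{required}")
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = row["uri"]
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = row["date"]
    # Fall back to the URI when the optional ?label is absent.
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = row.get("label", row["uri"])
    if "author" in row:
        author = ET.SubElement(entry, f"{{{ATOM_NS}}}author")
        ET.SubElement(author, f"{{{ATOM_NS}}}name").text = row["author"]
    return entry


if __name__ == "__main__":
    sample = {"uri": "http://example.org/c/1", "date": "2010-09-23T10:00:00Z"}
    print(ET.tostring(row_to_entry(sample), encoding="unicode"))
```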

Browsing available feeds




           The sparqlPuSH UI
                  Lists available feeds, including timestamp of last update
                  Ability to create feeds from the interface




Notification




Notification on data update




           SPARQL Update support
                  New data has to be HTTP POST-ed to sparqlPuSH
                     –  Then loaded in the underlying RDF store
                     –  Allows *real-time* identification (as opposed to a periodic cron job)


           Identifying relevant changes
                  Applying all registered queries to the updated dataset
           Broadcasting changes
                  Using PubSubHubbub: the hub fans out to subscribers (helps scalability)
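
The three steps above (load the update, re-apply every registered query, ping the hub for feeds that changed) can be sketched as follows. This is an in-memory toy, not the sparqlPuSH code: plain Python callables stand in for SPARQL queries, a set of tuples stands in for the RDF store, and `Registry` is a hypothetical name. Only the form of the publish ping (`hub.mode=publish`, `hub.url=...`) comes from the PubSubHubbub protocol.

```python
from urllib.parse import urlencode


class Registry:
    """Keeps, per feed URL, a query function and the rows already announced."""

    def __init__(self):
        self._queries = {}  # feed URL -> callable(dataset) -> iterable of rows
        self._seen = {}     # feed URL -> set of rows already announced

    def register(self, feed_url, query):
        self._queries[feed_url] = query
        self._seen[feed_url] = set()

    def on_update(self, dataset):
        """Re-run every registered query; return the feeds that have new rows."""
        changed = {}
        for feed_url, query in self._queries.items():
            fresh = set(query(dataset)) - self._seen[feed_url]
            if fresh:
                self._seen[feed_url] |= fresh
                changed[feed_url] = fresh
        return changed


def publish_ping_body(feed_url: str) -> str:
    """Form-encoded body of a PubSubHubbub 'publish' ping for one feed."""
    return urlencode({"hub.mode": "publish", "hub.url": feed_url})
```

After each update, the publisher would POST `publish_ping_body(feed_url)` to the hub for every key returned by `on_update`; the hub then fetches the feed and notifies subscribers.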




Implementation




           Source code (PHP)
                  http://code.google.com/p/sparqlpush/ (BSD license)


           Server
                  Connection to any SPARQL endpoint
                  Additional connector for ARC2 using the ARC2 API
                  Generating RSS or Atom feeds
           Demo client
                  Registering / unregistering queries to remote sparqlPuSH
                   interfaces
                  Receiving updates from registered feeds
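
For a rough idea of what remote registration involves on the client side, the sketch below builds (but does not send) an HTTP POST carrying a SPARQL query. The form field name `query` and the interface URL are hypothetical placeholders; the actual parameters expected by a sparqlPuSH interface should be taken from the project's own documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request


def registration_request(interface_url: str, sparql_query: str) -> Request:
    """Build (but do not send) a POST that registers a query.

    The form field name 'query' is a hypothetical placeholder, not the
    documented sparqlPuSH parameter.
    """
    body = urlencode({"query": sparql_query}).encode("utf-8")
    return Request(
        interface_url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


if __name__ == "__main__":
    req = registration_request(
        "http://example.org/sparqlpush/",  # hypothetical interface URL
        "SELECT ?uri ?date WHERE { ?uri ?p ?date }",
    )
    print(req.get_method(), req.full_url)
```

Sending it would be a matter of passing the request to `urllib.request.urlopen`, after which the client subscribes to the returned feed through its hub.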


Use-cases




           Simple sparqlPuSH use
                  http://vimeo.com/11023983
           Advanced query using DBpedia
                  http://vimeo.com/11023983


           Twarql
                  Twitter data extraction, interlinking and notification
                  http://wiki.knoesis.org/index.php/Twarql
           SMOB
                  In progress: broadcasting SPARQL Update using PuSH
                  http://smob.me


Questions ?




           sparqlPuSH source
                  http://code.google.com/p/sparqlpush/ (BSD license)


           Details on sparqlPuSH
                  SFSW2010 paper (w/ @pablomendes)


           Contact
                  alexandre.passant@deri.org
                  http://apassant.net
                  @terraces



