Twarql Architecture - Streaming Annotated Tweets

Here we describe the architecture of Twarql, and how we enabled streaming tweets that match a given SPARQL query.

  • We work on the stream of tweets from the Twitter API. Incoming tweets are annotated and pushed to a triple store. Annotations include information such as location, tags (taggedwith), author and friends. Use case: get all the tweets related to sports from all the friends of a user. The Formulate Query step builds a query for this use case and assigns it an ID, then configures the social sensor with the query, the ID and the hubURL. The hubURL is returned to the client so that the client knows where to collect results in future.
  • The red boxes represent Web-accessible endpoints (web services). The red flow (arrows) represents the client-side view of the world, i.e. the sequence of actions triggered by the client. The blue flow represents the server-side view.
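The use case in the notes above (formulate a query, assign it an ID, and hand a hubURL back to the client) could be sketched roughly as follows. This is a minimal illustration, not Twarql's actual code: the class name `QueryRegistry`, the `hub_base` URL and the `ex:` predicates in the query text are all assumptions made for the example.

```python
import itertools


class QueryRegistry:
    """Maps topic IDs to registered queries and the hub URL for each."""

    def __init__(self, hub_base="http://hub.example.org/topic/"):
        # hub_base is a placeholder URL, not a real Twarql endpoint.
        self.hub_base = hub_base
        self._counter = itertools.count(1)
        self.topics = {}  # topic id -> (query text, hub URL)

    def register(self, query):
        """Assign an ID and a hub URL to a query and return both."""
        topic_id = f"id{next(self._counter)}"
        hub_url = self.hub_base + topic_id
        self.topics[topic_id] = (query, hub_url)
        return topic_id, hub_url


# Illustrative query text; the ex: predicates are made up for this sketch.
SPORTS_FROM_FRIENDS = """
PREFIX ex: <http://example.org/vocab#>
SELECT ?tweet WHERE {
  ex:someUser ex:friend ?friend .
  ?tweet ex:author ?friend ;
         ex:taggedWith ex:Sports .
}
"""

registry = QueryRegistry()
topic_id, hub_url = registry.register(SPORTS_FROM_FRIENDS)
print(topic_id, hub_url)
```

The client keeps the returned hub URL and uses it later to collect results for that topic.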

Transcript

  • 1. Twarql - Streaming Annotated Tweets
    Pablo N. Mendes, Pavan Kapanipathi, Alex Passant, Amit P. Sheth
    May 17th, 2010
  • 2. Twarql in a Nutshell
    • extract content (entity mentions, hashtags and URLs) from microposts;
    • encode content in a structured format (RDF) using shared vocabularies (FOAF, SIOC, MOAT, etc.);
    • enable structured querying of microposts (SPARQL);
    • enable subscription to a stream of microposts that match a given query (Concept Feeds);
    • enable scalable real-time delivery of streaming data (SparqlPuSH).
    For more information visit: http://wiki.knoesis.org/index.php/Twarql
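The first two steps on this slide (extracting hashtags and URLs, then encoding them as RDF-style statements) can be illustrated with a small sketch. It makes several assumptions: extraction is done with simple regular expressions, triples are plain Python tuples rather than entries in a real RDF store, and the choice of properties (`sioc:content`, `sioc:links_to`, plus a made-up `ex:hashtag`) is illustrative and not necessarily what Twarql uses.

```python
import re

# Property URIs used for illustration. The two SIOC terms are from the SIOC
# core vocabulary; EX_HASHTAG is a placeholder, not a real vocabulary term.
SIOC_CONTENT = "http://rdfs.org/sioc/ns#content"
SIOC_LINKS_TO = "http://rdfs.org/sioc/ns#links_to"
EX_HASHTAG = "http://example.org/vocab#hashtag"

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def annotate(tweet_uri, text):
    """Return (subject, predicate, object) tuples describing one micropost."""
    triples = [(tweet_uri, SIOC_CONTENT, text)]
    for url in URL_RE.findall(text):
        triples.append((tweet_uri, SIOC_LINKS_TO, url))
    # Strip URLs first so that URL fragments ("#...") are not read as hashtags.
    without_urls = URL_RE.sub(" ", text)
    for tag in HASHTAG_RE.findall(without_urls):
        triples.append((tweet_uri, EX_HASHTAG, tag.lower()))
    return triples


for triple in annotate("http://example.org/tweet/1",
                       "Great match tonight #Sports http://example.org/report"):
    print(triple)
```

Entity-mention extraction, which the slide also lists, needs a dictionary or an entity-recognition service and is left out of this sketch.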
  • 3-30. Simple Streaming Activity Diagram (a single diagram built up step by step across slides 3 to 30; the final state, slide 30, is summarized here)
    – Lanes: Web Client, APP SERVER, Twitter API
    – Endpoints: /setup, /register, /subscribe, /publish, /feed, /sparql (shown twice)
    – Stores: topic table (topic id and query, e.g. #id1 with SELECT…, and #id2), RDF store, cache
    – Labels in the client column, in order of appearance: SETUP, FORMULATE QUERY, STREAM(query, #id), REQUEST(#id), UPDATE INTERFACE, POLL, QUERY
    – Other step labels, in order of appearance: LISTEN(tweet), STREAM(tweet), REGISTER(query, new hubURL()), ANNOTATE(tweet), PUBLISH(tweet), FILTER(tweet, for all query), STORE(tweet), UPDATE(aTweet) (shown as UPDATE(aTweet, #id) on slide 25), CACHE(tweet), RELAY, QUERY(#id, query)
    – Data labels on arrows: keywords; query, #id; #id
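The server-side steps named in this diagram (ANNOTATE, FILTER for all queries, CACHE, then client POLL) can be sketched as a single in-process object. This is only an illustration under stated assumptions: `FlatAppServer` and its method names are hypothetical, annotation is reduced to hashtag extraction, and plain Python predicates stand in for the registered SPARQL queries.

```python
from collections import defaultdict, deque


class FlatAppServer:
    """One process that annotates, filters, caches and serves polls."""

    def __init__(self):
        self.queries = {}                # topic id -> predicate standing in for a query
        self.cache = defaultdict(deque)  # topic id -> matching tweets not yet polled

    def register(self, topic_id, matches):
        self.queries[topic_id] = matches

    @staticmethod
    def annotate(text):
        # Stand-in annotation: keep the text and collect lower-cased hashtags.
        hashtags = {w[1:].lower() for w in text.split()
                    if w.startswith("#") and len(w) > 1}
        return {"text": text, "hashtags": hashtags}

    def on_tweet(self, text):
        """ANNOTATE the tweet, FILTER it against every query, CACHE the matches."""
        annotated = self.annotate(text)
        for topic_id, matches in self.queries.items():
            if matches(annotated):
                self.cache[topic_id].append(annotated)

    def poll(self, topic_id):
        """Return and clear everything cached for a topic since the last poll."""
        items = list(self.cache[topic_id])
        self.cache[topic_id].clear()
        return items


server = FlatAppServer()
server.register("id1", lambda tweet: "sports" in tweet["hashtags"])
server.on_tweet("Great game tonight #sports")
server.on_tweet("Lunch time #food")
print(server.poll("id1"))
```

Every client has to call `poll` repeatedly to see new tweets, so in this arrangement each poll adds load to the same process that streams and annotates.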
  • 31. Simple (Flattened) Architecture
    • Was our first implementation, used for testing
      – The App Server performs all functions (Social Sensor, Semantic Publisher). No Distribution Hubs are involved.
    • Overloads the server with streaming, annotating and publishing.
    • Client polling further increases the load on the server.
  • 32. Our Proposed Architecture
    • Separates concerns between modules
      – Social Sensor, Semantic Publisher, Distribution Hub, Application Server
    • Uses the PubSubHubbub protocol to make distribution more scalable
    • Multiple app servers and multiple social sensors allow flexibility in handling multiple streams in a scalable fashion
  • 33-47. PuSH Activity Diagram (a single diagram built up step by step across slides 33 to 47; the state on slide 47 is summarized here)
    – Lanes: Web Client, APP SERVER, DIST. HUB, (SEMANTIC) PUBLISHER, SOCIAL SENSOR, Twitter API
    – Endpoints: /register, /subscribe, /publish, /feed, /sparql (shown twice)
    – Stores: topic table (topic id and Hub URL: #id1 with http://hub1, #id2 with http://hub2), RDF store, cache
    – Labels in the client column, in order of appearance: SETUP, FORMULATE QUERY, STREAM(query, #id), REQUEST(#id)
    – Other step labels, in order of appearance: LISTEN(tweet), STREAM(tweet), REGISTER(query, new hubURL()), PULL(hubURL, req), feed update
    – Data labels on arrows: query, #id; #id, hubURL
  • 48. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed update RDF store cache /sparql /sparql
  • 49. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store cache /sparql /sparql
  • 50. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF storePOLL cache /sparql /sparql
  • 51. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 52. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 53. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 54. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 55. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 56. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURLREQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 57. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 58. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #idPOLL cache /sparql /sparql
  • 59. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(hubURL, rssTweet)POLL cache /sparql /sparql
  • 60. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache /sparql /sparql
  • 61. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache PUSH(tweet, subscriber) /sparql /sparql
  • 62. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache PUSH(tweet, subscriber) /sparql /sparql
  • 63. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache CACHE(tweet) PUSH(tweet, subscriber) /sparql /sparql
  • 64. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache CACHE(tweet) PUSH(tweet, subscriber)QUERY /sparql /sparql
  • 65. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache CACHE(tweet) PUSH(tweet, subscriber) RELAYQUERY /sparql /sparql
  • 66. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache CACHE(tweet) PUSH(tweet, subscriber) RELAYQUERY /sparql /sparql
  • 67. PuSH Activity Diagram Web Client APP SERVER DIST. HUB (SEMANTIC) SOCIAL SENSOR Twitter API PUBLISHERSETUPFORMULATE QUERY keywords LISTEN(tweet) STREAM(tweet) query, #idSTREAM(query, #id) /register /subscribe ANNOTATE(tweet) REGISTER(query, topic id Hub URL new hubURL()) #id1 http://hub1 /publish PUBLISH(tweet) #id2 http://hub2 #id hubURL FILTER(tweet, for all query)REQUEST(#id) PULL(hubURL, req) STORE(tweet) /feed feed updateUPDATE INTERFACE RDF store #id UPDATE(tweet) UPDATE(hubURL, rssTweet)POLL cache QUERY(#id, query) CACHE(tweet) PUSH(tweet, subscriber) RELAYQUERY /sparql /sparql
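The registration and lookup steps in the diagram, REGISTER(query, new hubURL()) followed by REQUEST(#id) returning a hubURL, can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the class name `TopicRegistry`, its method names, and the `http://hubN` URL scheme are hypothetical (the scheme simply mirrors the #id1 / http://hub1 table on the slides), and nothing here reproduces Twarql's actual code.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class TopicRegistry:
    """Hypothetical in-memory stand-in for the 'topic id, Hub URL' table."""
    hubs: dict = field(default_factory=dict)     # topic id -> hub URL
    queries: dict = field(default_factory=dict)  # topic id -> SPARQL query text
    _next: count = field(default_factory=lambda: count(1))

    def register(self, query: str) -> str:
        """REGISTER(query, new hubURL()): file the query under a fresh topic id."""
        n = next(self._next)
        topic_id = f"#id{n}"
        self.queries[topic_id] = query
        self.hubs[topic_id] = f"http://hub{n}"  # assumed URL scheme, illustration only
        return topic_id

    def request(self, topic_id: str) -> str:
        """REQUEST(#id): return the hub URL the client should pull or subscribe to."""
        return self.hubs[topic_id]


registry = TopicRegistry()
topic = registry.register("SELECT ?tweet WHERE { ?tweet ?p ?o }")
hub_url = registry.request(topic)
print(topic, hub_url)
```

The client only ever needs the topic id and the hub URL it gets back; the query text stays on the server side, which matches the slide notes saying the hubURL is returned so the client knows where to collect updates.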
  • 68. Algorithms
  • 69. Algorithm – SOCIAL SENSOR
      • SOCIAL SENSOR
          while (tweet = LISTEN(stream)) {
            aTweet = ANNOTATE(tweet)
            PUBLISH(aTweet)
          }
  • 70. Algorithm – SEMANTIC PUBLISHER
      • SEMANTIC PUBLISHER
          function PUBLISH(aTweet) {
            for each registered feed {
              if aTweet matches feed.query
                UPDATE(feed.hub)
            }
          }
      • Pavan’s “matches” implementation
          sparul = SERIALIZE(aTweet)
          rdf.STORE(sparul)
          for each query {
            match = rdf.QUERY(query)
            if (match) UPDATE(hub)
          }
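The two algorithms above can be sketched together in Python. This is a minimal sketch under loud assumptions, not the Twarql implementation: `Feed`, `Publisher`, `annotate` and `social_sensor` are hypothetical names, the annotation step only extracts hashtags, each "query" is a plain Python predicate standing in for a SPARQL query, and UPDATE(hub) is simulated by appending to a list. It follows the first variant (match each tweet against each registered feed) and does not model the store-then-query variant.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class Feed:
    """A registered query plus the hub to notify (hypothetical structure)."""
    hub_url: str
    query: Callable[[dict], bool]  # stand-in for a SPARQL query


@dataclass
class Publisher:
    feeds: list = field(default_factory=list)
    updates: list = field(default_factory=list)  # records simulated UPDATE(hub) calls

    def publish(self, a_tweet: dict) -> None:
        """PUBLISH(aTweet): notify the hub of every feed whose query matches."""
        for feed in self.feeds:
            if feed.query(a_tweet):
                self.updates.append((feed.hub_url, a_tweet["text"]))


def annotate(tweet: str) -> dict:
    """ANNOTATE(tweet): toy annotation that only extracts hashtags."""
    tags = {w.lstrip("#").lower() for w in tweet.split() if w.startswith("#")}
    return {"text": tweet, "tags": tags}


def social_sensor(stream: Iterable[str], publisher: Publisher) -> None:
    """while (tweet = LISTEN(stream)): annotate each tweet, then publish it."""
    for tweet in stream:
        publisher.publish(annotate(tweet))


publisher = Publisher(feeds=[
    Feed("http://hub1", lambda t: "sports" in t["tags"]),
    Feed("http://hub2", lambda t: "music" in t["tags"]),
])
social_sensor(["Great match today #sports", "Nothing tagged here"], publisher)
print(publisher.updates)
```

Only the first tweet reaches a hub, because it is the only one whose annotations satisfy a registered query; the untagged tweet is annotated and then dropped by every feed.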
  • 71. Acknowledgements
      • Visit our project page: http://wiki.knoesis.org/index.php/Twarql
      • Thanks to Michael Cooney for the tedious work of making the animations SlideShare-friendly.
      • Thanks to Pavan, Hemant, Pramod and Delroy for questions and suggestions at the March Hackathon, where I first described this version of the architecture.