Semantic Pipes and Semantic Mashups Adrian Giurca
Basically, pipes are about Web data aggregation: you query and aggregate various data sources (typically Atom or RSS feeds) and create a new data source (as a new feed).
Pipes were pioneered by Yahoo! with Yahoo! Pipes, a free online service that lets you remix popular feed types and create data mashups using a visual editor.
Yahoo! Pipes is a data mashup tool rather than a complete Mashup editor.
Pipes are created by using various building blocks: User Inputs, Sources, URL, Operators, Location and so on.
New York Times thru Flickr
Yahoo Finance Stock Quote Watch List
Data from pipes is delivered as widgets, RSS feeds, JSON objects...
The newly created data can be used in other pipes too.
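The pipe idea above (sources, operators, output) can be sketched in a few lines of Python. This is only an illustration, not how Yahoo! Pipes is implemented: the two inline RSS documents, the block names (`source`, `union`, `filter_title`, `sort_newest_first`, `to_json`) and the example.org links are all assumptions made up for the sketch.

```python
import json
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two tiny inline RSS 2.0 documents stand in for remote feeds (hypothetical data).
FEED_A = """<rss version="2.0"><channel><title>A</title>
<item><title>Semantic Web news</title><link>http://example.org/a1</link>
<pubDate>Mon, 02 Mar 2009 10:00:00 GMT</pubDate></item>
<item><title>Sports update</title><link>http://example.org/a2</link>
<pubDate>Tue, 03 Mar 2009 09:00:00 GMT</pubDate></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>B</title>
<item><title>RDF and the Web</title><link>http://example.org/b1</link>
<pubDate>Wed, 04 Mar 2009 08:00:00 GMT</pubDate></item>
</channel></rss>"""


def source(rss_text):
    """Source block: parse an RSS document into a list of item dicts."""
    root = ET.fromstring(rss_text)
    return [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "date": parsedate_to_datetime(item.findtext("pubDate")),
        }
        for item in root.iter("item")
    ]


def union(*item_lists):
    """Operator block: merge several item lists into one."""
    return [item for items in item_lists for item in items]


def filter_title(items, keyword):
    """Operator block: keep items whose title contains the keyword (case-insensitive)."""
    return [i for i in items if keyword.lower() in i["title"].lower()]


def sort_newest_first(items):
    """Operator block: order items by publication date, newest first."""
    return sorted(items, key=lambda i: i["date"], reverse=True)


def to_json(items):
    """Output block: deliver the new data source as a JSON string."""
    return json.dumps(
        [
            {"title": i["title"], "link": i["link"], "date": i["date"].isoformat()}
            for i in items
        ],
        indent=2,
    )


def run_pipe(keyword):
    """Wire the blocks together: sources -> union -> filter -> sort -> output."""
    items = union(source(FEED_A), source(FEED_B))
    return to_json(sort_newest_first(filter_title(items, keyword)))


if __name__ == "__main__":
    print(run_pipe("web"))
```

Because the output is itself a list of feed items, it could be fed into another pipe as a source, which is the reuse mentioned above.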
What is a Semantic Pipe?
A Semantic Pipe is a stored sequence of SPARQL queries over one or more RDF sources. The result of each query is seen as a concrete and independent RDF resource.
Semantic Pipes are close to Yahoo! Pipes as they aggregate data too.
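A rough sketch of "a stored sequence of queries whose results are RDF again" follows. It does not use a real SPARQL engine: triples are plain Python tuples, and `construct` is a toy stand-in for a SPARQL CONSTRUCT query (basic pattern matching only). The prefixed names such as `ex:alice`, and the functions `match`, `construct` and `run_pipe`, are assumptions made up for the illustration.

```python
def match(pattern, triple, bindings):
    """Extend `bindings` so that `pattern` equals `triple`; return None on mismatch."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # Variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def construct(graph, where, template):
    """Toy CONSTRUCT: join all `where` patterns, then instantiate `template`."""
    solutions = [{}]
    for pattern in where:
        extended = []
        for bindings in solutions:
            for triple in graph:
                result = match(pattern, triple, bindings)
                if result is not None:
                    extended.append(result)
        solutions = extended
    return {
        tuple(b.get(term, term) for term in t)
        for b in solutions
        for t in template
    }


def run_pipe(sources, steps):
    """Merge the RDF sources, then apply each stored query to the previous result."""
    graph = set().union(*sources)
    for where, template in steps:
        graph = construct(graph, where, template)  # each result is a graph again
    return graph


# Two hypothetical RDF sources.
PEOPLE = {("ex:alice", "foaf:knows", "ex:bob"), ("ex:carol", "foaf:knows", "ex:dave")}
NAMES = {("ex:bob", "foaf:name", "Bob")}

# The stored sequence of queries, written as (WHERE patterns, CONSTRUCT template).
STEPS = [
    (
        [("?x", "foaf:knows", "?y"), ("?y", "foaf:name", "?n")],
        [("?x", "ex:knowsName", "?n")],
    ),
    (
        [("?x", "ex:knowsName", "?n")],
        [("?x", "rdf:type", "ex:Connected")],
    ),
]

if __name__ == "__main__":
    print(run_pipe([PEOPLE, NAMES], STEPS))
```

Each step returns a set of triples, so any intermediate result could be published and queried on its own, which is the point of treating query results as independent RDF resources.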
RDF Data Sources
U.S. Census : 1 billion triples of U.S. census data
GovTrack : Around 10 million triples containing census data for U.S. locations, brief biographical data for all members of Congress, and mainly data for federal legislation and voting records going back five years.
DBLP Bibliography Database : 15 million triples on computer science bibliographic data.
DBpedia : Describes 1.6M Wikipedia articles
my.opera.com : Almost 2.7 million triples of my.opera.com data.
BBC Backstage from HP Labs: detailed information on the BBC schedules (including digital tv and radio) for the next week, updated every morning.
Work started in 2007 (1) and finalized in 2009 (2)
DERI Pipes is a Tool for building Semantic Pipes
Inspired by Yahoo! Pipes
Defines blocks for accessing data sources (RDF, but also HTML, XML and JSON) and blocks for processing data (mainly RDF data), i.e. operators defined on top of the SPARQL language, as well as user input blocks and conditional blocks
DERI Pipes defines a pipe XML language behind the visual interface
(1) Christian Morbidoni, Axel Polleres, and Giovanni Tummarello. "Who the FOAF knows Alice? A needed step towards Semantic Web Pipes". ISWC 2007 Workshop on New forms of Reasoning for the Semantic Web: Scaleable, Tolerant and Dynamic, Busan, Korea.
(2) Danh Le Phuoc, Axel Polleres, Christian Morbidoni, Manfred Hauswirth, and Giovanni Tummarello. "Rapid semantic web mashup development through semantic web pipes". In Proceedings of the 18th World Wide Web Conference (WWW2009), Madrid, Spain, April 2009.
A Semantic Mashup is a data mashup using RDF(S) as data model and SPARQL services to implement the behavior.
In many cases, Semantic Mashups come as Semantic Pipes, i.e. applications that aggregate RDF(S) data.
However, while basic semantic pipes just aggregate semantic data, semantic mashups may have reasoning capabilities.
This may involve Semantic Web reasoners such as KAON2, Pellet, Racer, Jena...
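To make "reasoning capabilities" concrete, here is a minimal sketch of forward-chaining over two RDFS rules: subclass transitivity (rdfs11) and type inheritance (rdfs9). It is a toy, not the API of KAON2, Pellet, Racer or Jena; the tuple representation, the `ex:` names and the function `rdfs_closure` are assumptions made up for the illustration.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(graph):
    """Apply rdfs9 and rdfs11 repeatedly until no new triple can be derived."""
    inferred = set(graph)
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        for a, b in subclass_pairs:
            # rdfs11: A subClassOf B, B subClassOf C  =>  A subClassOf C
            for c, d in subclass_pairs:
                if b == c:
                    new.add((a, SUBCLASS, d))
            # rdfs9: x type A, A subClassOf B  =>  x type B
            for s, p, o in inferred:
                if p == TYPE and o == a:
                    new.add((s, TYPE, b))
        if new <= inferred:  # fixpoint reached: nothing new was derived
            return inferred
        inferred |= new


# Hypothetical aggregated data: a small schema plus one instance.
DATA = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:alice", TYPE, "ex:Student"),
}

if __name__ == "__main__":
    for triple in sorted(rdfs_closure(DATA) - DATA):
        print(triple)
```

A plain aggregating pipe would return only the three input triples; with the reasoning step, a query for all `ex:Agent` instances also finds `ex:alice`.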
Some Semantic Mashups
Flickcurl (1) - a mashup that provides RDF descriptions for all photos in Flickr
RDF Book Mashup (2) - wraps several book-related APIs
CiteULike (2) - a service for managing and discovering scholarly references
All of the above services are implemented programmatically
(1) Dave Beckett, http://www.dajobe.org, February 2007.
(2) Christian Bizer, Richard Cyganiak, Tobias Gauß. "The RDF Book Mashup: From Web APIs to a Web of Data". 3rd Workshop on Scripting for the Semantic Web (SFSW2007), Innsbruck, Austria, June 2007.
(3) Bastian Quilitz, Ulf Leser. "Querying Distributed RDF Data Sources with SPARQL". The Semantic Web: Research and Applications (2008), pp. 524-538.
Feeds are the most important data source of the Web
All major content creators provide feeds of their data: Reuters, CNN, BBC, Elsevier, ...
The most widely used feed formats are Atom 1.0 and RSS 2.0
Discussion: Feeds and Semantic Data (1)
Most semantic mashups process RDF data...
But the amount of RDF data is very small compared with the whole Web...
Therefore they should also extract RDF data from ordinary data sources
RDF data should be automatically created by extracting semantics from HTML documents
RDFa is a source of RDF data too, but the lack of RDFa annotation tools makes the job difficult for users...
Lack of tools that create RDF data sources?
Discussion: Feeds and Semantic Data (2)
RDF data should be automatically created from standard feeds
We will not debate feed format adoption, but there is existing work on mapping Atom to RDF
Semantic Mashups should process Atom feeds as RDF; this would greatly expand their applications
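As a sketch of what "processing Atom feeds as RDF" could look like, the code below turns Atom entries into N-Triples. The mapping itself is an assumption chosen for illustration (entry id as subject, Dublin Core terms for title, date and author, `rdfs:seeAlso` for the link); it is not the mapping defined by the Atom-to-RDF work mentioned above. The sample feed and the function names are hypothetical too.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
DCT = "http://purl.org/dc/terms/"
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Hypothetical Atom 1.0 feed used as input.
ATOM_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <id>http://example.org/posts/1</id>
    <title>Pipes and "mashups"</title>
    <updated>2009-04-20T10:00:00Z</updated>
    <author><name>Alice</name></author>
    <link href="http://example.org/posts/1.html"/>
  </entry>
</feed>"""


def atom_to_triples(atom_text):
    """Map each Atom entry to (subject, predicate, object, object_is_iri) tuples."""
    root = ET.fromstring(atom_text)
    triples = []
    for entry in root.iter(ATOM + "entry"):
        subject = entry.findtext(ATOM + "id")  # assumes the entry id is an IRI
        literal_fields = [
            (DCT + "title", entry.findtext(ATOM + "title")),
            (DCT + "modified", entry.findtext(ATOM + "updated")),
            (DCT + "creator", entry.findtext(f"{ATOM}author/{ATOM}name")),
        ]
        for predicate, value in literal_fields:
            if value is not None:
                triples.append((subject, predicate, value, False))
        link = entry.find(ATOM + "link")
        if link is not None and link.get("href"):
            triples.append((subject, SEE_ALSO, link.get("href"), True))
    return triples


def escape(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_ntriples(triples):
    """Serialize the tuples as N-Triples, one statement per line."""
    lines = []
    for s, p, o, is_iri in triples:
        obj = f"<{o}>" if is_iri else f'"{escape(o)}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(atom_to_triples(ATOM_FEED)))
```

Once a feed is in this form, it can be loaded into any RDF store and queried with SPARQL next to native RDF sources.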