From research to business: the Web of linked data
Invited talk at the Enterprise X.0/Econom Workshops @ BIS 2009 (April 29th, 2009)

Presentation Transcript

  • From research to business: the Web of linked data
    Irene Celino, Semantic Web Practice, CEFRIEL (ICT Institute, Politecnico di Milano)
    email: irene.celino@cefriel.it, web: http://swa.cefriel.it
    Enterprise X.0/Econom Workshops @ BIS 2009, Poznan, 29th April 2009. © CEFRIEL 2009
  • Agenda
    The problem of integration
    Web as a platform
    Linked data
    How do we produce linked data today? The case of Service-Finder
    How do we manage linked data today? The case of Urban Computing in LarKC
    What’s next?
    What’s already going on
    Business view
    Scientific & technical view
  • The problem of integration
    When do we have an integration problem?
    Very large amounts of data that grow and evolve continuously: problem of scale
    Numerous and different types of data (documents, media, email, Web results, contacts, etc.): problem of data heterogeneity
    Numerous and different information systems (DB, legacy systems, ERP, etc.): problem of system heterogeneity
  • When 1 + 1 > 2?
    Data integration always gives an added value: getting a global high-level view, sharing knowledge, business opportunities, Business Intelligence
    Still there is the technological problem: how to reconcile data heterogeneity?
    Who took advantage of integration?
    Can the (Semantic) Web be of help?
  • Lesson learned from Web 2.0
    Participation politics and “wisdom of the crowds”
    Great success of mash-ups
    Mash-ups: applications made up of light integration of artifacts provided by third parties (often API or REST services)
    New integration paradigm for application development
    Publication and access via Web
    Storing our information on the Web is becoming easier and easier
    Accessing our information on the Web (e.g. by retrieving it with search engines) is becoming more and more frequent
  • The Web as integration platform
    What if we integrate on the Web? Web as a platform
    Data prosumer (producer + consumer)
    “Web of Data”: from the current “Web of Documents” to a Web of data
    Not only information retrieval, but also data retrieval
    Exposing your data on the Web: converting/translating to a suitable format, or “wrapping” the data source
    Tools: Triplify, Virtuoso, D2R, SPASQL, R2O, Relational.OWL, Talis, DartGrid, SPOON, SquirrelRDF
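As an illustration of the "converting/translating to a suitable format" option, here is a minimal sketch that turns relational rows into N-Triples lines. It is not how any of the listed tools works internally. The namespace `http://example.org/`, the table name and the column names are assumptions made up for the example.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, not from the talk


def escape_literal(value):
    """Escape a string so it can be used as an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def row_to_ntriples(table, key, row):
    """Turn one relational row into N-Triples lines, one per non-key column."""
    subject = f"<{BASE}{quote(table)}/{quote(str(row[key]))}>"
    triples = []
    for column, value in row.items():
        if column == key or value is None:
            continue  # the key becomes the subject URI; NULLs yield no triple
        predicate = f"<{BASE}{quote(table)}#{quote(column)}>"
        triples.append(f'{subject} {predicate} "{escape_literal(str(value))}" .')
    return triples


# Hypothetical row from a "monument" table
for line in row_to_ntriples("monument", "id", {"id": 1, "name": "Duomo", "city": "Milano"}):
    print(line)
```

Each row becomes a resource whose URI is derived from the table name and primary key, and each column becomes a property. A real wrapper would also map foreign keys to links between resources, which is where the "linked" part comes from.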
  • Linked data and data cloud
    Linked Data: the realization of the “Web of Data” (and of the Semantic Web). Tim Berners-Lee: http://www.w3.org/DesignIssues/LinkedData
    Linking Open Data Initiative: a community publishing and linking data on the Web, http://linkeddata.org/
    Data cloud: today everybody talks about cloud computing. However, often it is not only a computation or storage issue, but it is also about data and knowledge management
  • Challenges for linked data
    Automatic linked data creation and linkage: automatic generation of linked data and smart mechanisms to identify “contact points” between different data sources and to seamlessly link them
    Distributed querying: querying distributed data over different Web sources regardless of the “physical position” of the data and getting aggregated results
    Distributed reasoning: applying inference techniques to distributed data, preserving consistency and correctness of the reasoning
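To make the idea of "contact points" concrete, here is a deliberately naive sketch: two sources are linked with `owl:sameAs` whenever their labels match after normalization. Real linkage needs far more than label equality. The URIs and labels are invented for the example, and the function names are assumptions.

```python
import unicodedata

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(label):
    """Lower-case, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def link_sources(source_a, source_b):
    """Each source maps URI -> label. Return (uri_a, owl:sameAs, uri_b) triples
    for every pair of resources whose normalized labels are equal."""
    index = {}
    for uri, label in source_b.items():
        index.setdefault(normalize(label), []).append(uri)
    links = []
    for uri, label in source_a.items():
        for other in index.get(normalize(label), []):
            links.append((uri, OWL_SAME_AS, other))
    return links


# Hypothetical data from two sources describing places in Milano
a = {"http://a.example/1": "Città  Studi"}
b = {"http://b.example/x": "citta studi", "http://b.example/y": "Duomo"}
print(link_sources(a, b))
```

Indexing one source by normalized label keeps the comparison linear in the size of the two sources instead of comparing every pair.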
  • Service-Finder, http://demo.service-finder.eu
    There’s a lot of information already on the Web: how can we turn it into linked data?
  • Context: SOA onto the Web
    Service Oriented Architectures (SOAs) along with Web Services technologies are widely seen as the most promising foundation for realizing service interchange in business-to-business settings. However, it is envisioned that SOAs and Web Services will increasingly move out of these settings and out onto the Web.
    Web size: Google, 1,000,000,000,000 URIs (08/2008); NetCraft, 62,000,000 active hosts
    Service Web size: Google, filetype:asmx inurl:wsdl (818); Service-Finder, > 25,000
    [ http://developer.ebay.com/ ] [ http://aws.amazon.com/ ]
  • The rise and fall of public UDDI registries
    One of the essential building blocks for creating applications that utilize the vast quantities of services available on the Web is making it easier to discover and select the right services
    UDDI was initially proposed as a component of the Web Services usage process, enabling registering and discovering services, but in the end UDDI did not reach its expected potential
    The critical problem in this new Web-oriented environment is one of scale, because services appear, disappear and change at a rate much higher than in business-to-business settings
    UDDI Business Registry Shutdown. "With the approval of UDDI v3.02 as an OASIS Standard in 2005, and the momentum UDDI has achieved in market adoption, IBM, Microsoft and SAP have evaluated the status of the UDDI Business Registry and determined that the goals for the project have been achieved. Given this, the UDDI Business Registry will be discontinued as of 12 January 2006." [from “Registering for UDDI” 2005-12-17] [see http://xml.coverpages.org/uddi.html]
  • Pitfalls of public UDDI registries
    1. UDDI is centered around programmatic access to the registry, and only a few, mostly technically focused, user interfaces are available.
    2. The information in public UDDI registries was often outdated. The value of a service entry in a public UDDI registry is minimal if the service itself does not exist anymore.
    3. There are no means for community feedback. In practice, the only way to provide feedback is to contact the provider at the email address listed in the service description.
    4. A WSDL definition and a short description are not sufficient for a service consumer to select a service. To decide whether a service is applicable, the service consumer needs to become familiar with pricing, terms and conditions, and service level agreements, to name just a few.
  • Overcoming UDDI limitations
    1. Easy-to-use GUI: early adopters of Web Services technology, who learn about it for the first time, should be able to start exploring it in a few simple steps
    2. Search engine style: the Web is unpredictable and services can appear and disappear (the same as websites), but a mechanism (periodic crawling and availability checks) can eliminate the services that are no longer available
    3. Architecture of participation: learn from Web 2.0 (e.g., wikis, blogs, etc.) in enabling community contribution
    4. More useful info: include all the information a user requires to decide whether a service is applicable, e.g., pricing, terms and conditions, service level agreements, etc.
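The "periodic crawling and availability check" idea in point 2 can be sketched as a pruning pass over the index. This is an illustration under assumptions, not Service-Finder's actual crawler: `ServiceEntry`, `prune_unavailable` and the three-strikes limit are invented here, and the availability probe is injected so that no particular HTTP client is implied.

```python
from dataclasses import dataclass


@dataclass
class ServiceEntry:
    wsdl_url: str
    failed_checks: int = 0  # consecutive failed availability checks


def prune_unavailable(entries, is_available, max_failures=3):
    """Run one availability pass and keep only entries below the failure limit.

    `is_available` is any callable taking a URL and returning True or False.
    """
    kept = []
    for entry in entries:
        if is_available(entry.wsdl_url):
            entry.failed_checks = 0  # a success resets the counter
        else:
            entry.failed_checks += 1
        if entry.failed_checks < max_failures:
            kept.append(entry)
    return kept


# Usage with a stub probe; a real one would issue an HTTP request
index = [ServiceEntry("http://up.example/s?wsdl"),
         ServiceEntry("http://down.example/s?wsdl", failed_checks=2)]
index = prune_unavailable(index, lambda url: "up.example" in url)
print([e.wsdl_url for e in index])
```

Requiring several consecutive failures before dropping an entry avoids losing services that are only temporarily unreachable.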
  • Project idea
    Service-Finder aims at developing a platform for service discovery in which Web Services are embedded in a Web 2.0 environment
    http://demo.service-finder.eu
    Realizing Web Service Discovery at Web Scale, combining smart-machine and smart-data:
      Automatic Semantic Annotation
      Semantic Search: Conceptual Indexing, Semantic Matching
      Web 2.0: User clustering, User-Resource correlation
      Semantics: Knowledge Representation & Reasoning
      Semantic Web Services: as a means to realize Service Oriented Architecture
      Web Services: as a basic tool to implement a Service Oriented Architecture
  • Key objectives
    Create a Semantic Search Engine for Web Services
      Aggregates information from heterogeneous sources: WSDL, wikis, blogs, and also users’ feedback and behaviour
    Create a Web Service Crawler to identify Web Services and their relevant information
    Automatically generate Semantic Service Descriptions by analyzing heterogeneous sources
    Allow efficient and effective search of collected and generated data
    Provide a Web 2.0 portal
      To support users in searching and browsing for Web Services
      To give recommendations to users
      To track user behaviour for improving accuracy of service search and user recommendations
  • Realizing Service-Finder
    [Timeline: Jan 2008, June 2008, Dec 2008, Today, Dec 2009]
  • Use cases for Service-Finder
    To gather requirements we imagined several use cases:
    A system administrator at a bank who is looking for an SMS messaging service that sends him an SMS whenever the bank's on-line payment system fails
    A business and technology consultant working on an e-health project that needs to let general practitioners send and receive faxes directly from their patient record application using an on-line service
    A web developer who, after using a service listed on Service-Finder, decides to edit the information on the portal in order to improve it for other community users
  • Requirements for Service-Finder
    From those use cases we identified more than 60 requirements, and we grouped similar requirements into three main categories:
    Search related: search for text, search for tag, search for concept, disambiguation, facet browsing, ranking, sorting, comparing, etc.
    Web Service information related:
      Service details: interface, how the service can be used, its payment modalities, its terms and clauses, user-added information such as ratings, comments and tags, measured values of service levels such as availability (uptime) or performance (response time), and the service level declared by the provider
      Provider info: name of the provider and its references, user-added information such as ratings, comments and tags
    User Community related: rating, commenting, tagging, editing, writing wiki entries, registration, recommendations
  • Architecture and Components
  • Key innovations of Service-Finder
    Research Activities
      Automatic Service Annotation: to automatically create Web Service descriptions by analyzing WSDL and related information, coping with contradictions and using a community process to verify results
      User and Service Clustering: to investigate and implement techniques for clustering users according to their behaviours, and for clustering services according to their usage by users belonging to the same clusters
    Research and Engineering Activities
      Conceptual Indexing and Matching: to apply semantic technologies in the Web Service discovery domain, and to adapt them to the new forms of input descriptions (automatic annotations, clusters, contexts)
    Integration Activities
      Service-Finder Portal: to provide a Web 2.0 portal demonstrating the developed technologies and fostering community participation
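The slide does not say which clustering technique the project uses. As one simple possibility for "clustering users according to their behaviours", the sketch below groups users by Jaccard similarity of the sets of services they used. The greedy single-pass strategy, the threshold and all names are assumptions chosen for brevity.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_users(usage, threshold=0.5):
    """Greedy single-pass clustering.

    `usage` maps user -> set of services used. A user joins the first cluster
    whose seed (the first member's service set) is similar enough; otherwise
    the user starts a new cluster.
    """
    clusters = []  # list of (seed_services, members)
    for user in sorted(usage):  # sorted for a deterministic result
        services = usage[user]
        for seed, members in clusters:
            if jaccard(seed, services) >= threshold:
                members.append(user)
                break
        else:
            clusters.append((services, [user]))
    return [members for _, members in clusters]


# Hypothetical usage logs
usage = {"ann": {"sms", "fax"}, "bob": {"sms", "fax", "maps"}, "cy": {"weather"}}
print(cluster_users(usage))
```

Services could then be clustered in the same way, by comparing the sets of user clusters that use them.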
  • Beyond state of the art
    Feature: architecture for lightweight semantic service discovery. State of the art: approaches based on a registration process or an editorial team. Improvement: enables service discovery to scale with the upcoming increase of publicly available services
    Feature: largest and most accurate set of publicly available services. State of the art: specialized portals containing only a subset of services. Improvement: focused crawler able to identify services and related information
    Feature: automatic metadata creation for Web Services. State of the art: innovative; under-researched. Improvement: metadata generation from Web 2.0 data and services
    Feature: integration of formal and informal (textual) knowledge. State of the art: indexed textual descriptions. Improvement: hybrid match-making algorithm
    Feature: automatic creation of both user and service clusters. State of the art: only general-purpose clustering techniques exist. Improvement: specialized clustering algorithms that jointly cluster users and services
    Feature: innovative interface that combines Web 2.0 features and service-related features. State of the art: current Web 2.0 portals do not include semantic metadata. Improvement: techniques that enable handling of semantic metadata in Web 2.0 portals
  • Expected Impacts
    Service-Finder provides core mechanisms to cope with changing environments:
      It uses Web principles such as openness and robustness;
      It takes explicit and implicit user interaction for construction, improvement and validation of rich service descriptions; and
      It exploits Semantic Web technologies as a means to organize internally the data on available services.
    It simplifies the service publishing process by removing the burden of any registration and brings service discovery even to non-technical persons.
    Publishers increase their productivity, by being able to provide complex services without the need to register them explicitly.
    Creators become able to design more communicative forms of content by integrating third party services.
    Organizations can automate their processes by quickly finding adequate services.
  • Exploitation Prospects
    The results of the Service-Finder project have the potential to revolutionize this market and to outperform existing solutions
    Using Service-Finder for public services
      Unique chance: the market for public services is growing (xignite, cdyne, …)
      Missing alternatives: UDDI (shut down in 2006), Google (no reliable filter, no additional information), portals (rely on an editorial process, <= 400 services)
    Service-Finder can also be applied within organizations
      The number of services increases in organizations
      As on the Internet, repositories in big companies can quickly become outdated
      IT managers like minimally invasive technology
  • So what? Service-Finder and linked data
    Even if I didn’t explicitly talk about linked data, that is exactly the result of Service-Finder
    We take information about services from the Web, we translate it into structured information describing services with respect to domain-specific ontologies, and we give this information back to the community, which can further enrich it
    Is this linked data? Not yet, but:
      RDFa annotation in SF portal pages coming soon
      Services to query the knowledge base coming soon
      Possibly a “dump” of the SF knowledge base could be easily published on the Web as linked data
  • Urban Computing in LarKC, http://wiki.larkc.eu/UrbanComputing
    There are lots of data sources about cities on the Web: how can we query and reason on them?
  • Context: Cities are alive
    Cities come to life, grow, evolve like living beings
    The state of a city changes continuously, influenced by a lot of factors:
      human factors: people moving in the city or extending it
      natural factors: precipitation or climate change
    [source http://www.citysense.com]
  • Today’s City Challenges
    Our cities face many challenges:
    How can we redevelop existing neighbourhoods and business districts to improve the quality of life?
    How can we create more choices in housing, accommodating diverse lifestyles and all income levels?
    How can we reduce traffic congestion yet stay connected?
    How can we include citizens in planning their communities rather than limiting input to only those affected by the next project?
    How can we fund schools, bridges, roads, and clean water while meeting short-term costs of increased security?
    [source http://www.uli.org/]
  • Urban Computing to address challenges
  • Urban Computing
    A definition: the integration of computing, sensing, and actuation technologies into everyday urban settings and lifestyles. [source: IEEE Pervasive Computing, July-September 2007 (Vol. 6, No. 3)]
    Urban settings include, for example, streets, squares, pubs, shops, buses, and cafés: any space in the semipublic realms of our towns and cities
    Only in the last few years have researchers paid much attention to technologies in these spaces
    Pervasive computing has largely been applied either in relatively homogeneous rural areas, where researchers have added sensors in places such as forests, vineyards, and glaciers, or in small-scale, well-defined patches of the built environment such as smart houses or rooms
    Urban settings are challenging for experimentation and deployment, and they remain little explored
  • Availability of Data
    Some years ago, due to the lack of data, solving Urban Computing problems with ICT looked like a sci-fi idea
    Nowadays, a large amount of the required information can be made available on the Web at almost no cost. We have collected more than 50 sources of data:
      maps with streets and paths (Google Maps, Yahoo! Maps…)
      scheduled events (EVDB, Upcoming…)
      multimedia data with location information (Flickr…)
      relevant places (schools, bus stops, airports...)
      traffic information (accidents, problems of public transportation...)
      city life (job ads, pollution, health care...)
    We are running a survey (please contribute), see http://wiki.larkc.eu/UrbanComputing/ShowUsABetterWay and http://wiki.larkc.eu/UrbanComputing/OtherDataSources
  • Are Data Mashups the solution?
    [source: http://pipes.yahoo.com/pipes/ ] [source: http://www.popfly.com/ ] [source: http://editor.googlemashups.com ] [source: http://openkapow.com/ ]
    IBM Lotus Mashups [source: http://www-01.ibm.com/software/lotus/products/mashups/ ]
  • Data Mashups offer powerful visualizations
    Maps: http://maps.google.it/ and http://maps.yahoo.com/
    Google Charts API: http://code.google.com/apis/chart/
    MIT Simile Timeline & Timeplot: http://simile.mit.edu/timeline/ and http://simile.mit.edu/timeplot/
  • Data Mashups offer simple programming abstractions
  • Not everything boils down to plumbing
  • The LarKC project
    Visit http://www.larkc.eu!
    [Source: Fensel, D., van Harmelen, F.: Unifying reasoning and search to web scale. IEEE Internet Computing 11(2) (2007)]
  • Sustainable mobility as an example
    Urban Computing raises a set of different issues, from technological to social ones.
    Our experience in the field makes us believe that sustainable mobility is an exemplary case from which we can elicit generalizable requirements.
    Mobility demand has been growing steadily for decades and it will continue in the future.
    For many years, the primary way of dealing with this increasing demand has been to increase the capacity of the roadway network, by building new roads or adding new lanes to existing ones.
    However, financial and ecological considerations are posing increasingly severe constraints on this process.
    Hence, there is a need for additional intelligent approaches designed to meet the demand while more efficiently utilizing the existing infrastructure and resources.
    [Sidebar: the five challenges our cities face, as listed in the earlier slide]
  • A Challenging Use Case 1/2 (planning)
    Actors:
      Carlo: a citizen living in Varese. The next day, he has to be at the Lombardy Region premises in Milano at 11.00.
      UCS: a fictitious Urban Computing System of the Milano area
    Ways to Milano: private car, FS railways, Le Nord railways
    [Maps of Varese and Milano: ©2009 Google, Map Data ©2009 Teleatlas]
  • A Challenging Use Case 2/2 (traveling)
    Actors:
      Carlo: a citizen living in Varese. The next day, he has to be at the Lombardy Region premises in Milano at 11.00.
      UCS: a fictitious Urban Computing System of the Milano area
    Ways to Milano: private car, FS railways, Le Nord railways
    [Maps of Varese and Milano, with "M" markers: ©2009 Google, Map Data ©2009 Teleatlas]
  • Requirements for LarKC
    Urban Computing (and Mobility Management) encompasses sensing, actuation and computing requirements.
    Much previous work in the area of Pervasive and Ubiquitous Computing investigated requirements in sensing, actuation, and several aspects of computation (from hardware to software, from networks to devices)
    In this work we focus on reasoning requirements for LarKC, which are also of general interest for the entire community working on the complex relationship of the Internet with space, places, people and content.
    Hereafter we exemplify how to cope with:
      representational, reasoning, and defaults heterogeneity
      scale
      time-dependency
      noisy, uncertain and inconsistent data
  • Coping with representational heterogeneity
    It is an obvious requirement:
      data always come in different formats (syntactic and structural heterogeneity)
      legacy data not in semantic formats will always exist!
      merging and aligning ontologies is a structural problem of knowledge engineering, and it must always be considered when developing an application of semantic technologies.
  • Coping with reasoning heterogeneity
    It means the system allows for multiple reasoning paradigms, e.g.:
      precise and consistent inference for telling that at a given junction all vehicles, except public transportation ones, must go straight
      approximate reasoning when calculating the probability of a traffic jam given the current traffic conditions and the past history
    [ source http://senseable.mit.edu/ ]
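The two paradigms on this slide can be contrasted in a few lines. This is only a toy sketch. The rule encoding, the condition labels and the Laplace-smoothed frequency estimate are assumptions made for illustration, not techniques named in the talk.

```python
def must_go_straight(vehicle_kind):
    """Precise rule: every vehicle except public transport must go straight."""
    return vehicle_kind != "public_transport"


def jam_probability(history, condition):
    """Approximate estimate of P(jam | condition) from past observations.

    `history` is a list of (condition, jam_observed) pairs. The result is a
    Laplace-smoothed relative frequency, so unseen conditions give 0.5.
    """
    outcomes = [jam for cond, jam in history if cond == condition]
    return (sum(outcomes) + 1) / (len(outcomes) + 2)


# Hypothetical past observations
history = [("rain", True), ("rain", True), ("rain", False), ("dry", False)]
print(must_go_straight("car"))           # exact answer, always the same
print(jam_probability(history, "rain"))  # estimate, changes as history grows
```

The first function gives an answer that is either right or wrong. The second gives an answer whose quality depends on how much history is available, which is why a single reasoner is a poor fit for both.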
  • Coping with defaults heterogeneity 1/2
    Open World Assumption vs. Closed World Assumption
    While for an entire city we cannot assume complete knowledge, for the timetable of a bus station we can
    [source: http://gizmodo.com/photogallery/trafficsky/1003143552 ]
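The difference between the two assumptions shows up when a fact is missing from the data. In the sketch below, a missing fact is false under the closed world assumption and unknown under the open world assumption. The triples and the function name are invented for the example.

```python
def holds(fact, known_facts, closed_world):
    """Return True, False, or None (unknown).

    Closed world: whatever is not stated is false.
    Open world: whatever is not stated is simply unknown.
    """
    if fact in known_facts:
        return True
    return False if closed_world else None


# A bus timetable is complete: a departure that is not listed does not exist
timetable = {("bus 90", "departs", "10:05")}
print(holds(("bus 90", "departs", "10:20"), timetable, closed_world=True))

# Knowledge about the whole city is incomplete: a missing fact tells us nothing
city = {("via Manzoni", "closed", "today")}
print(holds(("via Torino", "closed", "today"), city, closed_world=False))
```

A system that mixes both kinds of sources has to track the assumption for each source rather than fixing it once for the whole knowledge base.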
  • Coping with defaults heterogeneity 2/2
    Unique Name Assumption
    A square with several stations for buses and subway can be considered a single point for multimodal travel planning, but not when the problem is giving directions in that square to a pedestrian
    [Map and imagery: ©2009 Google, ©2009 Teleatlas]
  • Coping with scale
    The advent of Pervasive Computing and Web 2.0 technologies led to a constantly growing amount of data about urban environments
    Although we encounter large-scale data that are not manageable as a whole, it does not necessarily mean that we have to deal with all of the data simultaneously.
    Usually, only a very limited amount of data is relevant for a single query or processing step in a specific application. For example, when Carlo is driving to Milano:
      only part of the Milano map data is relevant
      the local parking information may become relevant when bad weather is predicted, given the known relation between bad weather conditions and re-planning of the destination parking lot
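One simple way to keep only the relevant part of the data is a geographic pre-filter. The sketch below keeps the items that fall inside a bounding box around the trip, widened by a margin. The coordinates are approximate, the item names are invented, and the bounding-box strategy itself is an assumption rather than the selection method used in LarKC.

```python
def within_corridor(point, start, end, margin_deg=0.05):
    """True if `point` lies in the bounding box of start/end, widened by a margin.

    Points are (latitude, longitude) pairs in decimal degrees.
    """
    lat, lon = point
    lat_min, lat_max = sorted((start[0], end[0]))
    lon_min, lon_max = sorted((start[1], end[1]))
    return (lat_min - margin_deg <= lat <= lat_max + margin_deg
            and lon_min - margin_deg <= lon <= lon_max + margin_deg)


def select_relevant(items, start, end, margin_deg=0.05):
    """Keep the names of the items located inside the trip corridor."""
    return [name for name, point in items
            if within_corridor(point, start, end, margin_deg)]


VARESE = (45.82, 8.83)  # approximate coordinates
MILANO = (45.46, 9.19)  # approximate coordinates
items = [("parking near the destination", (45.47, 9.18)),
         ("parking in another town", (45.70, 9.67))]
print(select_relevant(items, VARESE, MILANO))
```

Only the selected items then need to be handed to the more expensive reasoning steps.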
  • Coping with time-dependency
    Knowledge and data can change over time. For instance, in Urban Computing the names of streets, landmarks, kinds of events, etc. change very slowly, whereas the number of cars that go through a traffic detector in five minutes changes very fast.
    This means that the system must have the notion of "observation period", defined as the period during which the system is subject to querying.
    Moreover the system, within a given observation period, must consider the following four different types of knowledge and data:
      Invariable knowledge
      Invariable data
      Periodically changing data, which change according to a temporal law (pure periodic or probabilistic)
      Event driven changing data, which are updated as a consequence of some external event
  • Invariable knowledge and data
    Invariable knowledge includes:
      obvious terminological knowledge, such as: an address is made up of a street name, a civic number, a city name and a ZIP code
      less obvious nomological knowledge that describes how the world is expected
        to be, e.g., given traffic lights are switched off or certain streets are closed during the night
        to evolve, e.g., traffic jams appear more often when it rains or when important sport events take place
    Invariable data do not change in the observation period, e.g. the names and lengths of the roads.
    [Imagery: ©2009 Google, ©2009 Teleatlas]
  • Changing data
    Periodically changing data change according to a temporal law that can be:
      a pure periodic law, e.g. every night at 10pm Milano overpasses close
      a probabilistic law, e.g. traffic jams appear in the west side of Milano due to bad weather or when San Siro stadium hosts a soccer match
    Event driven changing data are updated as a consequence of some external event. They can be further characterized by the mean time between changes:
      Slow, e.g. roads closed for scheduled works
      Medium, e.g. roads closed for accidents or congestion due to traffic
      Fast, e.g. the intensity of traffic for each street in a city
    [Imagery: ©2009 Google, ©2009 Teleatlas]
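The four types of knowledge and data suggest different refresh policies for whatever the system has cached. The sketch below encodes the classification as an enumeration and decides whether a cached copy should be refetched. The class names, the `expected_change_interval_s` field and the example intervals are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    INVARIABLE_KNOWLEDGE = "invariable knowledge"
    INVARIABLE_DATA = "invariable data"
    PERIODIC = "periodically changing data"
    EVENT_DRIVEN = "event driven changing data"


@dataclass
class Source:
    name: str
    kind: Kind
    # Period of the temporal law, or mean time between changes (seconds)
    expected_change_interval_s: Optional[float] = None


def needs_refresh(source, age_s, event_seen=False):
    """Decide whether a cached copy that is `age_s` seconds old is stale."""
    if source.kind in (Kind.INVARIABLE_KNOWLEDGE, Kind.INVARIABLE_DATA):
        return False  # does not change within the observation period
    if source.kind is Kind.EVENT_DRIVEN and event_seen:
        return True   # an external event invalidates the copy immediately
    if source.expected_change_interval_s is None:
        return True   # no estimate available: be conservative
    return age_s >= source.expected_change_interval_s


road_names = Source("road names", Kind.INVARIABLE_DATA)
traffic = Source("traffic intensity", Kind.EVENT_DRIVEN, expected_change_interval_s=300)
print(needs_refresh(road_names, age_s=86400))
print(needs_refresh(traffic, age_s=600))
```

Treating the slow, medium and fast categories as different values of one interval keeps the policy in a single function.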
  • Coping with noisy, uncertain and inconsistent data
    Traffic data are a very good example of such data.
    Different sensors observing the same road area give apparently inconsistent information:
      a traffic camera may say that the road is empty, whereas
      an inductive loop traffic detector may tell that 100 vehicles went over it
    The two pieces of information may be coherent if one considers that a traffic camera transmits an image per second with a delay of 15-30 seconds, whereas a traffic detector reports the number of vehicles that went over it in 5 minutes, and the information may arrive 5-10 minutes later.
    Moreover, a single datum coming from a sensor at a given moment may have no certain meaning. When an inductive loop traffic detector tells you that 0 cars went over it:
      Is the road empty? Is the traffic completely stuck? Did somebody park a car above the sensor? Is the sensor broken?
    Combining multiple pieces of information from multiple sensors in a given time window can be the only reasonable way to reduce the uncertainty.
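The camera versus loop detector example can be checked mechanically. Each reading is mapped back to the time interval it actually describes, using its transmission delay and the length of time it summarizes. The `Reading` class and the concrete numbers are assumptions picked from within the ranges given on the slide.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    arrival_s: float  # when the reading reached the system
    delay_s: float    # transmission delay
    span_s: float     # length of the interval the reading summarizes
    value: float

    def interval(self):
        """Interval (start, end) of real time that the reading describes."""
        end = self.arrival_s - self.delay_s
        return (end - self.span_s, end)


def overlap_s(a, b):
    """Seconds of real time described by both readings (0.0 if disjoint)."""
    (a0, a1), (b0, b1) = a.interval(), b.interval()
    return max(0.0, min(a1, b1) - max(a0, b0))


# Both readings arrive at t = 1000 s
camera = Reading("camera", arrival_s=1000, delay_s=20, span_s=1, value=0)
loop = Reading("inductive loop", arrival_s=1000, delay_s=300, span_s=300, value=100)

print(camera.interval(), loop.interval())
print(overlap_s(camera, loop))
```

With these numbers the camera image describes seconds 979 to 980, while the loop count describes seconds 400 to 700. The intervals do not overlap, so "empty road" and "100 vehicles" do not contradict each other.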
  • Towards requirements satisfaction in LarKC. The Large Knowledge Collider: a platform for infinitely scalable reasoning on the data-web. [Figure: Pipeline]
  • The first Data Mashup. PROBLEM: Which Milano monuments or events or friends can I quickly get to from here? [Figure: architecture diagram; labels: Mobile Data Mashup Environment, REST request, JSON response, Pipeline Config., LarKC platform, SPARQL Interface, SPARQL query, SPARQL result, Request data, Data; data sources: People, Traffic, Events, Monuments] http://www.larkc.eu
  • A roadmap towards LarKC Urban use case. Data: known (street topology, monuments/events/friends location, traffic situation as current data stream + historical time series); inferred (traffic predictions, residual street capacity). Formulating the query for LarKC: basic (shortest path from A to B); extended (shortest path from A to monuments/events/friends); advanced (considering traffic predictions and residual street capacity). Configuring the pipeline: basic configuration (combining a SPARQL processor and a Graph Processor, using AllegroGraph GeoExtension as a selector); extended configuration (DBpedia, EVDB, GoogleLatitude selector); advanced configuration (traffic predictions based on recurrent neural networks, residual street capacity based on data stream analysis).
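The three query levels in the roadmap can be sketched in plain Python. This is only an illustration under stated assumptions, not the LarKC pipeline or its API: the street graph, the node names, the cost unit, and the `slowdown` factors (standing in for traffic predictions) are all hypothetical, and Dijkstra's algorithm is used simply as a familiar shortest-path method.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, Dict[str, float]]  # graph[node][neighbour] = travel cost


def shortest_path(graph: Graph, source: str, target: str) -> Tuple[float, List[str]]:
    """Basic query: Dijkstra's algorithm; returns (cost, path) or (inf, [])."""
    dist = {source: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale queue entry
        done.add(u)
        if u == target:
            break
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


def rank_targets(graph: Graph, source: str, targets: List[str]) -> List[Tuple[float, str]]:
    """Extended query: order monuments/events/friends by cost from the source."""
    return sorted((shortest_path(graph, source, t)[0], t) for t in targets)


def apply_traffic(graph: Graph, slowdown: Dict[Tuple[str, str], float]) -> Graph:
    """Advanced query: scale each street's cost by a (hypothetical) predicted slowdown."""
    return {u: {v: w * slowdown.get((u, v), 1.0) for v, w in nbrs.items()}
            for u, nbrs in graph.items()}


# Hypothetical street graph; costs could be minutes of travel time.
streets: Graph = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"C": 1.0, "D": 4.5},
    "C": {"D": 1.0},
    "D": {},
}
print(shortest_path(streets, "A", "D"))
print(shortest_path(apply_traffic(streets, {("B", "C"): 5.0}), "A", "D"))
```

In this toy graph, a predicted slowdown on the B to C street changes which route is cheapest, which is the kind of effect the advanced query level is meant to capture.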
  • LarKC Early Adopters Workshop. The Large Knowledge Collider: a platform for massive distributed incomplete reasoning (http://www.larkc.eu). The public launch of the first open source LarKC platform release will take place at the forthcoming European Semantic Web Conference (ESWC 2009). Register for the event! More information at: http://earlyadopters.larkc.eu/ We are developing the Urban Baby LarKC as a showcase of the potential of such a platform. Everybody will be invited to run experiments over LarKC.
  • The next Web of open, linked data. Just research? What’s going on? Why should I care?
  • Freebase: “an open, shared database of the world’s information”. Source: Freebase - http://www.freebase.com (2009)
  • OpenCalais. Source: Thomson Reuters - http://www.opencalais.com/ (2009)
  • What’s next? Business point of view. Organizations today routinely produce lots of data… …and they have the problem of managing and making sense of them! More and more often they turn to Business Intelligence and related technologies to understand and decide. But it also happens that, in order to fully understand what’s going on and to make informed decisions, the data within the organization should be integrated or enhanced with external knowledge. This could definitely be a job for linked data technology!
  • Linked data seen by the Web inventor: “Stop hugging your data” (Sir Tim Berners-Lee, 2009). Don’t let considerations about security or data ownership become an obstacle to innovation and opportunities. www.flickr.com/photos/_-amy-_/3167333250/
  • What’s next? Technological point of view. How do Business Intelligence and similar techniques change when their basic assumptions are no longer valid? Dynamically changing data sources (and data themselves…); inconsistency typical of the Web (everything & the opposite of everything); partial information; more information than expected or than needed. Linked data pose new challenges for existing technologies!
  • If I didn’t convince you… http://www.ted.com/index.php/talks/tim_berners_lee_on_the_next_web.html
  • Thanks for your attention! Any questions? Contacts: Irene Celino – Semantic Web Practice, CEFRIEL – ICT Institute, Politecnico di Milano. Email: irene.celino@cefriel.it – web: http://swa.cefriel.it – phone: +39-02-23954266 – fax: +39-02-23954466. Slides available at: http://www.slideshare.net/iricelino