ANGIE in Wonderland
Nicoleta Preda
Motivating example
Long-term goal: new intelligent applications, such as
applications that automatically compute vacation plans
Example:
• I would like to travel for 3 weeks in South America
• Visit UNESCO sites
• Old palaces
2
3
Automatic computation of vacation plans
Personal Calendar
Web Services API
Traveling Related Books
Web Services API
Flights
Web Services API
Countries, Cities, Airports
Web Services API
Web Service APIs available on the Web
ProgrammableWeb.com counts >12000 APIs from various domains:
• Search (3200 APIs)
• Social (3000 APIs)
• Traveling (1200 APIs)
• Music (1000 APIs)
• Financial (1200 APIs), Science (600 APIs), Weather (300 APIs)
4
Query examples
• Places in Peru listed as UNESCO heritage
• Books written by South American Nobel Prize Winners
• Memorial houses of Brazilian Kings
5
Our research
• Query Evaluation using Web Service APIs
• Mapping Web Services to Knowledge Bases
6
• SUSIE: Web Services + WWW
• ANGIE: Web Services + KB
• DORIS: Web Services + Knowledge Base
7
Web Services
ANGIE
KB
8
Problem Description
Given a query Q against
• a knowledge base (KB)
• a set of Web services F
• a bound Max for the number of Web service calls
compute answers for Q using at most Max calls
8
9
Representing functions of Web Service APIs
A function is a named parameterized conjunctive query where
• Inputs must be bound to entities before the call execution
• Outputs are bound as the result of the call
• Relations are from a global schema (knowledge base schema)
Example (inputs: parent, p_place; outputs: ?child, ?c_place):
getChildren(parent, p_place, ?child, ?c_place) :- birthplace(parent, p_place),
hasChild(parent, ?child),
birthplace(?child, ?c_place)
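The function representation described on this slide can be sketched as a small data structure. This is only an illustration, not the ANGIE implementation: the class names `Atom` and `Function` and the method `can_call` are hypothetical, and the third body atom follows the graph on the slide, where the child also carries a birthplace edge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """One edge of the conjunctive query: relation(subject, obj)."""
    relation: str
    subject: str
    obj: str


@dataclass(frozen=True)
class Function:
    """A named parameterized conjunctive query over the KB schema (hypothetical model)."""
    name: str
    inputs: tuple   # variables that must be bound before the call
    outputs: tuple  # variables bound by the call result
    body: tuple     # tuple of Atom

    def can_call(self, bindings: dict) -> bool:
        # A call is executable only if every input variable is bound to an entity.
        return all(var in bindings for var in self.inputs)


GET_CHILDREN = Function(
    name="getChildren",
    inputs=("parent", "p_place"),
    outputs=("child", "c_place"),
    body=(
        Atom("birthplace", "parent", "p_place"),
        Atom("hasChild", "parent", "child"),
        Atom("birthplace", "child", "c_place"),
    ),
)

# Only the first set of bindings covers both inputs.
print(GET_CHILDREN.can_call({"parent": "Pedro I of Brazil", "p_place": "Queluz Palace, Lisbon"}))
print(GET_CHILDREN.can_call({"parent": "Pedro I of Brazil"}))
```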
9
Query example
parent
p_place
birthplace
?child
?c_place
birthplace
hasChild
getChildren(parent, p_place,?child, ?c_place)
?place
birthplace
Pedro II of Brazil
Query
Pedro II of Brazil
Baseline Solution (aiming at completeness)
getChildren
birthplace
hasChild
getChildren
birthplace
hasChild
X
Brussels
birthplace
Isabella of Austria
getChildren
hasChild
….
getChildren birthplace
hasChild
birthplace
Palace of São
Cristóvão, Rio de
Janeiro
Pedro II of Brazil
birthplace
Kensington Palace,
London
Queen Victoria of the UK
But I only have a small budget of calls!
11
ANGIE Algorithm: the bang for the buck
birthplace
?place
Pedro II of Brazil
parent
p_place
birthplace
hasChild
Pedro II of Brazil
hasChild
Pedro I of Brazil
Ajuda, Lisbon
birthplace
hasChild
Juan VI of Portugal
parent
p_place
birthplace
hasChild
Queluz Palace, Lisbon
Palace of São
Cristóvão, Rio de
Janeiro
Juan VI of Portugal
Ajuda, Lisbon
Pedro I of Brazil
parent
p_place
birthplace
hasChild ?child
?c_place
birthplace
12
13
Property
For a pipeline of calls:
W1 < W2 < … < Wi < … < Wn < Q
where the inputs of each call Wi are extracted using a local query Qi^KB
(Q1^KB, Q2^KB, …, Qi^KB, …, Qn^KB):
If the knowledge base has answers for Qi^KB, then
execute only Wi, …, Wn
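The property can be read as a rule for skipping the prefix of a pipeline. The sketch below is one possible reading, not the ANGIE algorithm itself: the function name `calls_to_execute` and the `kb_has_answers` callback are hypothetical, and picking the latest call whose local query has answers (so that the fewest calls are spent) is an assumption.

```python
def calls_to_execute(pipeline, kb_has_answers):
    """Return the suffix Wi..Wn of the pipeline that still has to be executed.

    pipeline: list of (call_name, local_query) pairs in order W1..Wn.
    kb_has_answers: callable telling whether the KB already answers a local query.
    Assumption: we start from the latest call whose local query the KB can answer.
    """
    for i in range(len(pipeline) - 1, -1, -1):
        _, local_query = pipeline[i]
        if kb_has_answers(local_query):
            return [call for call, _ in pipeline[i:]]
    # No local query has answers: no call can be given inputs from the KB.
    return []


pipeline = [("W1", "Q1_KB"), ("W2", "Q2_KB"), ("W3", "Q3_KB")]
answered = {"Q1_KB", "Q2_KB"}  # local queries the KB can already answer
print(calls_to_execute(pipeline, answered.__contains__))
```

With these inputs the KB answers the local query of W2, so W1 is skipped and only W2 and W3 are executed.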
13
Web call composition graph
YAGO
Query
?place
birthplace
?personid
hasId
getInfoByPersonId
?idperson
getPersonId
hasId
GetChildren
Juan VI of Portugal, Ajuda
GetChildren
Pedro I of Brazil
Pedro II of Brazil
GetPersonId
GetInfoByPersonId
id_Pedro-II
14
Experiments
[Figure: query "Books of French Nobel Prize winners"; axes: number of calls and number of answers; series: DF, F-RDF, F-RDF-R, ANGIE, ANGIE-cost]
50 real Web services from 3 domains:
• Music
• Books
• Movies
16
ANGIE: Active Knowledge & Interaction Exploration
Query Mediator
Dynamically computes the Web calls that answer the query
RDF Warehouse
• The local KB stores the results of all executed Web calls
• Stored call results may speed up the evaluation of related queries
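A minimal sketch of how a local store of call results and the call budget Max could interact. The names `Warehouse`, `CallBudgetExceeded` and the dictionary-based store are hypothetical stand-ins; the slides describe an RDF warehouse, not this data structure.

```python
class CallBudgetExceeded(Exception):
    """Raised when a new Web call would exceed the budget Max."""


class Warehouse:
    """Stores results of executed calls and enforces a bound on new calls."""

    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.calls_made = 0
        self.results = {}  # (function name, argument tuple) -> stored result

    def call(self, name, args, service):
        key = (name, args)
        if key in self.results:
            # Stored result: no Web call is spent.
            return self.results[key]
        if self.calls_made >= self.max_calls:
            raise CallBudgetExceeded(f"budget of {self.max_calls} calls is used up")
        self.calls_made += 1
        result = service(*args)
        self.results[key] = result
        return result


def fake_get_children(parent):
    # Stand-in for a real Web service; returns made-up structured data.
    return ["Pedro II of Brazil"]


warehouse = Warehouse(max_calls=1)
warehouse.call("getChildren", ("Pedro I of Brazil",), fake_get_children)
warehouse.call("getChildren", ("Pedro I of Brazil",), fake_get_children)  # served from the store
print(warehouse.calls_made)
```

Repeating the same call does not consume budget, which is the sense in which stored results can help related queries.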
16
Active Knowledge : Dynamically Enriching RDF Knowledge Bases by Web Services.
with F. M. Suchanek, G. Kasneci, T. Neumann, W. Yuan, G. Weikum, SIGMOD 2010
SUSIE
17
Web Services WWW
Problem: Asymmetric accesses
• Consider a source publishing only the Web service:
getLeaderInfo(leader, type, country)
• And the queries:
Q1: getLeaderInfo(Pablo II, ?, ?) → Easy
Q2: getLeaderInfo(?, ?, Brazil) → Impossible
Q3: getLeaderInfo(?, king, Brazil) → Impossible
Enumerating a DB of leaders: 1 million calls and two will succeed
Our Approach: Use the Web as an Oracle
Example: implement “get head by country and type”
19
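• Inputs (King, Brazil) become the Web search “King of Brazil?”
• Information Extraction (IE) on the returned HTML yields the candidates Lula, Pedro I, Pedro II
• Each candidate is checked with a call to getLeaderInfo:
  Pedro I → King, Brazil
  Pedro II → King, Brazil
  Lula → President, Brazil (X)
3 calls and 2 will succeed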
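The oracle idea can be sketched as follows: candidates come from a Web search plus information extraction, and each candidate is then verified with one call to the Web service. Everything here is illustrative: `search`, `extract_names` and `get_leader_info` are hypothetical callables passed in by the caller, and the stubs return canned data rather than querying any real source.

```python
def oracle_get_candidates(leader_type, country, search, extract_names):
    """Ask the Web (the oracle) for candidate leaders via a keyword query."""
    html = search(f"{leader_type} of {country}")
    return extract_names(html)


def verified_leaders(leader_type, country, search, extract_names, get_leader_info):
    """Keep only the candidates that the Web service confirms; also count the calls."""
    answers, calls = [], 0
    for person in oracle_get_candidates(leader_type, country, search, extract_names):
        calls += 1
        info = get_leader_info(person)  # assumed to return (type, country) or None
        if info == (leader_type, country):
            answers.append(person)
    return answers, calls


# Stubs standing in for a search engine, an IE step, and the Web service.
def stub_search(query):
    return "<html>Lula; Pedro I; Pedro II</html>"

def stub_extract(html):
    return ["Lula", "Pedro I", "Pedro II"]

LEADERS = {
    "Lula": ("President", "Brazil"),
    "Pedro I": ("King", "Brazil"),
    "Pedro II": ("King", "Brazil"),
}

print(verified_leaders("King", "Brazil", stub_search, stub_extract, LEADERS.get))
```

With the stub data this spends three calls and keeps the two candidates the service confirms, mirroring the numbers on the slide.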
Model oracles as functions
20
oracleGetCandidates(person, type, country)
• Inputs [country, head-type] are turned into the Web search “[type] of [country]”
• Information Extraction (IE) on the returned HTML yields the outputs (verified by WS)
• Relations: headOf(?person, country), type(?person, type)
Are inefficient plans automatically avoided?
21
New Query
22
?
countryheadOf
Brazil
King
type
oracleGetCandidates ?
inauguration
headOf
leader
date
inauguration
getInaugurationDay(leader, date) oracleGetCandidates(leader, type, country)
countryheadOf
leader country
type
type
Pedro I of Brazil
10 March 1826
getInaugurationDay
Consider the additional Web services
getCurrentLeader(country, leader)
countryheadOf
leader country
getPredecessor(leader, pLeader, pType, pDate, pCountry)
predecessor
leader
countryheadOf
leader country
type
type
date
inauguration
Relevant but inefficient results
24
getCurrentLeader(Brazil, leader1, type1, date1)
getPredecessor(leader1, leader2, type2, date2, country2)
getPredecessor(leader2, leader3, type3, date3, country3)
getPredecessor(leader3, leader, type4, date4, country4)
countryheadOf
type
King
Brazil
inauguration
Smart calls vs. relevant but “guess” plans
25
countryheadOf
type
King
inauguration
getCurrentLeader(Brazil)
getPredecessor(leader)
oracleGetCandidates(Brazil, King)
getInaugurationDay(leader)
Brazil
predecessor
Smart calls
Given a call Wi that belongs to a plan W1,… Wi,… Wn we say Wi
is a smart call if its consequences are:
• either included in the union of the consequences of the
previous functions Wi-1, ... W1
• or are atoms of the query
Property:
If a plan consists of only smart calls, and if every call has
results, then the plan will deliver an answer for the query.
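The definition above can be turned into a small check. This is a sketch under stated assumptions: atoms are plain tuples whose variable names have already been unified with the query, the condition is applied atom by atom, and the names `is_smart_call` and `is_smart_plan` are hypothetical.

```python
def is_smart_call(consequences, previous_consequences, query_atoms):
    """A call is smart if each of its consequences was already produced by an
    earlier call in the plan or is an atom of the query (per-atom reading)."""
    allowed = set(query_atoms).union(*previous_consequences)
    return all(atom in allowed for atom in consequences)


def is_smart_plan(plan, query_atoms):
    """plan: list of consequence sets, one per call, in execution order W1..Wn."""
    seen = []
    for consequences in plan:
        if not is_smart_call(consequences, seen, query_atoms):
            return False
        seen.append(consequences)
    return True


QUERY = {
    ("headOf", "?leader", "Brazil"),
    ("type", "?leader", "King"),
    ("inauguration", "?leader", "?date"),
}
ORACLE = {("headOf", "?leader", "Brazil"), ("type", "?leader", "King")}
INAUGURATION = {("inauguration", "?leader", "?date")}
PREDECESSOR = {("predecessor", "?leader1", "?leader"), ("headOf", "?leader", "Brazil")}

print(is_smart_plan([ORACLE, INAUGURATION], QUERY))
print(is_smart_plan([PREDECESSOR, INAUGURATION], QUERY))
```

Under this reading the oracle plan consists only of smart calls, while a plan that walks the predecessor relation does not, because the predecessor atom is neither in the query nor produced earlier.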
26
27
Experiments
50 Web services from three domains:
• Books
• isbndb.org
• librarything.com
• abebooks.com
• Movies
• internetvideoarchive.com (IVA)
• Music
• musicbrainz.org
• last.fm
• discogs.com
• lyricWiki.org
27
Evaluation results
28
Get prize winners                               TD   ANGIE   SUSIE
Nobel Prize in Literature                        0       0      14
Golden Pen Award                                 0       0      11
Franz Kafka Prize                                0       0       5
American Book Medal                              0       0      16
Jerusalem Prize                                  0       0      11

Get books of winners of prize                   TD   ANGIE   SUSIE
Nobel Prize Literature                           0       0     198
Golden Pen Award                                 0       0     228
Franz Kafka Prize                                0       0     132
Jerusalem Prize                                  0       0     220

Get books of winners by prize and country       TD   ANGIE   SUSIE
Nobel Prize Literature, France                   0       0     144
Franz Kafka Prize, UK                            0       0      79
Related Work: Answering Queries using Views
• Maximal contained rewritings (MCR)
• Plans computing the largest number of answers
• Approaches based on reducing the number of irrelevant calls
• Benedikt et al., PODS 2011, VLDB 2012
• S. Kambhampati, JIIS 2004
• SUSIE does not target maximal contained rewritings
• Relevant calls for MCR include all calls that might return results
• Smart calls are a subset of relevant calls.
29
SUSIE
• Addressed the problem of asymmetric accesses
• A novel approach to answer such queries where the inputs
for the Web service call are extracted on the fly, from the Web
• New evaluation algorithm that prioritizes smart calls
• An experimental evaluation using a representative set of
queries and real data sources
30
SUSIE: Search Using Services and Information Extraction.
with F. M. Suchanek, W. Yuan, G. Weikum ICDE 2013
31
Ongoing work
Given a query Q and a set of functions F, compute all smart plans
(for which it can be proven that they return answers)
31
SUSIE
Web Services Knowledge
Base
DORIS
32
Web Service API
• Web Services for applications ≅ Web forms for humans
• An API = a collection of Web services
• A Web Service
• expects bindings for input parameters
• returns structured data: XML or JSON
33
<geonames>
<country>
<ccode> AR </ccode>
<cname> Argentina </cname>
<isonumeric>032</isonumeric>
<fipscode> ARG </fipscode>
<continent> SA </continent>
<continentName> Argentina
</continentName>
<capital> Buenos Aires </capital>
<cities>
<city>
<name>Buenos Aires</name>
Goals
For every Web service:
1) Compute a parameterized query (relations are from the KB)
2) Compute a transformation script (XSLT)
to be applied to every call result:
XML result → results for the parameterized query
34
1) Parameterized query for getCountryByName
35
getCountryByName(country, name, time-zone, capital, type, lat, lng,
city, c_lat, c_lng)
[Figure: query graph for getCountryByName; labels: label, country, hasCapital, time-zone, name, hasCity, type, city, label, c_lat, c_lng, lng, lat]
[Figure: XML tree of the call result for getCountryByName(Argentina); text nodes: “Republic”, “ARS”, “Argentina”, “Buenos Aires”, “-34”, “-64”; cities: “Buenos Aires” (“-34”, “-64”) and “Córdoba” (“-31.40833”, “-64.18388”)]
[Figure: XML tree of the call result for Romania; text nodes: “Republic”, “GMT+2”, “Romania”, “Bucharest”, “44.4”, “26.1”; cities: “Bucharest” (“44.4”, “26.1”) and “Rm Valcea” (“45.1”, “24”)]
2) An XSLT transformation for all call results
getCountryByName(Romania, GMT+2, Bucharest, Republic,
44.4, 26.1, Bucharest, 44.4, 26.1)
getCountryByName(Romania, GMT+2, Bucharest, Republic,
44.4, 26.1, Rm Valcea, 45.1, 24)
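The effect of such a transformation, one output tuple per city in the call result, can be sketched in Python. This is a stand-in for the XSLT script mentioned on the slide, not the script itself; the element names `timezone`, `type`, `lat` and `lng` and the sample document are assumptions made for illustration (`cname`, `capital`, `cities`, `city` and `name` appear in the XML excerpt shown earlier).

```python
import xml.etree.ElementTree as ET

# Hypothetical call result, shaped after the XML excerpt on the earlier slide.
SAMPLE = """<geonames><country>
  <cname>Romania</cname><timezone>GMT+2</timezone>
  <capital>Bucharest</capital><type>Republic</type>
  <lat>44.4</lat><lng>26.1</lng>
  <cities>
    <city><name>Bucharest</name><lat>44.4</lat><lng>26.1</lng></city>
    <city><name>Rm Valcea</name><lat>45.1</lat><lng>24</lng></city>
  </cities>
</country></geonames>"""

COUNTRY_TAGS = ("cname", "timezone", "capital", "type", "lat", "lng")
CITY_TAGS = ("name", "lat", "lng")


def call_result_to_tuples(xml_text):
    """Flatten one XML call result into one tuple per (country, city) pair."""
    root = ET.fromstring(xml_text)
    rows = []
    for country in root.iter("country"):
        # findtext with a bare tag only looks at direct children of <country>.
        head = tuple(country.findtext(tag, default="").strip() for tag in COUNTRY_TAGS)
        for city in country.iterfind("cities/city"):
            tail = tuple(city.findtext(tag, default="").strip() for tag in CITY_TAGS)
            rows.append(head + tail)
    return rows


for row in call_result_to_tuples(SAMPLE):
    print(row)
```

The country-level values are repeated in every row because the nested city list is unnested, which is why the slide shows two tuples for a single call.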
General Challenges
• Heterogeneity: every Web service has its own schema for outputs
• Schemas are unknown
• >85% of Web services implemented using REST
• REST Web services do not expose schema descriptions
Our approach: use the overlapping between
Web services & Knowledge Bases
Intuition
38
[Figure: the XML call result for Argentina (text nodes “Republic”, “ARS”, “Argentina”, “Buenos Aires”, “-34”, “-64”, “Córdoba”, “-31.40833”) next to the overlapping KB fragment: URI1 label Argentina; URI1 hasCapital URI2; URI2 label Buenos Aires]
Three-step algorithm
1) Align root-to-text-node paths in the call result with paths starting from the input in the KB
2) Compute class and relation alignment candidates satisfying
functional constraints
3) For each candidate, compute transformation functions and check
inclusion and equivalence for the non-functional relations
Observation:
The first 2 steps alone lead to a precision/recall of around 90%
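Step 1 can be illustrated with a small sketch that matches the text found at the end of each root-to-text-node path against literals reachable from the input entity in the KB. This is not the DORIS implementation: exact string matching, the function names, and the `(relation path, literal)` representation of KB facts are simplifying assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def text_paths(xml_text):
    """Yield (root-to-node path, text) for every element with non-empty text."""
    def walk(node, prefix):
        path = f"{prefix}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            yield path, text
        for child in node:
            yield from walk(child, path)

    yield from walk(ET.fromstring(xml_text), "")


def align_paths(xml_text, kb_facts):
    """Count how often an XML path and a KB relation path share a literal.

    kb_facts: iterable of (relation_path, literal) pairs reachable from the
    input entity, e.g. ("hasCapital/label", "Buenos Aires").
    """
    relations_by_literal = {}
    for relation_path, literal in kb_facts:
        relations_by_literal.setdefault(literal, set()).add(relation_path)
    votes = Counter()
    for path, text in text_paths(xml_text):
        for relation_path in relations_by_literal.get(text, ()):
            votes[(path, relation_path)] += 1
    return votes


RESULT = "<country><cname>Argentina</cname><capital>Buenos Aires</capital></country>"
KB_FACTS = [("label", "Argentina"), ("hasCapital/label", "Buenos Aires")]
print(align_paths(RESULT, KB_FACTS))
```

Counting matches over many call results would let frequent pairs stand out from spurious ones; a realistic matcher would also need to tolerate formatting differences between XML text and KB literals.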
39
40
DORIS: Some experimental results
More than 50 Web services from 4 domains
• Books
• Movies
• Music
• Geo data
KB         Precision              Recall
           Classes   Relations    Classes   Relations
YAGO       0.92      0.91         0.96      0.93
DBpedia    0.89      0.88         0.98      0.95
BNF        1         1            1         1
40
Summary
• Addressed the problem of inferring views
• An instance based approach to the schema matching problem
• An experimental evaluation using real Web sources
41
DORIS: Discovering ontological relations in sources.
with Mary Koutraki, Dan Vodislav, in preparation
getCountryByName(country, name, time-zone, capital, type,
lat, lng, city, c_lat, c_lng)
label
country
hasCapital
time-zone
name
hasCity
type
city
label
c_lat
c_lng
lnglat
<geonames>
<country>
<ccode> AR </ccode>
<cname> Argentina </cname>
<isonumeric>032</isonumeric>
<fipscode> ARG </fipscode>
<continent> SA </continent>
<continentName> Argentina
</continentName>
<capital> Buenos Aires </capital>
<areaInSqKM> </areaInSqKM>
Our work
• Query Evaluation using Web Service APIs
• Mapping Web Services to Knowledge Bases
42
Web Services WWW
SUSIE
Web Services
ANGIE
KB
Web Services Knowledge
Base
DORIS
Same plan as a graph
predecessor
getPredecessor
country
Henrique Cardoso Brazil
President
type
headOfState
1 January 1995
predecessor
getPredecessor
country
Lula da Silva Brazil
President
type
headOfState
1 January 2003
getCurrentHeadOfState
Dilma Rousseff
countryheadOfState
Brazil
King
type
President
1 January 2011
BrazilDilma Rousseff
Lula da Silva
IE: Authors who won prize X
44
Precision   Recall   Prize
38%         59%      National Book
62%         44%      Phoenix
23%         52%      Jerusalem
78%         79%      Pulitzer
25%         73%      Franz Kafka
31%         13%      Prix Femina
28%         6%       Prix Decembre
41%         29%      Nobel Prize
25%         73%      Golden Pen
Challenges of an instance-based approach
• XML elements do not correspond to entities in KB
• Entities in KB are URIs and are not to be found in call results
• What is an entity in the XML call result?
• Spurious matches (Argentina is a capital and also a person)
45
Idea: align properties expressed as text or literals first

More Related Content

Similar to ANGIE in wonderland

Webinar: General Technical Overview of MongoDB for Ops Teams
Webinar: General Technical Overview of MongoDB for Ops TeamsWebinar: General Technical Overview of MongoDB for Ops Teams
Webinar: General Technical Overview of MongoDB for Ops TeamsMongoDB
 
213 event processingtalk-deviewkorea.key
213 event processingtalk-deviewkorea.key213 event processingtalk-deviewkorea.key
213 event processingtalk-deviewkorea.keyNAVER D2
 
Introduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologiesIntroduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologiesChris Schalk
 
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...confluent
 
Gao cong geospatial social media data management and context-aware recommenda...
Gao cong geospatial social media data management and context-aware recommenda...Gao cong geospatial social media data management and context-aware recommenda...
Gao cong geospatial social media data management and context-aware recommenda...jins0618
 
Entity Search: The Last Decade and the Next
Entity Search: The Last Decade and the NextEntity Search: The Last Decade and the Next
Entity Search: The Last Decade and the Nextkrisztianbalog
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBNosh Petigara
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentationMurat Çakal
 
N1QL: What's new in Couchbase 5.0
N1QL: What's new in Couchbase 5.0N1QL: What's new in Couchbase 5.0
N1QL: What's new in Couchbase 5.0Keshav Murthy
 
Neo4j Aura on AWS: The Customer Choice for Graph Databases
Neo4j Aura on AWS: The Customer Choice for Graph DatabasesNeo4j Aura on AWS: The Customer Choice for Graph Databases
Neo4j Aura on AWS: The Customer Choice for Graph DatabasesNeo4j
 
Building Your First App with MongoDB Stitch
Building Your First App with MongoDB StitchBuilding Your First App with MongoDB Stitch
Building Your First App with MongoDB StitchMongoDB
 
Introduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWIntroduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWAnkur Raina
 
Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011Steven Francia
 
Couchbase N1QL: Language & Architecture Overview.
Couchbase N1QL: Language & Architecture Overview.Couchbase N1QL: Language & Architecture Overview.
Couchbase N1QL: Language & Architecture Overview.Keshav Murthy
 
Practical JSON in MySQL 5.7 and Beyond
Practical JSON in MySQL 5.7 and BeyondPractical JSON in MySQL 5.7 and Beyond
Practical JSON in MySQL 5.7 and BeyondIke Walker
 
Web5 - Open to Build - Block-TBD
Web5 - Open to Build - Block-TBDWeb5 - Open to Build - Block-TBD
Web5 - Open to Build - Block-TBDSSIMeetup
 
Risk Analytics Using Knowledge Graphs / FIBO with Deep Learning
Risk Analytics Using Knowledge Graphs / FIBO with Deep LearningRisk Analytics Using Knowledge Graphs / FIBO with Deep Learning
Risk Analytics Using Knowledge Graphs / FIBO with Deep LearningCambridge Semantics
 
Nosh slides mongodb web application - mongo philly 2011
Nosh slides   mongodb web application - mongo philly 2011Nosh slides   mongodb web application - mongo philly 2011
Nosh slides mongodb web application - mongo philly 2011MongoDB
 
Building a web application with mongo db
Building a web application with mongo dbBuilding a web application with mongo db
Building a web application with mongo dbMongoDB
 
Abdul Salam's Resume
Abdul Salam's ResumeAbdul Salam's Resume
Abdul Salam's ResumeAbdul Salam
 

Similar to ANGIE in wonderland (20)

Webinar: General Technical Overview of MongoDB for Ops Teams
Webinar: General Technical Overview of MongoDB for Ops TeamsWebinar: General Technical Overview of MongoDB for Ops Teams
Webinar: General Technical Overview of MongoDB for Ops Teams
 
213 event processingtalk-deviewkorea.key
213 event processingtalk-deviewkorea.key213 event processingtalk-deviewkorea.key
213 event processingtalk-deviewkorea.key
 
Introduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologiesIntroduction to Google Cloud platform technologies
Introduction to Google Cloud platform technologies
 
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
Using Location Data to Showcase Keys, Windows, and Joins in Kafka Streams DSL...
 
Gao cong geospatial social media data management and context-aware recommenda...
Gao cong geospatial social media data management and context-aware recommenda...Gao cong geospatial social media data management and context-aware recommenda...
Gao cong geospatial social media data management and context-aware recommenda...
 
Entity Search: The Last Decade and the Next
Entity Search: The Last Decade and the NextEntity Search: The Last Decade and the Next
Entity Search: The Last Decade and the Next
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Building web applications with mongo db presentation
Building web applications with mongo db presentationBuilding web applications with mongo db presentation
Building web applications with mongo db presentation
 
N1QL: What's new in Couchbase 5.0
N1QL: What's new in Couchbase 5.0N1QL: What's new in Couchbase 5.0
N1QL: What's new in Couchbase 5.0
 
Neo4j Aura on AWS: The Customer Choice for Graph Databases
Neo4j Aura on AWS: The Customer Choice for Graph DatabasesNeo4j Aura on AWS: The Customer Choice for Graph Databases
Neo4j Aura on AWS: The Customer Choice for Graph Databases
 
Building Your First App with MongoDB Stitch
Building Your First App with MongoDB StitchBuilding Your First App with MongoDB Stitch
Building Your First App with MongoDB Stitch
 
Introduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUWIntroduction to MongoDB at IGDTUW
Introduction to MongoDB at IGDTUW
 
Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011Building your first application w/mongoDB MongoSV2011
Building your first application w/mongoDB MongoSV2011
 
Couchbase N1QL: Language & Architecture Overview.
Couchbase N1QL: Language & Architecture Overview.Couchbase N1QL: Language & Architecture Overview.
Couchbase N1QL: Language & Architecture Overview.
 
Practical JSON in MySQL 5.7 and Beyond
Practical JSON in MySQL 5.7 and BeyondPractical JSON in MySQL 5.7 and Beyond
Practical JSON in MySQL 5.7 and Beyond
 
Web5 - Open to Build - Block-TBD
Web5 - Open to Build - Block-TBDWeb5 - Open to Build - Block-TBD
Web5 - Open to Build - Block-TBD
 
Risk Analytics Using Knowledge Graphs / FIBO with Deep Learning
Risk Analytics Using Knowledge Graphs / FIBO with Deep LearningRisk Analytics Using Knowledge Graphs / FIBO with Deep Learning
Risk Analytics Using Knowledge Graphs / FIBO with Deep Learning
 
Nosh slides mongodb web application - mongo philly 2011
Nosh slides   mongodb web application - mongo philly 2011Nosh slides   mongodb web application - mongo philly 2011
Nosh slides mongodb web application - mongo philly 2011
 
Building a web application with mongo db
Building a web application with mongo dbBuilding a web application with mongo db
Building a web application with mongo db
 
Abdul Salam's Resume
Abdul Salam's ResumeAbdul Salam's Resume
Abdul Salam's Resume
 

More from INRIA-OAK

A Network-Aware Approach for Searching As-You-Type in Social Media
A Network-Aware Approach for Searching As-You-Type in Social MediaA Network-Aware Approach for Searching As-You-Type in Social Media
A Network-Aware Approach for Searching As-You-Type in Social MediaINRIA-OAK
 
Speeding up information extraction programs: a holistic optimizer and a learn...
Speeding up information extraction programs: a holistic optimizer and a learn...Speeding up information extraction programs: a holistic optimizer and a learn...
Speeding up information extraction programs: a holistic optimizer and a learn...INRIA-OAK
 
Querying incomplete data
Querying incomplete dataQuerying incomplete data
Querying incomplete dataINRIA-OAK
 
On building more human query answering systems
On building more human query answering systemsOn building more human query answering systems
On building more human query answering systemsINRIA-OAK
 
Dynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsDynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsINRIA-OAK
 
Web Data Management in RDF Age
Web Data Management in RDF AgeWeb Data Management in RDF Age
Web Data Management in RDF AgeINRIA-OAK
 
Rdf saturator
Rdf saturatorRdf saturator
Rdf saturatorINRIA-OAK
 
Rdf generator
Rdf generatorRdf generator
Rdf generatorINRIA-OAK
 
Rdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimationRdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimationINRIA-OAK
 
rdf query reformulation
rdf query reformulationrdf query reformulation
rdf query reformulationINRIA-OAK
 
postgres loader
postgres loaderpostgres loader
postgres loaderINRIA-OAK
 
Conjunctive queries
Conjunctive queriesConjunctive queries
Conjunctive queriesINRIA-OAK
 
CliqueSquare processing
CliqueSquare processingCliqueSquare processing
CliqueSquare processingINRIA-OAK
 
Clique square storage
Clique square storageClique square storage
Clique square storageINRIA-OAK
 

More from INRIA-OAK (20)

A Network-Aware Approach for Searching As-You-Type in Social Media
A Network-Aware Approach for Searching As-You-Type in Social MediaA Network-Aware Approach for Searching As-You-Type in Social Media
A Network-Aware Approach for Searching As-You-Type in Social Media
 
Speeding up information extraction programs: a holistic optimizer and a learn...
Speeding up information extraction programs: a holistic optimizer and a learn...Speeding up information extraction programs: a holistic optimizer and a learn...
Speeding up information extraction programs: a holistic optimizer and a learn...
 
Querying incomplete data
Querying incomplete dataQuerying incomplete data
Querying incomplete data
 
On building more human query answering systems
On building more human query answering systemsOn building more human query answering systems
On building more human query answering systems
 
Dynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data PlatformsDynamically Optimizing Queries over Large Scale Data Platforms
Dynamically Optimizing Queries over Large Scale Data Platforms
 
Web Data Management in RDF Age
Web Data Management in RDF AgeWeb Data Management in RDF Age
Web Data Management in RDF Age
 
Nautilus
NautilusNautilus
Nautilus
 
Warg
WargWarg
Warg
 
Vip2p
Vip2pVip2p
Vip2p
 
S4
S4S4
S4
 
Rdf saturator
Rdf saturatorRdf saturator
Rdf saturator
 
Rdf generator
Rdf generatorRdf generator
Rdf generator
 
Rdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimationRdf conjunctive query selectivity estimation
Rdf conjunctive query selectivity estimation
 
rdf query reformulation
rdf query reformulationrdf query reformulation
rdf query reformulation
 
postgres loader
postgres loaderpostgres loader
postgres loader
 
Plreuse
PlreusePlreuse
Plreuse
 
Paxquery
PaxqueryPaxquery
Paxquery
 
Conjunctive queries
Conjunctive queriesConjunctive queries
Conjunctive queries
 
CliqueSquare processing
CliqueSquare processingCliqueSquare processing
CliqueSquare processing
 
Clique square storage
Clique square storageClique square storage
Clique square storage
 

Recently uploaded

Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookmanojkuma9823
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理e4aez8ss
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Call Girls In Mahipalpur O9654467111 Escorts Service
Call Girls In Mahipalpur O9654467111  Escorts ServiceCall Girls In Mahipalpur O9654467111  Escorts Service
Call Girls In Mahipalpur O9654467111 Escorts ServiceSapana Sha
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfgstagge
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 

Recently uploaded (20)

Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book

ANGIE in Wonderland

  • 2. Motivating example. Long-term goal: new intelligent applications, such as applications that automatically compute vacation plans. Example: • I would like to travel for 3 weeks in South America • Visit UNESCO sites • Old palaces
  • 3. Automatic computation of vacation plans [Figure: four Web Services APIs: Personal Calendar; Traveling Related Books; Flights; Countries, Cities, Airports]
  • 4. Web Service APIs available on the Web. ProgrammableWeb.com counts >12000 APIs from various domains: • Search (3200 APIs) • Social (3000 APIs) • Traveling (1200 APIs) • Music (1000 APIs) • Financial (1200 APIs), Science (600 APIs), Weather (300 APIs)
  • 5. Query examples • Places in Peru listed as UNESCO heritage • Books written by South American Nobel Prize Winners • Memorial houses of Brazilian Kings
  • 6. Our research • Query Evaluation using Web Service APIs • Mapping Web Services to Knowledge Bases [Figure: SUSIE (Web Services, WWW); ANGIE (Web Services, KB); DORIS (Web Services, Knowledge Base)]
  • 8. Problem Description. Given a query Q against • a knowledge base (KB) • a set of Web services F • a bound Max for the number of Web service calls, compute answers for Q using at most Max calls
  • 9. Representing functions of Web Service APIs. A function is a named parameterized conjunctive query where • Inputs must be bound to entities before the call execution • Outputs are bound as the result of the call • Relations are from a global schema (knowledge base schema). Example: getChildren(parent, p_place, ?child, ?c_place) :- birthplace(parent, p_place), hasChild(parent, ?child) [Figure: inputs parent and p_place, outputs ?child and ?c_place, connected by the edges birthplace(parent, p_place), hasChild(parent, ?child), birthplace(?child, ?c_place)]
  • 10. Query example. Function: getChildren(parent, p_place, ?child, ?c_place). Query: birthplace(Pedro II of Brazil, ?place)
  • 11. Baseline Solution (aiming at completeness) [Figure: repeated getChildren calls with inputs such as Isabella of Austria (birthplace Brussels), Pedro II of Brazil (birthplace Palace of São Cristóvão, Rio de Janeiro), and Queen Victoria of the UK (birthplace Kensington Palace, London)] But I only have a small budget of calls!
  • 12. ANGIE Algorithm: the bang for the buck [Figure: for the query birthplace(Pedro II of Brazil, ?place), getChildren calls follow the hasChild chain Juan VI of Portugal, Pedro I of Brazil, Pedro II of Brazil; birthplaces shown: Ajuda, Lisbon; Queluz Palace, Lisbon; Palace of São Cristóvão, Rio de Janeiro]
  • 13. Property. For a pipeline of calls W_1 < W_2 < … < W_i < … < W_n < Q, where the inputs are extracted using the local queries Q_1^KB, Q_2^KB, …, Q_i^KB, …, Q_n^KB: if the knowledge base has answers for Q_i^KB, then execute only W_i … W_n
  • 14. Web call composition graph [Figure: YAGO and the query (relations birthplace and hasId) connected to the functions getPersonId and getInfoByPersonId and to the calls GetChildren(Juan VI of Portugal, Ajuda), GetChildren(Pedro I of Brazil), GetPersonId(Pedro II of Brazil), GetInfoByPersonId(id_Pedro-II)]
  • 16. ANGIE: Active Knowledge & Interaction Exploration • Query Mediator: dynamically computes the Web calls that answer the query • RDF Warehouse: the local KB stores the results of all executed Web calls; stored call results may speed up the evaluation of related queries. Reference: Active Knowledge: Dynamically Enriching RDF Knowledge Bases by Web Services. With F. M. Suchanek, G. Kasneci, T. Neumann, W. Yuan, G. Weikum, SIGMOD 2010
  • 18. Problem: Asymmetric accesses • Consider a source publishing only the Web service getLeaderInfo(leader, type, country) • And the queries: Q1: getLeaderInfo(Pablo II, ?, ?) (easy); Q2: getLeaderInfo(?, ?, Brazil) (impossible); Q3: getLeaderInfo(?, king, Brazil) (impossible) [Figure: DB of leaders; 1 million calls and two will succeed]
  • 19. Our Approach: Use the Web as an Oracle. Example: implement “get head by country and type” [Figure: the inputs (King, Brazil) become the keyword query “King of Brazil?”; Information Extraction (IE) over the HTML results yields the candidates Lula, Pedro I, Pedro II; getLeaderInfo is called once per candidate: 3 calls and 2 will succeed]
  • 20. Model oracles as functions: oracleGetCandidates(person, type, country) [Figure: the inputs [country, head-type] form the keyword query “[type] of [country]”; Information Extraction (IE) over the HTML results yields the outputs, which are verified by the Web service; relations: headOf, type]
  • 21. Are inefficient plans automatically avoided?
  • 22. New Query [Figure: query over the relations headOf (country Brazil), type (King), and inauguration (unknown leader and date); available functions: oracleGetCandidates(leader, type, country) and getInaugurationDay(leader, date); example result: Pedro I of Brazil, 10 March 1826]
  • 23. Consider the additional Web services • getCurrentLeader(country, leader) [relation: headOf] • getPredecessor(leader, pLeader, pType, pDate, pCountry) [relations: predecessor, headOf, type, inauguration]
  • 24. Relevant but inefficient results: getCurrentLeader(Brazil, leader1, type1, date1); getPredecessor(leader1, leader2, type2, date2, country2); getPredecessor(leader2, leader2, type3, date3, country3); getPredecessor(leader2, leader, type4, date4, country4)
  • 25. Smart calls vs. relevant but “guess” plans [Figure: for the query over headOf (Brazil), type (King), and inauguration, the smart plan oracleGetCandidates(Brazil, King), getInaugurationDay(leader) is contrasted with the relevant but “guess” plan getCurrentLeader(Brazil), getPredecessor(leader)]
  • 26. Smart calls. Given a call Wi that belongs to a plan W1, …, Wi, …, Wn, we say Wi is a smart call if its consequences • are either included in the union of the consequences of the previous functions Wi-1, …, W1 • or are atoms of the query. Property: If a plan consists of only smart calls, and if every call has results, then the plan will deliver an answer for the query.
  • 27. Experiments: 50 Web services from three domains • Books: isbndb.org, librarything.com, abebooks.com • Movies: internetvideoarchive.com (IVA) • Music: musicbrainz.org, last.fm, discogs.com, lyricWiki.org
  • 28. Evaluation results
      Get prize winners                            TD   ANGIE   SUSIE
      Nobel Prize in Literature                     0       0      14
      Golden Pen Award                              0       0      11
      Franz Kafka Prize                             0       0       5
      American Book Medal                           0       0      16
      Jerusalem Prize                               0       0      11

      Get books of winners of prize                TD   ANGIE   SUSIE
      Nobel Prize Literature                        0       0     198
      Golden Pen Award                              0       0     228
      Franz Kafka Prize                             0       0     132
      Jerusalem Prize                               0       0     220

      Get books of winners by prize and country    TD   ANGIE   SUSIE
      Nobel Prize Literature, France                0       0     144
      Franz Kafka Prize, UK                         0       0      79
  • 29. Related Work: Answering Queries using Views • Maximal contained rewritings (MCR): plans computing the largest number of answers • Approaches based on reducing the number of irrelevant calls: Benedict & al., PODS 2011, VLDB 2012; S. Kambhampa, JIIC 2004 • SUSIE does not target maximal contained rewritings • Relevant calls for MCR include all calls that might return results • Smart calls are a subset of relevant calls.
  • 30. SUSIE • Addressed the problem of asymmetric accesses • A novel approach to answer such queries, where the inputs for the Web service call are extracted on the fly from the Web • New evaluation algorithm that prioritizes smart calls • An experimental evaluation using a representative set of queries and real data sources. Reference: SUSIE: Search Using Services and Information Extraction. With F. M. Suchanek, W. Yuan, G. Weikum, ICDE 2013
  • 31. Ongoing work. Given a query Q and a set of functions F, compute all smart plans (for which it can be proven that they return answers)
  • 33. Web Service API • Web Services for applications ≅ Web forms for humans • An API = a collection of Web services • A Web Service • expects bindings for input parameters • returns structured data: XML or JSON. Example call result: <geonames> <country> <ccode> AR </ccode> <cname> Argentina </cname> <isonumeric>032</isonumeric> <fipscode> ARG </fipscode> <continent> SA </continent> <continentName> Argentina </continentName> <capital> Buenos Aires </capital> <cities> <city> <name>Buenos Aires</name>
  • 34. Goals. For every Web service: 1) Compute a parameterized query (relations are from the KB) 2) Compute an XSLT transformation script to be applied to every call result: XML result → results for the parameterized query
  • 35. 1) Parameterized query for getCountryByName: getCountryByName(country, name, time-zone, capital, type, lat, lng, city, c_lat, c_lng) [Figure: query graph over the relations label, hasCapital, time-zone, hasCity, type, lat, lng, next to the XML tree of the call result getCountryByName(Argentina), whose text nodes include “Republic”, “ARS”, “Argentina”, “Buenos Aires”, “-34”, “-64”, “Córdoba”, “-31.40833”, “-64.18388”]
  • 36. 2) An XSLT transformation for all call results [Figure: XML tree of the call result for Romania, whose text nodes include “Republic”, “GMT+2”, “Romania”, “Bucharest”, “44.4”, “26.1”, “Rm Valcea”, “45.1”, “24”] Resulting tuples: getCountryByName(Romania, GMT+2, Bucharest, Republic, 44.4, 26.1, Bucharest, 44.4, 26.1); getCountryByName(Romania, GMT+2, Bucharest, Republic, 44.4, 26.1, Rm Valcea, 45.1, 24)
  • 37. General Challenges • Heterogeneity: every Web service has its own schema for outputs • Schemas are unknown • >85% of Web services are implemented using REST • REST Web services do not expose schema descriptions. Our approach: use the overlap between Web services and Knowledge Bases
  • 38. Intuition [Figure: the XML tree of the call result for Argentina (root r, text nodes such as “Argentina” and “Buenos Aires”) shown next to the KB facts label(URI 1, Argentina), hasCapital(URI 1, URI 2), label(URI 2, Buenos Aires); the root r corresponds to URI 1]
  • 39. Three-step algorithm 1) Align root-to-text-node paths to paths from the input in the KB 2) Compute class and relation alignment candidates satisfying functional constraints 3) For each candidate, compute transformation functions and check inclusion and equivalence for the non-functional relations. Observation: the first 2 steps alone lead to a precision/recall of around 90%
  • 40. DORIS: Some experimental results. More than 50 Web services from 4 domains: • Books • Movies • Music • Geo data
      KB        Precision (Classes)   Precision (Relations)   Recall (Classes)   Recall (Relations)
      YAGO      0.92                  0.91                    0.96               0.93
      DBpedia   0.89                  0.88                    0.98               0.95
      BNF       1                     1                       1                  1
  • 41. Summary • Addressed the problem of inferring views • An instance-based approach to the schema matching problem • An experimental evaluation using real Web sources. Reference: DORIS: Discovering ontological relations in sources. With Mary Koutraki, Dan Vodislav, in preparation [Figure: the getCountryByName(country, name, time-zone, capital, type, lat, lng, city, c_lat, c_lng) query graph and the geonames XML call result for Argentina]
  • 42. Our work • Query Evaluation using Web Service APIs • Mapping Web Services to Knowledge Bases [Figure: SUSIE (Web Services, WWW); ANGIE (Web Services, KB); DORIS (Web Services, Knowledge Base)]
  • 43. Same plan as a graph [Figure: getCurrentHeadOfState(Brazil) yields Dilma Rousseff (President, 1 January 2011); successive getPredecessor calls yield Lula da Silva (President, 1 January 2003) and Henrique Cardoso (President, 1 January 1995); the query asks for type King]
  • 44. IE: Authors who won prize X
      Prize           Precision   Recall
      National Book   38%         59%
      Phoenix         62%         44%
      Jerusalem       23%         52%
      Pulitzer        78%         79%
      Franz Kafka     25%         73%
      Prix Femina     31%         13%
      Prix Decembre   28%         6%
      Nobel Prize     41%         29%
      Golden Pen      25%         73%
  • 45. Challenges of an instance-based approach • XML elements do not correspond to entities in the KB • Entities in the KB are URIs and are not to be found in call results • What is an entity in the XML call result? • Spurious matches (Argentina is a capital and also a person). Idea: align properties expressed as text or literals first
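
The problem description on slide 8 and the function model on slide 9 can be illustrated with a minimal Python sketch. It is not ANGIE's implementation: the remote service is simulated by an in-memory dictionary, and the names `REMOTE`, `KB`, `CallBudget`, `get_children` and `answer_birthplace` are hypothetical, introduced only for this illustration. The sketch binds the inputs of getChildren from a local query on the KB, calls the function, and stops once the bound Max on calls is reached.

```python
from dataclasses import dataclass

# Simulated remote source (assumption): maps the bound inputs
# (parent, p_place) to the outputs (?child, ?c_place).
REMOTE = {
    ("Pedro I of Brazil", "Queluz Palace, Lisbon"): [
        ("Pedro II of Brazil", "Palace of São Cristóvão, Rio de Janeiro"),
    ],
}

# Local knowledge base (assumption): a set of (subject, relation, object) facts.
KB = {
    ("Pedro I of Brazil", "hasChild", "Pedro II of Brazil"),
    ("Pedro I of Brazil", "birthplace", "Queluz Palace, Lisbon"),
}


@dataclass
class CallBudget:
    """Counts Web service calls against the bound Max."""
    max_calls: int
    used: int = 0

    def try_charge(self) -> bool:
        # Refuse the call when the budget is already spent.
        if self.used >= self.max_calls:
            return False
        self.used += 1
        return True


def get_children(parent, p_place):
    """Stand-in for getChildren(parent, p_place, ?child, ?c_place).

    Both inputs must be bound before the call; the outputs are bound by it.
    """
    if parent is None or p_place is None:
        raise ValueError("inputs must be bound before the call")
    return list(REMOTE.get((parent, p_place), []))


def answer_birthplace(person, kb, budget):
    """Answer birthplace(person, ?place) using at most budget.max_calls calls."""
    answers = set()
    # Local query on the KB: find (parent, p_place) pairs that bind the inputs.
    parents = sorted(s for (s, r, o) in kb if r == "hasChild" and o == person)
    for parent in parents:
        places = sorted(o for (s, r, o) in kb if s == parent and r == "birthplace")
        for p_place in places:
            if not budget.try_charge():
                return answers  # budget exhausted: return what was found so far
            for child, c_place in get_children(parent, p_place):
                if child == person:
                    answers.add(c_place)
    return answers
```

Calling `answer_birthplace("Pedro II of Brazil", KB, CallBudget(max_calls=1))` spends exactly one simulated call; with `max_calls=0` no call is made and the answer set stays empty.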

Editor's Notes

  1. A method is to applications what Web forms are to Internet users. Every method is a predefined but unknown parameterized query.
  2. Heterogeneity of schemas: every Web service method has its own schema.
  3. Heterogeneity of schemas: every Web service method has its own schema.
  4. We use a general-purpose knowledge base as the global schema. We rely on the data, not on the schema, to infer the mappings: REST Web services give us no schema information and no constraints that we can take into account. Related work relies on schema information. We compute the overlap between sources at the level of instances (leaves in XML, literal nodes in RDF).
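
The instance-level overlap described in note 4 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the DORIS implementation: the XML sample is a shortened, well-formed variant of the geonames result from the slides, the tiny `KB` triple list and the names `xml_text_paths`, `kb_literal_paths` and `align` are hypothetical, and matching is plain case-insensitive string equality.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed sample call result (assumption).
CALL_RESULT = """
<geonames>
  <country>
    <ccode> AR </ccode>
    <cname> Argentina </cname>
    <capital> Buenos Aires </capital>
  </country>
</geonames>
"""

# Toy knowledge base (assumption): (subject, relation, object) triples.
KB = [
    ("URI1", "label", "Argentina"),
    ("URI1", "hasCapital", "URI2"),
    ("URI2", "label", "Buenos Aires"),
]


def xml_text_paths(elem, prefix=""):
    """Yield (root-to-leaf path, text) for every leaf element that has text."""
    path = f"{prefix}/{elem.tag}"
    text = (elem.text or "").strip()
    if len(elem) == 0 and text:
        yield path, text
    for child in elem:
        yield from xml_text_paths(child, path)


def kb_literal_paths(kb, start, max_depth=2):
    """Yield (relation path, literal) for literals reachable from start."""
    entities = {s for s, _, _ in kb}  # objects that are subjects count as entities
    frontier = [(start, ())]
    for _ in range(max_depth):
        next_frontier = []
        for node, path in frontier:
            for s, rel, obj in kb:
                if s != node:
                    continue
                if obj in entities:
                    next_frontier.append((obj, path + (rel,)))
                else:
                    yield path + (rel,), obj
        frontier = next_frontier


def align(xml_string, kb, input_entity):
    """Map each XML leaf path to the KB relation paths whose literal matches."""
    root = ET.fromstring(xml_string.strip())
    kb_paths = list(kb_literal_paths(kb, input_entity))
    candidates = {}
    for xml_path, text in xml_text_paths(root):
        for rel_path, literal in kb_paths:
            if text.lower() == literal.lower():
                candidates.setdefault(xml_path, []).append(rel_path)
    return candidates
```

With the sample data, `align(CALL_RESULT, KB, "URI1")` pairs `/geonames/country/cname` with the path `label` and `/geonames/country/capital` with the path `hasCapital`, `label`; the `ccode` leaf finds no matching literal and is left out. Exact string equality is the simplest possible matcher and would need to be loosened for real data.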