Keyword Search over RDF Graphs 
Shady Elbassuoni* and Roi Blanco** 
* Max-Planck Institute for Informatics 
** Yahoo! Research, Barcelona
Shady Elbassuoni, Keyword Search over RDF Graphs, CIKM 2011 
RDF Datasets 
subject            predicate    object
Traffic            hasWonPrize  Academy_Award
Innerspace         hasWonPrize  Academy_Award
Innerspace         hasGenre     Comedy
Joe_Dante          directed     Innerspace
Toy_Story          hasWonPrize  Academy_Award
Road_Trip          hasGenre     Comedy
Toy_Story          hasGenre     Comedy
Tom_Hanks          actedIn      Toy_Story
Diner              hasWonPrize  Academy_Award
Diner              type         Comedy_films
Steve_Guttenberg   actedIn      Diner
The_Pink_Panther   type         Criminal_comedy_films
The_Pink_Panther   hasWonPrize  Academy_Award
Police_Academy     type         Comedy_films
Steve_Guttenberg   actedIn      Police_Academy
The_Darwin_Awards  type         Comedy_films
Searching RDF Data 
• Structured triple-pattern queries (SPARQL)
• Example: comedies that have won an Academy Award
SELECT ?m
WHERE { ?m hasGenre Comedy . ?m hasWonPrize Academy_Award }
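As a rough illustration of what the triple-pattern query above computes, the following Python sketch evaluates the two patterns over a few of the triples from the RDF Datasets slide. The in-memory list and the helper names `match` and `comedies_with_award` are assumptions made for illustration; a real system would use a SPARQL engine.

```python
# Toy triple store: a few (subject, predicate, object) rows from the example dataset.
TRIPLES = [
    ("Traffic", "hasWonPrize", "Academy_Award"),
    ("Innerspace", "hasWonPrize", "Academy_Award"),
    ("Innerspace", "hasGenre", "Comedy"),
    ("Toy_Story", "hasWonPrize", "Academy_Award"),
    ("Road_Trip", "hasGenre", "Comedy"),
    ("Toy_Story", "hasGenre", "Comedy"),
    ("Diner", "hasWonPrize", "Academy_Award"),
    ("Diner", "type", "Comedy_films"),
]


def match(predicate, obj):
    """Return the set of subjects ?m with a triple (?m, predicate, obj)."""
    return {s for (s, p, o) in TRIPLES if p == predicate and o == obj}


def comedies_with_award():
    # Both patterns share the variable ?m, so the answer is the intersection.
    return sorted(match("hasGenre", "Comedy") & match("hasWonPrize", "Academy_Award"))


print(comedies_with_award())
```

Note that Diner is not returned: it is typed as Comedy_films rather than linked by hasGenre to Comedy, so the exact pattern does not match it.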
Searching RDF Data 
• Triple-pattern queries are very expressive but not very usable
• Most users and search APIs prefer keyword queries
Goal: support keyword search over RDF graphs
Keyword Search over RDF Data 
• How to process keyword queries?
  - Translate keyword queries into SPARQL
  - Directly process the queries over the RDF graph
• What are the results to a keyword query?
  - Resources
  - Triples
  - Tuples of triples (subgraphs)
Processing Keyword Queries 
• Construct a document D(t) for each triple t
• D(t) contains all literals in t and any text associated with the URIs in t
• Example:
t: Innerspace hasGenre Comedy
D(t): innerspace USA 1987 science fiction comedy film Joe Dante Michael Finnell Dennis Quaid Martin Short Meg Ryan academy award best visual effects …
We can now create triple-term indexes
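The document construction and indexing step can be sketched as follows. This is a minimal sketch under stated assumptions: the `URI_TEXT` mapping, its contents, the whitespace tokenization, and the fallback to the URI's own name are all hypothetical, since the slides do not say how the associated text is gathered or tokenized.

```python
from collections import defaultdict

# Hypothetical text associated with each URI; the descriptions are illustrative only.
URI_TEXT = {
    "Innerspace": "innerspace science fiction comedy film",
    "Comedy": "comedy",
    "Academy_Award": "academy award",
    "hasGenre": "genre",
    "hasWonPrize": "won prize",
}


def build_document(triple):
    """Build D(t): the terms of the text attached to each URI in the triple."""
    terms = []
    for uri in triple:
        # Assumed fallback: use the URI's own name when no text is recorded for it.
        text = URI_TEXT.get(uri, uri.replace("_", " "))
        terms.extend(text.lower().split())
    return terms


def build_index(triples):
    """Map each term to the set of triples whose document contains it."""
    index = defaultdict(set)
    for t in triples:
        for term in build_document(t):
            index[term].add(t)
    return index


triples = [
    ("Innerspace", "hasGenre", "Comedy"),
    ("Innerspace", "hasWonPrize", "Academy_Award"),
]
index = build_index(triples)
print(sorted(index["comedy"]))
```

In this toy data both triples are indexed under "comedy", because the text attached to Innerspace contributes to the document of every triple that mentions it.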
Retrieving Query Results 
• Retrieve a list of triples matching each query keyword
• Join the triples from different lists based on their URIs
Query: comedy award
Triples matching "comedy":
Innerspace hasGenre Comedy
Road_Trip hasGenre Comedy
Toy_Story hasGenre Comedy
Diner type Comedy_films
Police_Academy type Comedy_films
The_Darwin_Awards type Comedy_films
...
Triples matching "award":
Traffic hasWonPrize Academy_Award
Innerspace hasWonPrize Academy_Award
Toy_Story hasWonPrize Academy_Award
Diner hasWonPrize Academy_Award
The_Darwin_Awards type Comedy_films
...
Joined results:
T: Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
T: Toy_Story hasGenre Comedy . Toy_Story hasWonPrize Academy_Award
T: Police_Academy type Comedy_films . The_Darwin_Awards type Comedy_films
Result ranking is crucial!
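The retrieve-and-join step can be sketched for a two-keyword query as below. The `MATCHES` table stands in for an index lookup and is hypothetical, and the join condition (two triples share a subject or object URI) is an assumption; the slides only say that triples are joined on their URIs.

```python
from itertools import product

# Hypothetical per-keyword triple lists, as an index lookup might return them.
MATCHES = {
    "comedy": [
        ("Innerspace", "hasGenre", "Comedy"),
        ("Toy_Story", "hasGenre", "Comedy"),
        ("Police_Academy", "type", "Comedy_films"),
    ],
    "award": [
        ("Innerspace", "hasWonPrize", "Academy_Award"),
        ("Toy_Story", "hasWonPrize", "Academy_Award"),
        ("The_Darwin_Awards", "type", "Comedy_films"),
    ],
}


def shares_uri(t1, t2):
    """Assumed join condition: the triples share a subject or object URI."""
    return bool({t1[0], t1[2]} & {t2[0], t2[2]})


def join(kw1, kw2):
    """Pair one triple per keyword list, keeping distinct pairs that share a URI."""
    return [(t1, t2) for t1, t2 in product(MATCHES[kw1], MATCHES[kw2])
            if t1 != t2 and shares_uri(t1, t2)]


results = join("comedy", "award")
for t1, t2 in results:
    print("T:", " ".join(t1), ".", " ".join(t2))
```

The third pair joins Police_Academy and The_Darwin_Awards only through the shared object Comedy_films, which is the kind of result that ranking has to push down.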
Language Models for Triples 
t: Innerspace hasGenre Comedy
Estimate P(w|D(t)) from the document D(t):
w           P(w|D(t))
innerspace  0.234
1987        0.123
science     0.012
fiction     0.020
comedy      0.111
film        0.179
classic     0.111
meg         0.019
ryan        0.019
oscar       0.148
...         ...
[Figure: plot of P(w) over words w]
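A per-triple language model can be sketched with a plain maximum-likelihood estimate. The slides do not state which estimator or smoothing is used, so the unsmoothed estimate, the function name `language_model`, and the sample document are assumptions for illustration.

```python
from collections import Counter


def language_model(document_terms):
    """Maximum-likelihood estimate of P(w | D(t)) from term counts (no smoothing)."""
    counts = Counter(document_terms)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


# Hypothetical document for t: Innerspace hasGenre Comedy.
doc = "innerspace science fiction comedy film comedy classic".split()
model = language_model(doc)
print(model["comedy"])
```

A practical system would normally smooth these estimates against a collection-wide distribution so that unseen words do not get zero probability.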
Ranking Model 
comedy award 
T: Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award 
but we treat triples as bags of words!
probability of the structure of triple t being relevant to keyword w
Estimating Structural Relevance 
• For each keyword, construct a probability distribution over predicates
• Example: award
r                P(r|w)
hasWonPrize      0.459
wasNominatedFor  0.387
type             0.112
directed         0.020
actedIn          0.021
producedIn       0.025
bornIn           0.008
...              ...
(estimated from the whole dataset)
P(Innerspace hasWonPrize Academy_Award | award) = P(hasWonPrize | award)
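One way to picture the predicate distribution is to count, among all triples whose document contains the keyword, how often each predicate occurs. This estimator is an assumption, as the slides only say the distribution is estimated from the whole dataset; the `DATA` pairs and function names are hypothetical.

```python
from collections import Counter

# Hypothetical (triple, document terms) pairs; the documents are illustrative only.
DATA = [
    (("Innerspace", "hasWonPrize", "Academy_Award"), {"innerspace", "academy", "award"}),
    (("Toy_Story", "hasWonPrize", "Academy_Award"), {"toy", "story", "academy", "award"}),
    (("The_Darwin_Awards", "type", "Comedy_films"), {"darwin", "award", "comedy"}),
    (("Road_Trip", "hasGenre", "Comedy"), {"road", "trip", "comedy"}),
]


def predicate_distribution(keyword):
    """Assumed estimator: share of each predicate among triples matching the keyword."""
    counts = Counter(t[1] for t, terms in DATA if keyword in terms)
    total = sum(counts.values())
    return {r: c / total for r, c in counts.items()}


def structural_relevance(triple, keyword):
    """P(t | w) is taken to be P(r | w), where r is the predicate of t."""
    return predicate_distribution(keyword).get(triple[1], 0.0)


p = predicate_distribution("award")
print(p)
```

With this toy data, hasWonPrize receives two thirds of the probability mass for "award" and type one third, mirroring the shape (not the values) of the table above.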
Example Ranked Query Results 
Query: comedy award
Bag of Words:
1. Combat_Academy type Comedy_films . The_Darwin_Awards type Comedy_films
2. Police_Academy type Comedy_films . The_Darwin_Awards type Comedy_films
3. Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
Structure Aware:
1. Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
2. Toy_Story hasGenre Comedy . Toy_Story hasWonPrize Academy_Award
3. Shrek hasWonPrize Academy_Award_Best_Animated_Feature . Shrek hasGenre Comedy
Experimental Setup 
• User study over two RDF datasets:
  - movies from IMDB
  - books from LibraryThing
• Models compared:
  - Structure Aware Approach
  - Bag of Words Approach
  - Language-model-based Object Retrieval
  - BANKS (keyword search over databases)
Experimental Setup 
• 30 evaluation queries
• Gathered relevance assessments for the top-50 results retrieved by each model
Experimental Results 
P-value < 0.05
Conclusion 
• Keyword search over RDF data is crucial
• To support keyword search over RDF data:
  - Combine structured triples with text
    - Construct a document for each triple
  - Retrieve meaningful query results
    - Tuples of joined triples
    - Can be extended to larger subgraphs of the RDF graph
  - Rank the retrieved results
    - A language model approach that uses both text and structure

More Related Content

Viewers also liked

Chemical properties review
Chemical properties reviewChemical properties review
Chemical properties reviewmshenry
 
36 topsfield rd power point
36 topsfield rd power point36 topsfield rd power point
36 topsfield rd power point
jhoyle
 
ESWC 2015 - EU Networking Session
ESWC 2015 - EU Networking SessionESWC 2015 - EU Networking Session
ESWC 2015 - EU Networking Session
Erik Mannens
 
African Nova Scotian Community Economic Data
African Nova Scotian Community Economic DataAfrican Nova Scotian Community Economic Data
African Nova Scotian Community Economic Data
Halifax Partnership
 
Smart business Annual Report 2011
Smart business Annual Report 2011Smart business Annual Report 2011
Smart business Annual Report 2011
Halifax Partnership
 
Diabetes For Dummies, 3rd Edition by Alan L. Rubin, MD Index
Diabetes For Dummies, 3rd Edition by Alan L. Rubin, MD IndexDiabetes For Dummies, 3rd Edition by Alan L. Rubin, MD Index
Diabetes For Dummies, 3rd Edition by Alan L. Rubin, MD Index
AlanLRubinMD
 
#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento
#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento
#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento
CESSI ArgenTIna
 
Donaldson Orientation
Donaldson OrientationDonaldson Orientation
Donaldson OrientationLeah Vestal
 
Flanders Open Data Day II - KeyNote - Erik Mannens
Flanders Open Data Day II - KeyNote - Erik MannensFlanders Open Data Day II - KeyNote - Erik Mannens
Flanders Open Data Day II - KeyNote - Erik MannensErik Mannens
 
Our back to school album
Our back to school albumOur back to school album
Our back to school albumLisa Baird
 
Networking 101
Networking 101Networking 101
Networking 101
Halifax Partnership
 
Albany Bassmasters Meeting Minutes - February 2011
Albany Bassmasters Meeting Minutes - February 2011Albany Bassmasters Meeting Minutes - February 2011
Albany Bassmasters Meeting Minutes - February 2011Felix Ortiz
 
Entity Linking via Graph-Distance Minimization
Entity Linking via Graph-Distance MinimizationEntity Linking via Graph-Distance Minimization
Entity Linking via Graph-Distance Minimization
Roi Blanco
 

Viewers also liked (19)

Chemical properties review
Chemical properties reviewChemical properties review
Chemical properties review
 
36 topsfield rd power point
36 topsfield rd power point36 topsfield rd power point
36 topsfield rd power point
 
ESWC 2015 - EU Networking Session
ESWC 2015 - EU Networking SessionESWC 2015 - EU Networking Session
ESWC 2015 - EU Networking Session
 
African Nova Scotian Community Economic Data
African Nova Scotian Community Economic DataAfrican Nova Scotian Community Economic Data
African Nova Scotian Community Economic Data
 
Gic2011 aula5-ingles
Gic2011 aula5-inglesGic2011 aula5-ingles
Gic2011 aula5-ingles
 
Gic2011 aula6-ingles
Gic2011 aula6-inglesGic2011 aula6-ingles
Gic2011 aula6-ingles
 
Smart business Annual Report 2011
Smart business Annual Report 2011Smart business Annual Report 2011
Smart business Annual Report 2011
 
Gic2011 aula8-ingles
Gic2011 aula8-inglesGic2011 aula8-ingles
Gic2011 aula8-ingles
 
Diabetes For Dummies, 3rd Edition by Alan L. Rubin, MD Index
Diabetes For Dummies, 3rd Edition by Alan L. Rubin, MD IndexDiabetes For Dummies, 3rd Edition by Alan L. Rubin, MD Index
Diabetes For Dummies, 3rd Edition by Alan L. Rubin, MD Index
 
Mission mars
Mission mars Mission mars
Mission mars
 
#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento
#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento
#ForoEGovAR | Bases para las Políticas para las Sociedades del Conocimiento
 
Presentation1
Presentation1Presentation1
Presentation1
 
Dentist appointment
Dentist appointmentDentist appointment
Dentist appointment
 
Donaldson Orientation
Donaldson OrientationDonaldson Orientation
Donaldson Orientation
 
Flanders Open Data Day II - KeyNote - Erik Mannens
Flanders Open Data Day II - KeyNote - Erik MannensFlanders Open Data Day II - KeyNote - Erik Mannens
Flanders Open Data Day II - KeyNote - Erik Mannens
 
Our back to school album
Our back to school albumOur back to school album
Our back to school album
 
Networking 101
Networking 101Networking 101
Networking 101
 
Albany Bassmasters Meeting Minutes - February 2011
Albany Bassmasters Meeting Minutes - February 2011Albany Bassmasters Meeting Minutes - February 2011
Albany Bassmasters Meeting Minutes - February 2011
 
Entity Linking via Graph-Distance Minimization
Entity Linking via Graph-Distance MinimizationEntity Linking via Graph-Distance Minimization
Entity Linking via Graph-Distance Minimization
 

Similar to Keyword Search over RDF Graphs

Introduction to Cypher
Introduction to Cypher Introduction to Cypher
Introduction to Cypher
Neo4j
 
Comparing Index Structures for Completeness Reasoning
Comparing Index Structures for Completeness ReasoningComparing Index Structures for Completeness Reasoning
Comparing Index Structures for Completeness Reasoning
Fariz Darari
 
lecture04_movie_discussion.pdf
lecture04_movie_discussion.pdflecture04_movie_discussion.pdf
lecture04_movie_discussion.pdf
KRISLAM4
 
HARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing
HARE: A Hybrid SPARQL Engine to Enhance Query Answers via CrowdsourcingHARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing
HARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing
Maribel Acosta Deibe
 
IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...
IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...
IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...
Abhay Prakash
 
Introduction to Knowledge Graphs with Grakn and Graql
Introduction to Knowledge Graphs with Grakn and Graql Introduction to Knowledge Graphs with Grakn and Graql
Introduction to Knowledge Graphs with Grakn and Graql
Vaticle
 
2011 Search Query Rewrites - Synonyms & Acronyms
2011 Search Query Rewrites - Synonyms & Acronyms2011 Search Query Rewrites - Synonyms & Acronyms
2011 Search Query Rewrites - Synonyms & Acronyms
Brian Johnson
 
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge BasesLeveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
Nandana Mihindukulasooriya
 
Mining Interesting Trivia for Entities from Wikipedia PART-II
Mining Interesting Trivia for Entities from Wikipedia PART-IIMining Interesting Trivia for Entities from Wikipedia PART-II
Mining Interesting Trivia for Entities from Wikipedia PART-II
Abhay Prakash
 

Similar to Keyword Search over RDF Graphs (9)

Introduction to Cypher
Introduction to Cypher Introduction to Cypher
Introduction to Cypher
 
Comparing Index Structures for Completeness Reasoning
Comparing Index Structures for Completeness ReasoningComparing Index Structures for Completeness Reasoning
Comparing Index Structures for Completeness Reasoning
 
lecture04_movie_discussion.pdf
lecture04_movie_discussion.pdflecture04_movie_discussion.pdf
lecture04_movie_discussion.pdf
 
HARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing
HARE: A Hybrid SPARQL Engine to Enhance Query Answers via CrowdsourcingHARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing
HARE: A Hybrid SPARQL Engine to Enhance Query Answers via Crowdsourcing
 
IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...
IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...
IJCAI 2015 Presentation: Did you know?- Mining Interesting Trivia for Entitie...
 
Introduction to Knowledge Graphs with Grakn and Graql
Introduction to Knowledge Graphs with Grakn and Graql Introduction to Knowledge Graphs with Grakn and Graql
Introduction to Knowledge Graphs with Grakn and Graql
 
2011 Search Query Rewrites - Synonyms & Acronyms
2011 Search Query Rewrites - Synonyms & Acronyms2011 Search Query Rewrites - Synonyms & Acronyms
2011 Search Query Rewrites - Synonyms & Acronyms
 
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge BasesLeveraging Semantic Parsing for Relation Linking over Knowledge Bases
Leveraging Semantic Parsing for Relation Linking over Knowledge Bases
 
Mining Interesting Trivia for Entities from Wikipedia PART-II
Mining Interesting Trivia for Entities from Wikipedia PART-IIMining Interesting Trivia for Entities from Wikipedia PART-II
Mining Interesting Trivia for Entities from Wikipedia PART-II
 

More from Roi Blanco

From Queries to Answers in the Web
From Queries to Answers in the WebFrom Queries to Answers in the Web
From Queries to Answers in the Web
Roi Blanco
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Roi Blanco
 
Mining Web content for Enhanced Search
Mining Web content for Enhanced Search Mining Web content for Enhanced Search
Mining Web content for Enhanced Search
Roi Blanco
 
Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement
Roi Blanco
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
Roi Blanco
 
Searching over the past, present and future
Searching over the past, present and futureSearching over the past, present and future
Searching over the past, present and future
Roi Blanco
 
Beyond document retrieval using semantic annotations
Beyond document retrieval using semantic annotations Beyond document retrieval using semantic annotations
Beyond document retrieval using semantic annotations
Roi Blanco
 
Large-Scale Semantic Search
Large-Scale Semantic SearchLarge-Scale Semantic Search
Large-Scale Semantic Search
Roi Blanco
 
Extending BM25 with multiple query operators
Extending BM25 with multiple query operatorsExtending BM25 with multiple query operators
Extending BM25 with multiple query operators
Roi Blanco
 
Energy-Price-Driven Query Processing in Multi-center Web Search Engines
Energy-Price-Driven Query Processing in Multi-center WebSearch EnginesEnergy-Price-Driven Query Processing in Multi-center WebSearch Engines
Energy-Price-Driven Query Processing in Multi-center Web Search Engines
Roi Blanco
 
Effective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataEffective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF data
Roi Blanco
 
Caching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesCaching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental Indices
Roi Blanco
 
Finding support sentences for entities
Finding support sentences for entitiesFinding support sentences for entities
Finding support sentences for entities
Roi Blanco
 

More from Roi Blanco (13)

From Queries to Answers in the Web
From Queries to Answers in the WebFrom Queries to Answers in the Web
From Queries to Answers in the Web
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Mining Web content for Enhanced Search
Mining Web content for Enhanced Search Mining Web content for Enhanced Search
Mining Web content for Enhanced Search
 
Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement Influence of Timeline and Named-entity Components on User Engagement
Influence of Timeline and Named-entity Components on User Engagement
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
 
Searching over the past, present and future
Searching over the past, present and futureSearching over the past, present and future
Searching over the past, present and future
 
Beyond document retrieval using semantic annotations
Beyond document retrieval using semantic annotations Beyond document retrieval using semantic annotations
Beyond document retrieval using semantic annotations
 
Large-Scale Semantic Search
Large-Scale Semantic SearchLarge-Scale Semantic Search
Large-Scale Semantic Search
 
Extending BM25 with multiple query operators
Extending BM25 with multiple query operatorsExtending BM25 with multiple query operators
Extending BM25 with multiple query operators
 
Energy-Price-Driven Query Processing in Multi-center Web Search Engines
Energy-Price-Driven Query Processing in Multi-center WebSearch EnginesEnergy-Price-Driven Query Processing in Multi-center WebSearch Engines
Energy-Price-Driven Query Processing in Multi-center Web Search Engines
 
Effective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF dataEffective and Efficient Entity Search in RDF data
Effective and Efficient Entity Search in RDF data
 
Caching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental IndicesCaching Search Engine Results over Incremental Indices
Caching Search Engine Results over Incremental Indices
 
Finding support sentences for entities
Finding support sentences for entitiesFinding support sentences for entities
Finding support sentences for entities
 

Recently uploaded

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
nkrafacyberclub
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
DianaGray10
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024GraphSummit Singapore | The Art of the  Possible with Graph - Q2 2024
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024
Neo4j
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
GraphSummit Singapore | Graphing Success: Revolutionising Organisational Stru...
Neo4j
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 

Recently uploaded (20)

LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5UiPath Test Automation using UiPath Test Suite series, part 5
UiPath Test Automation using UiPath Test Suite series, part 5
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf

Keyword Search over RDF Graphs

  • 1. Keyword Search over RDF Graphs Shady Elbassuoni* and Roi Blanco** * Max-Planck Institute for Informatics ** Yahoo! Research, Barcelona
  • 2. Shady Elbassuoni, Keyword Search over RDF Graphs, CIKM 2011
       RDF Datasets
       subject            predicate    object
       Traffic            hasWonPrize  Academy_Award
       Innerspace         hasWonPrize  Academy_Award
       Innerspace         hasGenre     Comedy
       Joe_Dante          directed     Innerspace
       Toy_Story          hasWonPrize  Academy_Award
       Road_Trip          hasGenre     Comedy
       Toy_Story          hasGenre     Comedy
       Tom_Hanks          actedIn      Toy_Story
       Diner              hasWonPrize  Academy_Award
       Diner              type         Comedy_films
       Steve_Guttenberg   actedIn      Diner
       The_Pink_Panther   type         Criminal_comedy_films
       The_Pink_Panther   hasWonPrize  Academy_Award
       Police_Academy     type         Comedy_films
       Steve_Guttenberg   actedIn      Police_Academy
       The_Darwin_Awards  type         Comedy_films
  • 3. Searching RDF Data
       - Structured triple-pattern queries (SPARQL)
       - Example: comedies that have won an Academy Award
         SELECT ?m WHERE { ?m hasGenre Comedy . ?m hasWonPrize Academy_Award }
  • 4. Searching RDF Data
       - Triple-pattern queries are very expressive but not that usable
       - Most users and search APIs prefer keyword queries
       Support keyword search over RDF graphs
  • 5. Keyword Search over RDF Data
       - How to process keyword queries?
         - Translate keyword queries into SPARQL
         - Directly process the queries over the RDF graph
       - What are the results to a keyword query?
         - Resources
         - Triples
         - Tuples of triples (subgraphs)
  • 9. Processing Keyword Queries
       - Construct a document D(t) for each triple t
       - D(t) contains all literals in t and any text associated with the URIs in t
       - Example: t: Innerspace hasGenre Comedy
         innerspace USA 1987 science fiction comedy film Joe Dante Michael Finnell Dennis Quaid Martin Short Meg Ryan academy award best visual effects …
       We can now create triple-term indexes
  • 12. Retrieving Query Results
        - Retrieve a list of triples matching a query keyword
        - Join the triples from different lists based on their URIs
        Query: comedy award
        Triples matching "comedy":
          Innerspace hasGenre Comedy
          Road_Trip hasGenre Comedy
          Toy_Story hasGenre Comedy
          Diner type Comedy_films
          Police_Academy type Comedy_films
          The_Darwin_Awards type Comedy_films
          ...
        Triples matching "award":
          Traffic hasWonPrize Academy_Award
          Innerspace hasWonPrize Academy_Award
          Toy_Story hasWonPrize Academy_Award
          Diner hasWonPrize Academy_Award
          The_Darwin_Awards type Comedy_films
          ...
        Results:
          T: Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
          T: Toy_Story hasGenre Comedy . Toy_Story hasWonPrize Academy_Award
          T: Police_Academy type Comedy_films . The_Darwin_Awards type Comedy_films
        Result ranking is crucial!!
  • 13. Language Models for Triples
        t: Innerspace hasGenre Comedy
        Estimate P(w|D(t)) from D(t):
        w           P(w|D(t))
        innerspace  0.234
        1987        0.123
        science     0.012
        fiction     0.020
        comedy      0.111
        film        0.179
        classic     0.111
        meg         0.019
        ryan        0.019
        oscar       0.148
        . . .       . . .
  • 14. Ranking Model
        Query: comedy award
        T: Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
        but we treat triples as bags of words!
  • 15. Ranking Model
        Query: comedy award
        T: Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
        probability of the structure of triple t being relevant to keyword w
  • 16. Estimating Structural Relevance
        - For each keyword, construct a probability distribution over predicates
        - Example: award (estimated from the whole dataset)
          r                P(r|w)
          hasWonPrize      0.459
          wasNominatedFor  0.387
          type             0.112
          directed         0.020
          actedIn          0.021
          producedIn       0.025
          bornIn           0.008
          . . .            . . .
        P(Innerspace hasWonPrize Academy_Award | award) = P(hasWonPrize | award)
  • 17. Example Ranked Query Results
        Query: comedy award
        Bag of Words:
          Combat_Academy type Comedy_films . The_Darwin_Awards type Comedy_films
          Police_Academy type Comedy_films . The_Darwin_Awards type Comedy_films
          Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
        Structure Aware:
          Innerspace hasGenre Comedy . Innerspace hasWonPrize Academy_Award
          Toy_Story hasGenre Comedy . Toy_Story hasWonPrize Academy_Award
          Shrek hasWonPrize Academy_Award_Best_Animated_Feature . Shrek hasGenre Comedy
  • 18. Experimental Setup
        - User study over two RDF datasets:
          - movies from IMDB
          - books from LibraryThing
        - Models compared:
          - Structure Aware Approach
          - Bag of Words Approach
          - Language-model-based Object Retrieval
          - BANKS (keyword search over databases)
  • 19. Experimental Setup
        - 30 evaluation queries
        - Gathered relevance assessments for the top-50 results retrieved by each model
  • 20. Experimental Results (P-value < 0.05)
  • 21. Conclusion
        - Keyword search over RDF data is crucial
        - To support keyword search over RDF data:
          - Combine structured triples with text
            - Construct a document for each triple
          - Retrieve meaningful query results
            - Tuples of joined triples
            - Can be extended to larger subgraphs of the RDF graph
          - Rank the retrieved results
            - A language model approach that uses both text and structure
  • 22. Ranking Model
  • 23. RDF Graphs
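
The indexing and retrieval steps from the "Processing Keyword Queries" and "Retrieving Query Results" slides can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`normalize`, `document`, `build_index`, `search`) are hypothetical, D(t) is built only from the words in the triple's own names rather than from any richer associated text, a trailing "s" is stripped as a crude stand-in for stemming, and only two-keyword queries are handled.

```python
from collections import defaultdict

# A subset of the triples shown on the "RDF Datasets" slide.
TRIPLES = [
    ("Traffic", "hasWonPrize", "Academy_Award"),
    ("Innerspace", "hasWonPrize", "Academy_Award"),
    ("Innerspace", "hasGenre", "Comedy"),
    ("Toy_Story", "hasWonPrize", "Academy_Award"),
    ("Road_Trip", "hasGenre", "Comedy"),
    ("Toy_Story", "hasGenre", "Comedy"),
    ("Police_Academy", "type", "Comedy_films"),
    ("The_Darwin_Awards", "type", "Comedy_films"),
]


def normalize(token):
    """Lowercase and strip a trailing 's' (assumption: crude stemming)."""
    token = token.lower()
    return token[:-1] if token.endswith("s") else token


def document(triple):
    """D(t): here just the words in the triple's names (assumption)."""
    return [normalize(tok) for name in triple for tok in name.split("_")]


def build_index(triples):
    """Triple-term index: term -> list of triples whose document contains it."""
    index = defaultdict(list)
    for triple in triples:
        for term in set(document(triple)):
            index[term].append(triple)
    return index


def search(index, first_keyword, second_keyword):
    """Join the two retrieved lists on a shared subject or object URI."""
    results = []
    for a in index.get(normalize(first_keyword), []):
        for b in index.get(normalize(second_keyword), []):
            shared = {a[0], a[2]} & {b[0], b[2]}
            # Assumption: a triple is not joined with itself.
            if a != b and shared:
                results.append((a, b))
    return results


for a, b in search(build_index(TRIPLES), "comedy", "award"):
    print(" ".join(a), ".", " ".join(b))
```

On this toy data the join yields the same three tuples as the slide, including the unintended Police_Academy / The_Darwin_Awards pair, which is why the results still need ranking.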

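The two estimates from the "Language Models for Triples" and "Estimating Structural Relevance" slides can be illustrated with a small sketch. The slide formulas did not survive in this transcript, so everything here is an assumption: the documents in `DOCS` are hand-made, `term_model` is an unsmoothed maximum-likelihood estimate of P(w|D(t)), `predicate_model` estimates P(r|w) by counting the predicates of triples whose document contains w, and `score` multiplies the two purely for illustration, which may differ from the paper's actual ranking function.

```python
from collections import Counter

# Hypothetical, hand-made documents D(t); real ones come from the dataset.
DOCS = {
    ("Innerspace", "hasGenre", "Comedy"):
        "innerspace science fiction comedy film".split(),
    ("Innerspace", "hasWonPrize", "Academy_Award"):
        "innerspace academy award visual effects".split(),
    ("Toy_Story", "hasWonPrize", "Academy_Award"):
        "toy story academy award".split(),
    ("The_Darwin_Awards", "type", "Comedy_films"):
        "darwin award comedy film".split(),
}


def term_model(tokens):
    """P(w | D(t)) as relative term frequency (assumption: no smoothing)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def predicate_model(docs, keyword):
    """P(r | w): predicate distribution over triples whose document has w."""
    counts = Counter(t[1] for t, tokens in docs.items() if keyword in tokens)
    total = sum(counts.values())
    return {pred: count / total for pred, count in counts.items()}


def score(docs, triple, keyword):
    """Illustrative combination only: text relevance times structural relevance."""
    text = term_model(docs[triple]).get(keyword, 0.0)
    structure = predicate_model(docs, keyword).get(triple[1], 0.0)
    return text * structure


for triple in DOCS:
    print(" ".join(triple), round(score(DOCS, triple, "award"), 3))
```

On text alone, the type triple for The_Darwin_Awards matches "award" as strongly as the Toy_Story prize triple. Once multiplied by P(r|w), it drops below both hasWonPrize triples, which is the effect the structure-aware ranking on the slides is after.
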
Editor's Notes

  1. If we zoom in on one of these datasets, it is basically just a set of triples with three fields: subject, predicate and object. RDF is a very flexible means of encoding structured information in a machine-readable format … for instance, the first triple here states that the movie Traffic has won an Academy Award. Note that subjects and objects are URIs or literals and predicates are URIs.
  2. So how do we search RDF data? We use structured query languages like SPARQL, where a query is a set of triple patterns. A triple pattern is just a triple with one or more variables. Let’s look at an example. Suppose we are looking for comedies that have won an Academy Award. This can be expressed using the triple-pattern query in the pink or salmon box … the ?m is a variable and the triples in the curly braces are triple patterns. In particular, the first one has predicate hasGenre and
  3. So triple patterns are really powerful and can be used to find very interesting information, but do we really expect regular users to use them? Unfortunately not! And since we computer scientists are really nice people who always try to make the lives of poor casual users easy, we need to enable them to search RDF data using keywords.
  4. An RDF dataset can also be viewed as a graph where subjects and objects are nodes and predicates represent labeled edges. For example, the triple about Traffic winning an academy award is represented using this edge.
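
To make notes 2 and 4 concrete, here is a small sketch that evaluates the example triple-pattern query over a handful of the slide's triples and then builds the graph view. It is a toy matcher written for illustration, not a SPARQL engine; the helper names `match`, `query` and `as_graph` are hypothetical, and terms starting with "?" are treated as variables.

```python
from collections import defaultdict

# A few of the triples from the "RDF Datasets" slide.
TRIPLES = [
    ("Traffic", "hasWonPrize", "Academy_Award"),
    ("Innerspace", "hasWonPrize", "Academy_Award"),
    ("Innerspace", "hasGenre", "Comedy"),
    ("Toy_Story", "hasWonPrize", "Academy_Award"),
    ("Road_Trip", "hasGenre", "Comedy"),
    ("Toy_Story", "hasGenre", "Comedy"),
]


def match(pattern, triple, bindings):
    """Unify one triple pattern with one triple; return new bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return new


def query(patterns, triples):
    """Return every variable binding that satisfies all triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in triples
            if (extended := match(pattern, triple, bindings)) is not None
        ]
    return solutions


def as_graph(triples):
    """Graph view: subject node -> list of (predicate label, object node)."""
    edges = defaultdict(list)
    for subject, predicate, obj in triples:
        edges[subject].append((predicate, obj))
    return edges


patterns = [("?m", "hasGenre", "Comedy"), ("?m", "hasWonPrize", "Academy_Award")]
print(sorted(solution["?m"] for solution in query(patterns, TRIPLES)))
print(as_graph(TRIPLES)["Traffic"])
```

Because ?m appears in both patterns, the second pattern only extends bindings produced by the first, which is what restricts the answers to movies that satisfy both conditions.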