Keyword-Based Navigation and Search over the Linked Data Web

Luca Matteis (1), Aidan Hogan (2), Roberto Navigli (1)
(1) Sapienza University of Rome
(2) University of Chile
General idea
• Browse the live linked data web using keywords
• Predicate resolution along the navigation to
increase matches
• Results are streamed back to users as quickly as
possible
• We measure how fast relevant triples are found at
each step of the navigation
Navigation
• Navigation starts from a list of starting URIs
• Users/agents provide keywords to search against
and guide the navigation
• Navigation is structured using a streaming
pipeline
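The navigation bullets above can be sketched as a chain of lazy generators. This is only an illustration under stated assumptions, not the authors' implementation: `TOY_WEB`, `dereference`, `hop`, and `navigate` are hypothetical names, the in-memory dictionary stands in for HTTP lookups on the live Web, and keywords are matched by plain substring against predicate URIs.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical stand-in for the live Web: URI -> triples returned on lookup.
TOY_WEB: Dict[str, List[Triple]] = {
    "ex:film": [
        ("ex:film", "ex:director", "ex:alice"),
        ("ex:film", "ex:title", "A Film"),
    ],
    "ex:alice": [
        ("ex:alice", "ex:knownFor", "ex:other"),
        ("ex:alice", "ex:name", "Alice"),
    ],
    "ex:other": [("ex:other", "ex:title", "Other Film")],
}


def dereference(uri: str) -> List[Triple]:
    """Stand-in for an HTTP lookup of `uri` (assumption: no network here)."""
    return TOY_WEB.get(uri, [])


def hop(uris: Iterable[str], keyword: str) -> Iterator[Triple]:
    """One pipeline element: stream triples whose predicate contains `keyword`."""
    kw = keyword.lower()
    for uri in uris:
        for s, p, o in dereference(uri):
            if kw in p.lower():
                yield (s, p, o)


def navigate(start_uris: Iterable[str], keywords: List[str]) -> Iterator[Triple]:
    """Chain one hop per keyword; objects matched by a hop feed the next hop."""
    if not keywords:
        return iter(())
    uris: Iterator[str] = iter(start_uris)
    for kw in keywords[:-1]:
        # Lazy: nothing is dereferenced until the final stream is consumed.
        uris = (o for _s, _p, o in hop(uris, kw))
    return hop(uris, keywords[-1])


if __name__ == "__main__":
    for triple in navigate(["ex:film"], ["director", "name"]):
        print(triple)
```

Because every stage is a generator, a triple matched at the last hop can be emitted before earlier hops have finished processing all of their URIs.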
Search
• Search occurs at each element of the pipeline
• Several RDF keyword search algorithms can be used
• Predicate resolution is used to increase number of
matches
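A minimal sketch of why predicate resolution can increase matches, assuming that resolving a predicate URI yields human-readable labels. The table `PREDICATE_LABELS` and the helpers `resolve_predicate`, `matches`, and `search` are hypothetical; the labels are invented for illustration, and substring matching is just one of the possible keyword search algorithms.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical labels that dereferencing each predicate URI might return.
PREDICATE_LABELS: Dict[str, List[str]] = {
    "onto:mov100": ["movement"],
    "my:lab": ["label", "name"],
}


def resolve_predicate(predicate: str) -> List[str]:
    """Stand-in for looking up a predicate URI and reading its labels."""
    return PREDICATE_LABELS.get(predicate, [])


def matches(keyword: str, predicate: str, resolve: bool = True) -> bool:
    """Match the keyword against the predicate URI and, optionally, its labels."""
    kw = keyword.lower()
    texts = [predicate]
    if resolve:
        texts += resolve_predicate(predicate)
    return any(kw in text.lower() for text in texts)


def search(triples: Iterable[Triple], keyword: str, resolve: bool = True) -> List[Triple]:
    """Keep the triples whose predicate matches the keyword."""
    return [t for t in triples if matches(keyword, t[1], resolve)]


if __name__ == "__main__":
    data = [
        ("viaf:177603646", "onto:mov100", "ex:m"),
        ("ex:m", "my:lab", "Some label"),
    ]
    print(search(data, "movement", resolve=False))  # opaque URI: no match
    print(search(data, "movement", resolve=True))   # label found via resolution
```

An opaque predicate name such as `onto:mov100` never matches the keyword "movement" on its own; it matches only once its label is available.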
SWGET comparison
• SWGET is an implementation of the NautiLOD
navigational language
• It allows triples to be filtered (through SPARQL) at
each step of the navigation
• We show that our pipeline streaming approach
results in faster response times
Results
• Total response time is under 10 seconds (varies
based on the number of keywords)
• Navigation hop time averages ~5 seconds
Discussion
• Results suggest that keyword navigation is
achievable, although response times are still slow.
• Experiments were run on the live linked data web!
Servers optimized for concurrency and high
throughput (such as Triple Pattern Fragments) might
yield faster response times.
Final remarks
• Our approach incentivizes publishers to enrich their
structured data (using predicates with meaningful
descriptions)
• Concurrent resolution of many URIs at runtime to
find answers to queries is becoming more and
more viable; increase in bandwidth is going to
make this even more usable
• Upfront querying may not be the only way we
query the Web of Linked Data
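Concurrent resolution of many URIs, as mentioned in the remarks above, could look roughly like the following sketch. It assumes a thread pool is an acceptable way to overlap lookups; `fake_fetch` is a hypothetical placeholder for a real HTTP request, and `resolve_all` is an illustrative name.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, List, Tuple


def fake_fetch(uri: str) -> List[str]:
    """Hypothetical lookup: sleeps briefly instead of doing network I/O."""
    time.sleep(0.01)
    return [f"{uri}#triple"]


def resolve_all(
    uris: Iterable[str],
    fetch: Callable[[str], List[str]] = fake_fetch,
    max_workers: int = 8,
) -> Iterator[Tuple[str, List[str]]]:
    """Look up all URIs concurrently and yield each result as soon as it is ready."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fetch, uri): uri for uri in uris}
        for future in as_completed(futures):
            uri = futures[future]
            try:
                yield uri, future.result()
            except Exception:
                # A failed lookup should not stop the rest of the navigation.
                yield uri, []


if __name__ == "__main__":
    for uri, triples in resolve_all(["ex:a", "ex:b", "ex:c"]):
        print(uri, triples)
```

Results arrive in completion order rather than submission order, which suits streaming answers back to the user as soon as they are found.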
Use case
• Typing "dir" suggests: codirector (8), redirection (4),
director (1), nadir (1), …
• Selecting "director": 1 triple found
• Typing "know" suggests: known for (17), knows (6),
knowledge of (5), …
• Selecting "known for": 17 triples found
• Typing "act" suggests: actor (56), abstract (48), …
• Selecting "actor": 56 triples found
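The suggestion lists in the use case pair each candidate with a triple count. A minimal sketch of how such a list could be computed, assuming suggestions are predicate labels containing the typed fragment, ranked by the number of matching triples; `suggest` is a hypothetical helper and the ranking rule is an assumption, not something the slides state.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def suggest(fragment: str, predicate_labels: Iterable[str]) -> List[Tuple[str, int]]:
    """Count labels containing `fragment`; rank by count, then alphabetically.

    `predicate_labels` holds one label per triple seen at the current step.
    """
    frag = fragment.lower()
    counts = Counter(label for label in predicate_labels if frag in label.lower())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    labels = ["codirector"] * 8 + ["redirection"] * 4 + ["director", "nadir", "actor"]
    for label, count in suggest("dir", labels):
        print(f"{label} ({count})")
```

Substring matching rather than prefix matching is what allows "dir" to surface "codirector" and "nadir" alongside "director".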
Users don't have to input URIs
(as they do when writing SPARQL)
Nor do they have to know the exact
structure of the underlying dataset
(they simply type keywords)
SPARQL:
SELECT * {
  <http://viaf.org/viaf/177603646> onto:mov100 ?movement .
  ?movement my:lab ?label .
}
Keyword path:
http://viaf.org/viaf/177603646 / movement / name
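A keyword path like the one above splits naturally into a starting URI and a list of keywords. The sketch below assumes steps are separated by a slash surrounded by whitespace, which keeps the slashes inside the URI intact; `parse_path` is a hypothetical helper, not part of any published tool.

```python
import re
from typing import List, Tuple


def parse_path(path: str) -> Tuple[str, List[str]]:
    """Split 'URI / keyword / keyword' into (start URI, [keywords])."""
    # Only a slash with whitespace on both sides separates steps,
    # so 'http://viaf.org/viaf/...' stays in one piece.
    parts = [part.strip() for part in re.split(r"\s+/\s+", path.strip())]
    if not parts[0]:
        raise ValueError("path must start with a URI")
    return parts[0], parts[1:]


if __name__ == "__main__":
    start, keywords = parse_path("http://viaf.org/viaf/177603646 / movement / name")
    print(start)
    print(keywords)
```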
Query federation is built-in
(we're simply following links)
http://viaf.org/viaf/177603646 / movement / same as /
movement of / born < 1960 / same as freebase / name
(the steps span VIAF, DBpedia, and Freebase, in that order)
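Some steps in the path above, such as "born < 1960", carry a comparison as well as a keyword. One way to separate the two is sketched here, assuming the step syntax is "keyword operator value" and that values are numeric; `Step`, `parse_step`, and `satisfies` are hypothetical names, and the exact filter semantics are not given in the slides.

```python
import operator
import re
from typing import NamedTuple, Optional


class Step(NamedTuple):
    keyword: str
    op: Optional[str]
    value: Optional[str]


# Two-character operators are listed first so "<=" is not read as "<".
STEP_RE = re.compile(r"^(?P<kw>.+?)\s*(?P<op><=|>=|<|>|=)\s*(?P<val>.+)$")
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
}


def parse_step(text: str) -> Step:
    """Split a step into a keyword and an optional comparison."""
    match = STEP_RE.match(text.strip())
    if match:
        return Step(match["kw"].strip(), match["op"], match["val"].strip())
    return Step(text.strip(), None, None)


def satisfies(step: Step, value: float) -> bool:
    """Check a numeric value against the step's comparison, if it has one."""
    if step.op is None or step.value is None:
        return True
    return OPS[step.op](value, float(step.value))


if __name__ == "__main__":
    step = parse_step("born < 1960")
    print(step, satisfies(step, 1955), satisfies(step, 1972))
```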
Future work
• Develop a functioning app (browser extension or
add-on to Tabulator)
• Use third-party services to assist the navigation by
matching synonyms or translations (BabelNet,
WordNet)
• Use other third-party services to assist in the
disambiguation of words using the context of the
data acquired along the navigation (Babelfy)
• Better methods for effectively crawling Linked
Datasets at runtime (that don't strain servers and
provide quick response times)
Thanks!
@lmatteis
http://lucaa.org