From Exploratory Search to Web Search and back - PIKM 2010

PIKM 2010 – Workshop for Ph.D. Students in Information and Knowledge Management
October 30, 2010 – Fairmont Royal York, Toronto, Canada
FROM EXPLORATORY SEARCH
TO WEB SEARCH AND BACK
Politecnico di Bari
Via Orabona, 4
70125 Bari (ITALY)
Roberto Mirizzi, Tommaso Di Noia
mirizzi@deemail.poliba.it, t.dinoia@poliba.it

Outline
Tags to improve Web Search
Exploratory Search
LED (Lookup Explore Discover): exploratory search in the Web (of Data)
DBpediaRanker: RDF ranking in DBpedia
Conclusion and Future work

Why do we use tags?
and many more…

What is Exploratory Search?
[Gary Marchionini. Exploratory Search: From Finding to Understanding. Communications of the ACM, 49(4): 41-46, 2006]

Can Semantic tags support Exploratory search?
Plugged into the Web 3.0
Disambiguation
Relations among tags
Machine understandable
Semantic-aided query refinement
LED: Lookup Explore Discover
http://sisinflab.poliba.it/led/
If Semantic tags helped 10% of Internet users save 10 minutes per month on their searches, this would save over 4,000,000 working hours globally per year

LED: Lookup Explore Discover
Objectives
• Enable users to properly explore the semantics of a keyword
• Guide users to refine a query by suggesting related topics/keywords
Improve lookup search to explore knowledge

What is behind LED? (i)

What is behind LED? (ii)
Comments
• DBpedia resources are highly interconnected in the RDF graph
• Not all the relevant resources for a given node are its direct neighbors
1. Explore the neighborhood of a resource to discover new relevant resources not directly connected to it
2. Rank the results
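
Step 1 can be pictured as a bounded breadth-first walk over the resource graph. The following is a minimal sketch, not the LED implementation: the function name `explore`, the `max_depth` parameter and the toy edges are all assumptions made for illustration, and the ranking of step 2 is left out.

```python
from collections import deque


def explore(graph, start, max_depth=2):
    """Return {resource: hop distance} for every resource reachable from
    `start` within `max_depth` hops (the start node itself is excluded)."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    del distances[start]
    return distances


# Toy undirected graph; the edges are invented for illustration only.
toy_graph = {
    "RDFa": ["Semantic_Web", "Microformats"],
    "Semantic_Web": ["RDFa", "Resource_Description_Framework"],
    "Microformats": ["RDFa", "Microformat"],
    "Resource_Description_Framework": ["Semantic_Web"],
    "Microformat": ["Microformats"],
}

found = explore(toy_graph, "RDFa", max_depth=2)
```

Resources found at distance 2 are exactly the candidates that are relevant but not directly connected to the starting node.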

DBpedia graph exploration in LED
[Figure: fragment of the DBpedia graph explored by LED]
Legend: edges skos:subject, skos:broader; nodes Category, Article
Node labels shown: Semantic_Web, XML-based_standards, Knowledge_representation, Data_management, Internet_architecture, Triplestores, Folksonomy, XML, Computer_and_telecommunication_standards, Web_services, User_interface_markup_languages, Scalable_Vector_Graphics, Microformats, Resource Description Framework, Microformat, RDFa, …
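
The legend names two properties, skos:subject and skos:broader. A hedged sketch of how one exploration step might be phrased as a SPARQL query follows. The helper `neighbor_query`, the choice to follow both properties in both directions, and the example resource URI are assumptions, and the query is only built as a string, not sent to any endpoint.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"


def neighbor_query(resource_uri):
    """Build a SPARQL query returning the resources linked to `resource_uri`
    through skos:subject or skos:broader, in either direction (assumption)."""
    return f"""PREFIX skos: <{SKOS}>
SELECT DISTINCT ?neighbor WHERE {{
  {{ <{resource_uri}> skos:subject ?neighbor }}
  UNION {{ ?neighbor skos:subject <{resource_uri}> }}
  UNION {{ <{resource_uri}> skos:broader ?neighbor }}
  UNION {{ ?neighbor skos:broader <{resource_uri}> }}
}}"""


# Example resource URI, chosen for illustration.
query = neighbor_query("http://dbpedia.org/resource/RDFa")
```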

The functional architecture
[Architecture diagram. Component labels: Back-end, Query engine, Storage, GUI, Interface, DBpedia Lookup Service, Ext. Info Sources (Delicious, Yahoo!, Bing, Google), Graph Explorer, SPARQL, Context Analyzer, Ranker, Tag Cloud Generator, Meta-search engine]
Offline computation
1. Linked Data graph exploration
2. Rank nodes exploiting external information
3. Store results as pairs of nodes together with their similarity
Runtime search
1. Start typing a query
2. Query the system for relevant tags (corresponding to DBpedia resources) and aggregate results
3. Show the semantic tag cloud and the results
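
The hand-off between the two phases (pairs of nodes stored offline, relevant tags looked up at runtime) could look roughly like this. It is a sketch under stated assumptions: the in-memory dictionary stands in for the Storage component, similarity is treated as symmetric, and every score in `stored_pairs` is invented for illustration.

```python
from collections import defaultdict


def build_index(pairs):
    """Index (node_a, node_b, similarity) triples so that each node maps to
    its related nodes. Similarity is assumed to be symmetric here."""
    index = defaultdict(dict)
    for node_a, node_b, score in pairs:
        index[node_a][node_b] = score
        index[node_b][node_a] = score
    return index


def related_tags(index, resource, top_k=3):
    """Return the `top_k` most similar stored resources, best first."""
    candidates = index.get(resource, {})
    ranked = sorted(candidates.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


# Output of a hypothetical offline run; the scores are made up.
stored_pairs = [
    ("RDFa", "Resource_Description_Framework", 0.9),
    ("RDFa", "Microformat", 0.7),
    ("RDFa", "XML", 0.4),
    ("XML", "Web_services", 0.5),
]

index = build_index(stored_pairs)
tags = related_tags(index, "RDFa", top_k=2)
```

Because all ranking work happens offline, the runtime step reduces to a lookup and a sort, which is what makes an as-you-type tag cloud plausible.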

DBpediaRanker: ranking
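[Diagram: ?r1 isSimilar ?r2, hasValue v]
Ranking based on external sources
$$\mathrm{sim}(r_1, r_2)_{\mathit{info\_source}} = \frac{f(r_1, r_2)}{f(r_1)} + \frac{f(r_1, r_2)}{f(r_2)}$$
with every count f taken from the same info_source
Graph-based and text-based ranking
$$\mathrm{wikilinkScore}(r_1, r_2) = \begin{cases} 0, & \text{no wikilink between } r_1 \text{ and } r_2 \\ 1, & \text{wikilink between } r_1 \text{ and } r_2 \text{ or vice versa} \\ 2, & \text{wikilink between } r_1 \text{ and } r_2 \text{ and vice versa} \end{cases}$$
$$\mathrm{abstractScore}(r_1, r_2) = \frac{l(r_2, r_1)}{l(r_2)}$$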

DBpediaRanker: an example (i)
wikilinkScore(RDFa, Resource_Description_Framework) = 2
abstractScore(RDFa, Resource_Description_Framework) = 1.0
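
wikilinkScore takes the value 0, 1 or 2 depending on whether the two resources link to each other in neither, one or both directions. A minimal sketch, assuming wikilinks are available as a set of (source, target) pairs: the function name and the toy link set are illustrative, and only the RDFa / Resource_Description_Framework pair is chosen to agree with the score of 2 reported in this example.

```python
def wikilink_score(links, r1, r2):
    """Count the directions (0, 1 or 2) in which r1 and r2 link to each other.

    `links` is a set of (source, target) wikilink pairs.
    """
    return int((r1, r2) in links) + int((r2, r1) in links)


# Toy link set: the first two pairs mirror the example (links both ways),
# the third is a hypothetical one-way link.
toy_links = {
    ("RDFa", "Resource_Description_Framework"),
    ("Resource_Description_Framework", "RDFa"),
    ("RDFa", "Microformat"),
}

score_both = wikilink_score(toy_links, "RDFa", "Resource_Description_Framework")
score_one = wikilink_score(toy_links, "Microformat", "RDFa")
score_none = wikilink_score(toy_links, "Microformat", "XML")
```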

DBpediaRanker: an example (ii)
sim(RDFa, Resource_Description_Framework)_Google = 1.67e5 / 4.42e5 + 1.67e5 / 1.19e7 ≈ 0.39
delicious
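
The Google figure above can be checked with a few lines of code. This is a sketch: the three inputs are read here as counts reported by the information source (an interpretation of the example, where 1.67e5 is the joint count), and returning 0.0 when a count is zero is an added assumption to avoid division by zero.

```python
def sim(f_r1, f_r2, f_r1_r2):
    """Similarity from one external source: the joint count divided by each
    individual count, summed. Ranges from 0 to 2."""
    if f_r1 == 0 or f_r2 == 0:
        return 0.0  # assumption: a resource unknown to the source scores 0
    return f_r1_r2 / f_r1 + f_r1_r2 / f_r2


# Counts taken from the Google example on this slide.
google_sim = sim(f_r1=4.42e5, f_r2=1.19e7, f_r1_r2=1.67e5)
rounded = round(google_sim, 2)
```

The first term dominates because RDFa is the rarer of the two resources: most of its occurrences also mention Resource Description Framework, while the reverse is not true.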

DBpediaRanker: context analysis
The same similarity measure is used in the context analysis
[Diagram: ?r1 belongsTo C = {?c1, ?c2, ?c…, ?cN}, hasValue v]
Example:
C = {Programming Languages, Databases, Software}
Does Dennis Ritchie belong to the given context?
Algorithm:
If (v > THRESHOLD) then
    r1 belongs to the context;
    add r1 to the graph exploration queue
Else
    r1 does not belong to the context;
    exclude r1 from graph exploration
EndIf
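
A runnable sketch of the threshold test follows. The slides do not say how v is aggregated over the context resources, so the mean used here is an assumption, as are the THRESHOLD value, the function names and every similarity score in `toy_scores`.

```python
from statistics import mean

THRESHOLD = 0.5  # hypothetical cut-off; the slides do not give a value


def context_value(resource, context, similarity):
    """v: mean similarity between the resource and each context resource
    (the aggregation by mean is an assumption)."""
    return mean(similarity(resource, c) for c in context)


def filter_for_exploration(candidates, context, similarity, threshold=THRESHOLD):
    """Keep only the candidates that belong to the context; these are the
    ones added to the graph exploration queue."""
    queue = []
    for resource in candidates:
        if context_value(resource, context, similarity) > threshold:
            queue.append(resource)
    return queue


context = ["Programming Languages", "Databases", "Software"]

# Invented similarity scores, for illustration only.
toy_scores = {
    ("Dennis_Ritchie", "Programming Languages"): 0.9,
    ("Dennis_Ritchie", "Databases"): 0.2,
    ("Dennis_Ritchie", "Software"): 0.7,
    ("Toronto", "Programming Languages"): 0.1,
    ("Toronto", "Software"): 0.2,
}


def toy_similarity(resource, context_resource):
    return toy_scores.get((resource, context_resource), 0.0)


queue = filter_for_exploration(["Dennis_Ritchie", "Toronto"], context, toy_similarity)
```

With these made-up scores Dennis_Ritchie averages about 0.6 and is kept, while Toronto averages about 0.1 and is pruned, so exploration does not drift out of the chosen domain.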

Evaluation (i)
http://sisinflab.poliba.it/evaluation
• Comparison of 5 different algorithms
• 50 volunteers
• Researchers in the ICT area
• 244 votes collected (on average 5 votes per user)
• Average time to vote: 1 min 40 s

Evaluation (ii)
http://sisinflab.poliba.it/evaluation/data
3.91 - Good

Conclusion
• LED: a system for exploratory search and query refinement on the (Semantic) Web
• DBpediaRanker: ranking algorithms for resources in DBpedia
Future work
• Expose a RESTful API for building novel mashups and for comparison with other systems
• Improve ranking algorithms
• Deal with cases where a single knowledge base is not sufficient
• Combine content-based recommendation with a collaborative-filtering approach

FROM EXPLORATORY SEARCH TO WEB SEARCH AND BACK (PIKM 2010)
If you're interested in learning more…
1. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tags generation and retrieval for online advertising. 19th ACM International Conference on Information and Knowledge Management (CIKM 2010)
2. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Ranking the Linked Data: the case of DBpedia. 10th International Conference on Web Engineering (ICWE 2010)
3. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tag cloud generation via DBpedia. 11th International Conference on Electronic Commerce and Web Technologies (EC-Web 2010)
4. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic tagging for crowd computing. 18th Italian Symposium on Advanced Database Systems (SEBD 2010)
5. Roberto Mirizzi, Azzurra Ragone, Tommaso Di Noia, Eugenio Di Sciascio. Semantic Wonder Cloud: exploratory search in DBpedia. 2nd International Workshop on Semantic Web Information Management (SWIM 2010) - Best Workshop Paper at International Conference on Web Engineering (ICWE 2010)
Roberto Mirizzi - mirizzi@deemail.poliba.it
Thanks for your attention!
