Actively Learning to Rank Semantic
Associations for Personalized
Contextual Exploration of
Knowledge Graphs
Federico Bianchi, Matteo Palmonari, Marco Cremaschi and Elisabetta Fersini
federico.bianchi@disco.unimib.it
ITIS Lab – Innovative Technologies for Interaction and Services
Dipartimento di Informatica, Sistemistica e Comunicazione
Università degli Studi di Milano-Bicocca
1-6-2017, Portorož, Slovenia
Outline
• Contextual Exploration of Knowledge Graphs
• Actively Learning to Rank Semantic Associations
• Experiments
• Conclusions and Future Work
Knowledge Graphs (KGs)
• Models used for knowledge
representation using graphs
• DBpedia, YAGO, Google KG, …
• Nodes represent real-world
entities
• Labelled edges represent
relations between them.
[Figure: example KG with the entities Bernie Sanders, Hillary Clinton and Democratic Party]
KGs may contain interesting relations for users
Relational Knowledge in KGs
and Semantic Associations
• KGs provide vast amount of
relational knowledge
• Semantic Associations (SAs)
• chains of relations between entities
• arbitrary length
• inverse properties included
[Figure: Bernie Sanders and Hillary Clinton, each linked to Democratic Party by a party edge]
Example SA: Bernie_Sanders party > Democratic_Party < party Hillary_Clinton
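A minimal sketch of how an SA could be held in memory as a chain of hops, where each hop records whether the KG edge is traversed against its direction (the "inverse properties included" case). The `Hop` class and `render` function are hypothetical names introduced for illustration; they are not taken from DaCENA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    """One step of a semantic association (hypothetical structure)."""
    predicate: str
    target: str
    inverse: bool = False  # True when the KG edge points from target back to the previous node


def render(source, hops):
    """Render an SA in the arrow notation used on the slide."""
    parts = [source]
    for hop in hops:
        parts.append(f"< {hop.predicate}" if hop.inverse else f"{hop.predicate} >")
        parts.append(hop.target)
    return " ".join(parts)


sa = [Hop("party", "Democratic_Party"), Hop("party", "Hillary_Clinton", inverse=True)]
print(render("Bernie_Sanders", sa))
```

The list of hops can have any length, which matches the "arbitrary length" property of SAs.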
Empowering Comprehension of Web Content
Who is Bernie Sanders? What is
his relation with Hillary Clinton?
Contextual Exploration of KGs
Support users who are doing a familiar task, e.g., reading a news
article, in accessing content extracted from a KG, which is selected and
pushed to them in a proactive fashion.
Who is Bernie Sanders?
What is his relation with Hillary Clinton?
Contextual Exploration of DBpedia with DaCENA
www.dacena.org
(Palmonari&al, 2015)
Entity Extraction
Bernie Sanders has urged his supporters to
look beyond the Democratic presidential
nomination in a speech that stopped short of
fully endorsing Hillary Clinton but made
clear he was no longer actively challenging
her candidacy. In an anticlimactic speech that
signalled the effective end of a 14-month
campaign odyssey, the Vermont senator
insisted his “political revolution continues”
despite Clinton’s effective victory in the
delegate race.
Entities are extracted from text
Bernie Sanders, Hillary Clinton,
Democratic Party, Vermont…
Retrieval of Semantic Associations
SAs are retrieved with a SPARQL query to the DBpedia endpoint.
SAs are retrieved between all the extracted entities
• maximum number of hops = 2
Example: SAs between Bernie Sanders and Hillary Clinton
[Figure: the SPARQL query returns Bernie Sanders and Hillary Clinton, each linked to Democratic Party by a party edge]
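The slides do not show the query itself, so the following is only a sketch of how SPARQL queries for paths of at most 2 hops between two DBpedia entities could be generated. The function names, the one-query-per-direction layout and the `LIMIT` are assumptions made for illustration. Each hop is tried in both directions so that inverse properties are covered.

```python
from itertools import product

DBR = "http://dbpedia.org/resource/"


def hop_pattern(subj, pred, obj, forward):
    """One triple pattern, optionally traversed against the edge direction."""
    return f"{subj} {pred} {obj} ." if forward else f"{obj} {pred} {subj} ."


def sa_queries(source, target, limit=1000):
    """Build SPARQL queries for all 1-hop and 2-hop paths between two entities."""
    s, t = f"<{DBR}{source}>", f"<{DBR}{target}>"
    queries = []
    # Paths of length 1, in both directions.
    for d1 in (True, False):
        body = hop_pattern(s, "?p1", t, d1)
        queries.append(f"SELECT DISTINCT ?p1 WHERE {{ {body} }} LIMIT {limit}")
    # Paths of length 2 through an intermediate node, all direction combinations.
    for d1, d2 in product((True, False), repeat=2):
        body = hop_pattern(s, "?p1", "?mid", d1) + " " + hop_pattern("?mid", "?p2", t, d2)
        queries.append(
            f"SELECT DISTINCT ?p1 ?mid ?p2 WHERE {{ {body} FILTER(isIRI(?mid)) }} LIMIT {limit}"
        )
    return queries


for query in sa_queries("Bernie_Sanders", "Hillary_Clinton"):
    print(query)
```

The pattern with the first hop forward and the second hop inverse corresponds to the example above, where both politicians point to Democratic Party through a party edge.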
Information Overload in Contextual KG
Exploration
Too many associations are found even in small pieces of text
• E.g., 40107 associations from an article with 942 words
• They do not fit on a single screen
• Users can explore only a limited number of associations (≤ 100)
Crucial issue for KG exploration:
Which associations are the most interesting to show to users?
Ranking SAs by Estimated Interest: Serendipity
Heuristic measure: try to find those associations that are relevant and may
be unexpected to users
Serendipity = relevance + unexpectedness
Serendipity(SA,Text) = α*relevance(SA,Text) + (1- α)*rarity(SA)
relevance(SA, Text) = cos(abstracts(SA), Text)
- with TF-IDF weighting
rarity(SA): see (Aleman-Meza&al, 2005)
The user can tune α, the weight assigned to relevance vs unexpectedness
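The Serendipity score can be sketched as follows. This is an illustrative implementation under stated assumptions: the TF-IDF variant, the whitespace tokenizer and the function names are choices made here, and rarity is taken as a precomputed value in [0, 1] because the slides do not define how it is computed.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (term -> weight); smoothed IDF is an assumption."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    return [
        {term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
         for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def serendipity(relevance, rarity, alpha):
    """Serendipity = alpha * relevance + (1 - alpha) * rarity."""
    return alpha * relevance + (1.0 - alpha) * rarity


def rank_by_serendipity(text, sa_abstracts, sa_rarity, alpha=0.5):
    """Rank SA names by Serendipity; sa_rarity holds hypothetical precomputed values."""
    names = list(sa_abstracts)
    vectors = tfidf_vectors([text] + [sa_abstracts[name] for name in names])
    text_vec, sa_vecs = vectors[0], vectors[1:]
    scores = {name: serendipity(cosine(vec, text_vec), sa_rarity[name], alpha)
              for name, vec in zip(names, sa_vecs)}
    return sorted(names, key=scores.get, reverse=True)
```

With α = 1 the ranking depends only on relevance to the text, and with α = 0 only on rarity, which is the trade-off the user tunes.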
Example of SAs ranked by Serendipity
Personalized Exploration of KGs
What if different users are interested in different SAs?
1. Learn a ranking function starting from explicit ratings given by the users
2. Ask users for as few ratings as possible (rating too many SAs can become a tedious task)
3. Speed up learning by sampling the SAs that are estimated to be most informative for training the model
Definition of an active learning to rank model for personalized
contextual exploration of KGs
Active Learning to Rank for SAs
[Flowchart: Bootstrapping, Ranking, Refine Ranking?, Active Sampling SAs, User Rates SAs, Final Ranking]
1. Bootstrapping
• Ranking models need to be initialized with ranked SAs (cold-start problem)
2. Ranking
• RankSVM algorithm, derived from SVM (Support Vector Machine)
• Well known and widely used in the literature
3. Refine Ranking?
• no: Final Ranking
• yes: Active Sampling of SAs, then the User Rates the sampled SAs and the Ranking step is repeated
Two algorithms are used to find meaningful SAs in the Active Sampling step:
• Pairwise Sampling (PS) (Qian&al, 2013)
• AUC-Based Sampling (AS) (Donmez&al, 2009)
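The loop above can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the ranker is a linear model trained by subgradient descent on the pairwise hinge loss (the RankSVM objective in reduced form), and the sampler is a stand-in that picks unrated SAs whose scores are nearly tied with another SA. It is neither Pairwise Sampling nor AUC-Based Sampling. All names and hyperparameters are assumptions.

```python
import random


def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_ranker(features, ratings, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Linear ranker fitted on preference pairs (a rated above b) with hinge loss."""
    rng = random.Random(seed)
    rated = list(ratings)
    pairs = [(a, b) for a in rated for b in rated if ratings[a] > ratings[b]]
    dim = len(next(iter(features.values())))
    w = [0.0] * dim
    for _ in range(epochs):
        rng.shuffle(pairs)
        for a, b in pairs:
            diff = [x - y for x, y in zip(features[a], features[b])]
            violated = score(w, diff) < 1.0
            for i in range(dim):
                grad = reg * w[i] - (diff[i] if violated else 0.0)
                w[i] -= lr * grad
    return w


def sample_uncertain(w, features, rated, k):
    """Stand-in sampler: unrated SAs with the smallest score gap to any other SA."""
    scores = {sa: score(w, x) for sa, x in features.items()}

    def gap(sa):
        return min((abs(scores[sa] - s) for other, s in scores.items() if other != sa),
                   default=0.0)

    unrated = [sa for sa in features if sa not in rated]
    return sorted(unrated, key=gap)[:k]


def active_learning_to_rank(features, ask_user, bootstrap_ids, iterations=3, k=2):
    """Bootstrap, rank, then alternate sampling, rating and re-ranking."""
    ratings = {sa: ask_user(sa) for sa in bootstrap_ids}
    w = train_pairwise_ranker(features, ratings)
    for _ in range(iterations):
        batch = sample_uncertain(w, features, ratings, k)
        if not batch:
            break  # every SA has been rated
        ratings.update({sa: ask_user(sa) for sa in batch})
        w = train_pairwise_ranker(features, ratings)
    return sorted(features, key=lambda sa: score(w, features[sa]), reverse=True)
```

Here `ask_user` is a callback returning the rating for one SA, and `bootstrap_ids` holds the SAs rated in the bootstrapping step.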
Clustering as Bootstrapping
• Use clustering algorithms on the
set of SAs
• For each cluster select the SA
that is closest to the cluster
average
• User rates all the SAs that
represent the clusters
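Selecting one representative per cluster can be sketched as below. The cluster assignment is taken as given, since the clustering algorithm itself is outside this sketch; the function name and the use of Euclidean distance are assumptions made for illustration.

```python
import math
from collections import defaultdict


def cluster_representatives(features, labels):
    """For each cluster, return the SA whose feature vector is closest to the cluster mean."""
    clusters = defaultdict(list)
    for sa, label in labels.items():
        clusters[label].append(sa)
    representatives = {}
    for label, members in clusters.items():
        dim = len(features[members[0]])
        mean = [sum(features[sa][i] for sa in members) / len(members) for i in range(dim)]
        representatives[label] = min(members, key=lambda sa: math.dist(features[sa], mean))
    return representatives


# Hypothetical two-dimensional feature vectors and cluster labels.
features = {
    "sa1": [0.0, 0.0], "sa2": [1.0, 0.0], "sa3": [0.4, 0.0],
    "sa4": [10.0, 10.0], "sa5": [12.0, 10.0], "sa6": [11.0, 10.5],
}
labels = {"sa1": 0, "sa2": 0, "sa3": 0, "sa4": 1, "sa5": 1, "sa6": 1}
print(cluster_representatives(features, labels))
```

The user would then rate only the returned SAs, one per cluster, to initialize the ranking model.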
Serendipity as Bootstrapping
• The user rates the top-k SAs ranked by Serendipity
• Users see an ordered set of SAs from the beginning
• Users rate SAs that are estimated to be interesting for them
Serendipity vs Clustering
Clustering:
• PROS: selected SAs are representative of the vector space
• CONS: rated SAs might not be interesting for the user
Serendipity:
• PROS: rated SAs are estimated to be interesting for a generic user
• CONS: heuristic function, no representativeness
Example: Rating of Most Serendipitous SAs (#0)
[Figure: three SAs shown with the rating given by the user and the ideal rating for the user]
• Hillary Clinton, New York, Donald Trump (properties: region, birthPlace)
• Hillary Clinton, Bill Clinton, Democratic Party (properties: spouse, party)
• Donald Trump, Independent Politician, United States Senate (properties: party, political party)
Ratings shown: 3, 5, 1
Example: Ranking Learned with RankSVM (#0)
[Figure: three top-ranked SAs shown with the rating given by the user and the ideal rating for the user]
• Bernie Sanders, New York, Donald Trump (properties: birthPlace, birthPlace)
• Hillary Clinton, Republican Party, Donald Trump (properties: other party, party)
• Donald Trump, Republican Party, United States Senate (properties: political party, party)
Ratings shown: 3, 6, 5
Example: Rating on Sampled SAs (#1)
[Figure: two sampled SAs shown with the rating given by the user and the ideal rating for the user]
• Hillary Clinton, Democratic Party, United States Senate
• Democratic Party, Joe Biden, United States Senate
Properties on the edges: political party, leader, party, party
Ratings shown: 5, 1
Example: Ranking Learned with RankSVM (#1)
[Figure: three top-ranked SAs shown with the rating given by the user and the ideal rating for the user]
• Hillary Clinton, Republican Party, Donald Trump
• Donald Trump, Democratic Party, United States Senate
• Donald Trump, Republican Party, United States Senate
Properties on the edges: other party, party, political party, political party, party, party
Ratings shown: 6, 6, 5
Annotation: This SA was second in the previous ranking
Features for RankSVM
SAs are represented in a vector space using features divided into three main categories:
• Topological Features:
PageRank on SAs (Page&al, 1999)
DBpedia PageRank (Thalhammer&al, 2016)
HITS (Kleinberg&al, 1999)
• Relevance Features:
Relevance (Palmonari&al, 2015),
Temporal Relevance (Bianchi&al, 2017)
• Predicate-Based Features:
Path Informativeness (Pirrò, 2015)
Path Pattern Informativeness (Pirrò, 2015)
Rarity (Aleman-Meza&al, 2005)
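One way to picture the feature representation is a record with one field per feature listed above, flattened into a vector for the ranker. This is a hypothetical layout: the field names, the single HITS value per SA and the ordering are assumptions, and the slides do not say how node-level scores are aggregated over an SA.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class SAFeatures:
    """Feature record for one SA (hypothetical field names and ordering)."""
    # Topological features
    pagerank_on_sas: float
    dbpedia_pagerank: float
    hits: float
    # Relevance features
    relevance: float
    temporal_relevance: float
    # Predicate-based features
    path_informativeness: float
    path_pattern_informativeness: float
    rarity: float

    def to_vector(self):
        """Flatten the record into the vector consumed by a ranking model."""
        return list(astuple(self))


# Placeholder values, not measured data.
example = SAFeatures(0.2, 0.5, 0.1, 0.7, 0.3, 0.4, 0.6, 0.9)
print(example.to_vector())
```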
Experiments: Objectives
Validate the personalization hypothesis
• Are different users interested in different SAs?
Evaluate the performance
• (Quick) improvement of ranking quality as more iterations of user feedback are collected
• Comparison of different configurations and baseline algorithms
Experiments: Settings
Gold standards: Ideal rankings collected by asking users to evaluate all
SAs extracted from articles or pieces of articles
Evaluation Settings:
Contextual Exploration: ratings on the whole dataset
Cross Validation: ratings on training data, ranking on test data
Measure: quality of generated rankings vs ideal rankings (nDCG)
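For reference, nDCG@k can be computed as in the sketch below. The exponential gain (2^rating - 1) and the log2 discount are a common formulation and an assumption here, since the slides do not state which variant was used.

```python
import math


def dcg(ratings, k):
    """Discounted cumulative gain of the first k ratings, in ranked order."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(ratings[:k]))


def ndcg(ranked_ratings, k=10):
    """nDCG@k: DCG of the generated ranking divided by the DCG of the ideal ranking."""
    ideal = dcg(sorted(ranked_ratings, reverse=True), k)
    return dcg(ranked_ratings, k) / ideal if ideal > 0 else 0.0


# Ideal ratings of the SAs, listed in the order produced by a ranking model.
print(ndcg([6, 5, 3]))
print(ndcg([3, 5, 6]))
```

The input is the list of the user's ideal ratings in the order produced by the model, so a ranking that already places the highest-rated SAs first scores 1.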
Experiments: Data
Two different datasets:
LAFU (Large Articles, Few Users)
• Complete articles (New York Times)
• 3 articles, 2 users => 3 ideal rankings
• Average number of SAs per article => 2600
• Rating from 1 to 3 (1 low interest, 3 high interest)
SAMU (Small Articles, Many Users)
• Small pieces of text extracted from articles (New York Times, The Guardian)
• 5 articles, 14 users => 25 ideal rankings
• Average number of SAs per article => 74
• Rating from 1 to 6 (1 low interest, 6 high interest; the scale is symmetric)
Experiments: Alternative Configurations and Baselines
Algorithm      | Bootstrapping        | Active Sampling    | Learning
Serendipity AS | Serendipity          | AUC-Based Sampling | RankSVM
Serendipity PS | Serendipity          | Pairwise Sampling  | RankSVM
Dirichlet AS   | Dirichlet Clustering | AUC-Based Sampling | RankSVM
Dirichlet PS   | Dirichlet Clustering | Pairwise Sampling  | RankSVM
Gaussian AS    | Gaussian Clustering  | AUC-Based Sampling | RankSVM
Gaussian PS    | Gaussian Clustering  | Pairwise Sampling  | RankSVM
Random Random  | Random               | Random             | RankSVM
Random         | No Bootstrapping     | No Active Learning | No Learning to Rank
Serendipity    | No Bootstrapping     | No Active Learning | No Learning to Rank
Results: Personalization Hypothesis
Inter-rater reliability measures assess the level of agreement between users
with respect to the same items (SAs).
These measures are usually defined in the range [0, 1]:
• 0 => complete disagreement between users
• 1 => users give unanimous ratings
Krippendorff's alpha 0.061
Kendall's W 0.26
Values are far from 1 => hypothesis validated
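As an illustration of one of these measures, Kendall's W can be computed from a users-by-items rating matrix as sketched below. This version converts ratings to average ranks and omits the tie correction term, which is a simplification; the function names are chosen here for illustration.

```python
def average_ranks(ratings):
    """Ranks starting at 1, with tied ratings sharing the average of their positions."""
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    ranks = [0.0] * len(ratings)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and ratings[order[j + 1]] == ratings[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = avg
        i = j + 1
    return ranks


def kendalls_w(ratings_by_user):
    """Kendall's W = 12 S / (m^2 (n^3 - n)) for m users and n items, no tie correction."""
    m = len(ratings_by_user)
    n = len(ratings_by_user[0])
    rank_rows = [average_ranks(row) for row in ratings_by_user]
    rank_sums = [sum(row[i] for row in rank_rows) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((total - mean_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical ratings: three users who agree on the order of three SAs.
print(kendalls_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]]))
```

Identical orderings give W = 1, while two users with exactly reversed orderings give W = 0.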
Results: Performance (nDCG@10)
[Plots: nDCG@10 of the compared configurations and baselines]
Annotations on the plots:
• No PS active sampling, no real time usage possible
• Serendipity AS (AUC-Based Sampling)
• Random Random Baseline
• Random Baseline
• Serendipity Baseline
Conclusions and Future Work
Conclusions:
1. Quick optimization of personalized ranking function with Active Learning to Rank
2. Active Learning to Rank can be initialized with Serendipity (+performance, +interaction flow)
Future Work:
1. Exploring new algorithms for the active learning to rank
2. Need to better understand how to design user interaction for the ALR model
Thank You
Questions?
Contacts: federico.bianchi@disco.unimib.it
www.dacena.org
References
Joachims, T. (2002, July). Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 133-142). ACM.
Giuseppe Pirrò. Explaining and suggesting relatedness in knowledge graphs. In ISWC, pages 622–639. Springer, 2015.
Buyue Qian, Hongfei Li, Jun Wang, Xiang Wang, and Ian Davidson. Active learning to rank using pairwise supervision. In SIAM Int. Conf. Data Mining, pages 297–305. SIAM, 2013.
Pinar Donmez and Jaime G Carbonell. Active sampling for rank learning via optimizing the area under the ROC curve. In ECIR, pages 78–89. Springer, 2009.
Federico Bianchi, Matteo Palmonari, Marco Cremaschi, and Elisabetta Fersini. Actively learning to rank semantic associations for personalized contextual exploration of knowledge graphs. In ESWC, 2017.
Matteo Palmonari, Giorgio Uboldi, Marco Cremaschi, Daniele Ciminieri, and Federico Bianchi. Dacena: Serendipitous news reading with data contexts. In ESWC, pages 133–137. Springer, 2015.
Page, Lawrence, et al. The PageRank citation ranking: Bringing order to the web. Stanford InfoLab, 1999.
Thalhammer, A., & Rettinger, A. (2016, May). PageRank on Wikipedia: towards general importance scores for entities. In International Semantic Web Conference (pp. 227-240). Springer International Publishing.
Kleinberg, J. M., Kumar, R., Raghavan, P., Rajagopalan, S., & Tomkins, A. S. (1999, July). The web as a graph: measurements, models, and methods. In International Computing and Combinatorics Conference (pp. 1-17). Springer Berlin Heidelberg.
Aleman-Meza, B., Halaschek-Weiner, C., Arpinar, I. B., Ramakrishnan, C., & Sheth, A. P. (2005). Ranking complex relationships on the semantic web. IEEE Internet computing, 9(3), 37-44.

More Related Content

What's hot

Science of the Interwebs
Science of the InterwebsScience of the Interwebs
Science of the Interwebsnitchmarketing
 
Haystack 2019 - Search-based recommendations at Politico - Ryan Kohl
Haystack 2019 - Search-based recommendations at Politico - Ryan KohlHaystack 2019 - Search-based recommendations at Politico - Ryan Kohl
Haystack 2019 - Search-based recommendations at Politico - Ryan KohlOpenSource Connections
 
Boolean Searching
Boolean SearchingBoolean Searching
Boolean SearchingTBogan
 
Boolean Searching
Boolean SearchingBoolean Searching
Boolean SearchingTBogan
 
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...OpenSource Connections
 
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...Cataldo Musto
 
Cs583 link-analysis
Cs583 link-analysisCs583 link-analysis
Cs583 link-analysisBorseshweta
 
PageRank Algorithm In data mining
PageRank Algorithm In data miningPageRank Algorithm In data mining
PageRank Algorithm In data miningMai Mustafa
 
Text Analytics Market Insights: What's Working and What's Next
Text Analytics Market Insights: What's Working and What's NextText Analytics Market Insights: What's Working and What's Next
Text Analytics Market Insights: What's Working and What's NextSeth Grimes
 
Better Information with Curation Markets
Better Information with Curation MarketsBetter Information with Curation Markets
Better Information with Curation Marketsslgraham
 

What's hot (12)

Science of the Interwebs
Science of the InterwebsScience of the Interwebs
Science of the Interwebs
 
Haystack 2019 - Search-based recommendations at Politico - Ryan Kohl
Haystack 2019 - Search-based recommendations at Politico - Ryan KohlHaystack 2019 - Search-based recommendations at Politico - Ryan Kohl
Haystack 2019 - Search-based recommendations at Politico - Ryan Kohl
 
Pr
PrPr
Pr
 
SWMRA EF- 2011
SWMRA EF- 2011SWMRA EF- 2011
SWMRA EF- 2011
 
Boolean Searching
Boolean SearchingBoolean Searching
Boolean Searching
 
Boolean Searching
Boolean SearchingBoolean Searching
Boolean Searching
 
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
Haystack 2019 - Search Logs + Machine Learning = Auto-Tagging Inventory - Joh...
 
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
Semantics-aware Techniques for Social Media Analysis, User Modeling and Recom...
 
Cs583 link-analysis
Cs583 link-analysisCs583 link-analysis
Cs583 link-analysis
 
PageRank Algorithm In data mining
PageRank Algorithm In data miningPageRank Algorithm In data mining
PageRank Algorithm In data mining
 
Text Analytics Market Insights: What's Working and What's Next
Text Analytics Market Insights: What's Working and What's NextText Analytics Market Insights: What's Working and What's Next
Text Analytics Market Insights: What's Working and What's Next
 
Better Information with Curation Markets
Better Information with Curation MarketsBetter Information with Curation Markets
Better Information with Curation Markets
 

Similar to Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs

DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...
DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...
DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...Università degli Studi di Milano-Bicocca
 
Master Minds on Data Science - Maarten de Rijke
Master Minds on Data Science - Maarten de RijkeMaster Minds on Data Science - Maarten de Rijke
Master Minds on Data Science - Maarten de RijkeMedia Perspectives
 
Semantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationSemantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationGong Cheng
 
Optimizing Search User Interfaces and Interactions within Professional Social...
Optimizing Search User Interfaces and Interactions within Professional Social...Optimizing Search User Interfaces and Interactions within Professional Social...
Optimizing Search User Interfaces and Interactions within Professional Social...Nik Spirin
 
Bootstrapping Recommendations with Neo4j
Bootstrapping Recommendations with Neo4jBootstrapping Recommendations with Neo4j
Bootstrapping Recommendations with Neo4jMax De Marzi
 
Mining and analyzing social media part 1 - hicss47 tutorial - dave king
Mining and analyzing social media   part 1 - hicss47 tutorial - dave kingMining and analyzing social media   part 1 - hicss47 tutorial - dave king
Mining and analyzing social media part 1 - hicss47 tutorial - dave kingDave King
 
Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Max De Marzi
 
JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...
JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...
JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...GUANGYUAN PIAO
 
SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...
SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...
SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...Amsive
 
Linked Data Entity Summarization (PhD defense)
Linked Data Entity Summarization (PhD defense)Linked Data Entity Summarization (PhD defense)
Linked Data Entity Summarization (PhD defense)Andreas Thalhammer
 
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...
Talking to your Data: Natural Language Interfaces for a schema-less world (Ke...Andre Freitas
 
Natural Language Search with Knowledge Graphs (Chicago Meetup)
Natural Language Search with Knowledge Graphs (Chicago Meetup)Natural Language Search with Knowledge Graphs (Chicago Meetup)
Natural Language Search with Knowledge Graphs (Chicago Meetup)Trey Grainger
 
Introduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataIntroduction to question answering for linked data & big data
Introduction to question answering for linked data & big dataAndre Freitas
 
Knoesis-Semantic filtering-Tutorials
Knoesis-Semantic filtering-TutorialsKnoesis-Semantic filtering-Tutorials
Knoesis-Semantic filtering-TutorialsPavan Kapanipathi
 
Path 101 Opportunity
Path 101 OpportunityPath 101 Opportunity
Path 101 Opportunitypath101
 
Information Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesInformation Retrieval Models for Recommender Systems - PhD slides
Information Retrieval Models for Recommender Systems - PhD slidesDaniel Valcarce
 
RSWeb @ ACM RecSys 2014 - Exploring social network effects on popularity bias...
RSWeb @ ACM RecSys 2014 - Exploring social network effects on popularity bias...RSWeb @ ACM RecSys 2014 - Exploring social network effects on popularity bias...
RSWeb @ ACM RecSys 2014 - Exploring social network effects on popularity bias...Pablo Castells
 
Recommendation systems
Recommendation systems  Recommendation systems
Recommendation systems Badr Hirchoua
 
Everything You Always Wanted to Know About Synthetic Data
Everything You Always Wanted to Know About Synthetic DataEverything You Always Wanted to Know About Synthetic Data
Everything You Always Wanted to Know About Synthetic DataMOSTLY AI
 

Similar to Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs (20)

DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...
DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...
DaCENA Personalized Exploration of Knowledge Graphs Within a Context. Seminar...
 
Master Minds on Data Science - Maarten de Rijke
Master Minds on Data Science - Maarten de RijkeMaster Minds on Data Science - Maarten de Rijke
Master Minds on Data Science - Maarten de Rijke
 
Semantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationSemantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and Summarization
 
Optimizing Search User Interfaces and Interactions within Professional Social...
Optimizing Search User Interfaces and Interactions within Professional Social...Optimizing Search User Interfaces and Interactions within Professional Social...
Optimizing Search User Interfaces and Interactions within Professional Social...
 
Bootstrapping Recommendations with Neo4j
Bootstrapping Recommendations with Neo4jBootstrapping Recommendations with Neo4j
Bootstrapping Recommendations with Neo4j
 
Mining and analyzing social media part 1 - hicss47 tutorial - dave king
Mining and analyzing social media   part 1 - hicss47 tutorial - dave kingMining and analyzing social media   part 1 - hicss47 tutorial - dave king
Mining and analyzing social media part 1 - hicss47 tutorial - dave king
 
Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015Bootstrapping Recommendations OSCON 2015
Bootstrapping Recommendations OSCON 2015
 
JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...
JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...
JIST2015-Computing the Semantic Similarity of Resources in DBpedia for Recomm...
 
SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...
SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...SEOktoberfest 2022 - Blending SEO, Discover, & Entity Extraction to Analyze D...
Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs

  • 1. Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs Federico Bianchi, Matteo Palmonari, Marco Cremaschi and Elisabetta Fersini federico.bianchi@disco.unimib.it ITIS Lab – Innovative Technologies for Interaction and Services Dipartimento di Informatica, Sistemistica e Comunicazione Università degli Studi di Milano-Bicocca 1-6-2017, Portorož, Slovenia
  • 2. Outline 2 • Contextual Exploration of Knowledge Graphs • Actively Learning to Rank Semantic Associations • Experiments • Conclusions and Future Work
  • 4. Knowledge Graphs (KGs) • Models used for knowledge representation using graphs • DBpedia, YAGO, Google KG, … • Nodes represent real-world entities • Labelled edges represent relations between them. 4 Bernie Sanders Hillary Clinton Democratic Party KGs may contain interesting relations for users
  • 5. Relational Knowledge in KGs and Semantic Associations • KGs provide a vast amount of relational knowledge • Semantic Associations (SAs) • chains of relations between entities • arbitrary length • inverse properties included 5 Bernie Sanders Democratic Party Hillary Clinton party party Bernie_Sanders party > Democratic_Party < party Hillary_Clinton
  • 6. Empowering Comprehension of Web Content 6 Who is Bernie Sanders? What is his relation with Hillary Clinton?
  • 8. 7 Contextual Exploration of KGs Support a user who is doing a familiar task, e.g., reading a news article, to access content extracted from a KG, selected and pushed to him in a proactive fashion. Who is Bernie Sanders? What is his relation with Hillary Clinton?
  • 10. 8 Contextual Exploration of DBpedia with DaCENA www.dacena.org (Palmonari&al, 2015)
  • 12. Entity Extraction 9 Entities are extracted from text. Text: Bernie Sanders has urged his supporters to look beyond the Democratic presidential nomination in a speech that stopped short of fully endorsing Hillary Clinton but made clear he was no longer actively challenging her candidacy. In an anticlimactic speech that signalled the effective end of a 14-month campaign odyssey, the Vermont senator insisted his “political revolution continues” despite Clinton’s effective victory in the delegate race. Extracted entities: Bernie Sanders, Hillary Clinton, Democratic Party, Vermont… Pipeline steps: Entity Extraction, SAs Retrieval
  • 19. Retrieval of Semantic Associations 10 SPARQL query to the DBpedia endpoint. SAs between all the entities • maximum number of hops = 2 • Example, between Bernie Sanders and Hillary Clinton: Bernie_Sanders party > Democratic_Party < party Hillary_Clinton Pipeline steps: Entity Extraction, SAs Retrieval
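
The slides do not show the actual SPARQL query, so the following is only a sketch of how associations of up to two hops, with inverse properties included, could be requested for one entity pair. It generates one query per path length and per combination of edge directions. The function names (`hop`, `build_queries`), the variable names (`?p1`, `?n1`) and the use of the `http://dbpedia.org/resource/` namespace are assumptions made for illustration.

```python
from itertools import product

# Assumed namespace for DBpedia resources.
DBR = "http://dbpedia.org/resource/"


def hop(subject: str, predicate: str, obj: str, forward: bool) -> str:
    """Return one triple pattern; a backward hop swaps subject and object."""
    if forward:
        return f"{subject} {predicate} {obj} ."
    return f"{obj} {predicate} {subject} ."


def build_queries(source: str, target: str, max_hops: int = 2) -> list[str]:
    """Return one SELECT query per path length and direction pattern."""
    s, t = f"<{DBR}{source}>", f"<{DBR}{target}>"
    queries = []
    for length in range(1, max_hops + 1):
        # Nodes along the path: source, intermediate variables, target.
        nodes = [s] + [f"?n{i}" for i in range(1, length)] + [t]
        for directions in product([True, False], repeat=length):
            patterns = [
                hop(nodes[i], f"?p{i + 1}", nodes[i + 1], directions[i])
                for i in range(length)
            ]
            variables = " ".join(
                [f"?p{i + 1}" for i in range(length)] + nodes[1:-1]
            )
            queries.append(
                f"SELECT DISTINCT {variables} WHERE {{ {' '.join(patterns)} }}"
            )
    return queries


if __name__ == "__main__":
    for query in build_queries("Bernie_Sanders", "Hillary_Clinton"):
        print(query)
```

With `max_hops = 2` this yields six queries per entity pair (two for direct links, four for two-hop paths). The pattern with a forward hop followed by a backward hop is the one that would match an association shaped like Bernie_Sanders party > Democratic_Party < party Hillary_Clinton.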
  • 20. Information Overload in Contextual KG Exploration 11 Too many associations from even small pieces of text • E.g., 40107 associations from an article with 942 words • They do not fit in a single screen • Users can explore only a limited number of associations (≤ 100) Crucial issue for KG exploration: Which are the most interesting to show to users?
  • 25. Ranking SAs by Estimated Interest: Serendipity 13 Heuristic measure: try to find those associations that are relevant and may be unexpected to users Serendipity = relevance + unexpectedness Serendipity(SA,Text) = α*relevance(SA,Text) + (1-α)*rarity(SA) • relevance(SA,Text) = cos(abstracts(SAs), Text), with TF-IDF weighting • rarity(SA): (Aleman-Meza&al, 2005) • user can tune the weight assigned to relevance vs unexpectedness
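
To make the role of α concrete, here is a minimal sketch of the weighted combination. It assumes that relevance and rarity have already been computed and scaled to [0, 1]; the function names and the three example associations are hypothetical.

```python
def serendipity(relevance: float, rarity: float, alpha: float) -> float:
    """Weighted sum of relevance and rarity, as in the formula on the slide."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * relevance + (1.0 - alpha) * rarity


def rank_by_serendipity(scores: dict, alpha: float) -> list:
    """Sort SA labels by decreasing serendipity.

    `scores` maps an SA label to a (relevance, rarity) pair.
    """
    return sorted(
        scores,
        key=lambda sa: serendipity(*scores[sa], alpha),
        reverse=True,
    )


# Hypothetical (relevance, rarity) values for three associations.
example = {
    "sa_common": (0.9, 0.1),
    "sa_rare": (0.2, 0.8),
    "sa_mixed": (0.6, 0.6),
}

if __name__ == "__main__":
    for alpha in (0.0, 0.5, 1.0):
        print(alpha, rank_by_serendipity(example, alpha))
```

Setting α close to 1 favours associations that match the text, while α close to 0 favours rare ones; intermediate values trade the two off.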
  • 26. Example of SAs ranked by Serendipity 14
  • 28. Personalized Exploration of KGs 17 What if different users are interested in different SAs? 1. Learn a ranking function starting from explicit ratings given by the users 2. Ask users for as few ratings as possible (rating too many SAs can become a tedious task) 3. Speed up learning by sampling SAs that are estimated to be more informative for training the model Definition of an active learning to rank model for personalized contextual exploration of KGs
  • 36. Active Learning to Rank for SAs 18 Flow: Bootstrapping -> Ranking -> Refine Ranking? -> (no) Final Ranking; (yes) Active Sampling SAs -> User Rates SAs -> Ranking. Ranking: RankSVM algorithm: • Derived from SVM (Support Vector Machine) • Well known and widely used in the literature. Active Sampling SAs: two algorithms used to find meaningful SAs: • Pairwise Sampling (PS) (Qian&al, 2013) • AUC-Based Sampling (AS) (Donmez&al, 2009). Bootstrapping: ranking models need to be initialized with ranked SAs (cold-start problem)
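
The ranking step can be illustrated with the pairwise idea behind RankSVM: every pair of rated SAs with different ratings becomes a difference vector, and a linear model is trained so that the higher-rated SA scores higher. The sketch below is a simplified stand-in that minimises a regularised pairwise hinge loss with plain subgradient descent. It is not the solver used by the authors, and the function names, hyperparameters and feature layout are assumptions.

```python
import random


def pairwise_differences(features, ratings):
    """Build x_i - x_j for every pair where item i is rated above item j."""
    pairs = []
    for i in range(len(features)):
        for j in range(len(features)):
            if ratings[i] > ratings[j]:
                pairs.append([a - b for a, b in zip(features[i], features[j])])
    return pairs


def train_linear_ranker(features, ratings, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn a weight vector from rated items using a pairwise hinge loss."""
    rng = random.Random(seed)
    pairs = pairwise_differences(features, ratings)
    w = [0.0] * len(features[0])
    for _ in range(epochs):
        rng.shuffle(pairs)
        for d in pairs:
            margin = sum(wi * di for wi, di in zip(w, d))
            # L2 regularisation shrinks the weights at every step.
            w = [wi * (1.0 - lr * reg) for wi in w]
            # A pair that violates the margin pushes w towards its difference vector.
            if margin < 1.0:
                w = [wi + lr * di for wi, di in zip(w, d)]
    return w


def score(w, x):
    """Ranking score of one SA: higher means more interesting for this user."""
    return sum(wi * xi for wi, xi in zip(w, x))


if __name__ == "__main__":
    # Hypothetical feature vectors (e.g. relevance, rarity) and user ratings.
    X = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5], [0.9, 0.1]]
    y = [1, 2, 3, 4]
    w = train_linear_ranker(X, y)
    print(sorted(range(len(X)), key=lambda i: score(w, X[i]), reverse=True))
```

In the loop on the slide, a model of this kind would be retrained each time the user rates the newly sampled SAs, and its scores would produce the next ranking.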
  • 37. Clustering as Bootstrapping 19 • Use clustering algorithms on the set of SAs • For each cluster select the SA that is closest to the cluster average • User rates all the SAs that represent the clusters
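
Selecting one representative per cluster can be sketched as follows. The cluster labels are assumed to come from any clustering algorithm run beforehand, and the function name and the toy vectors are hypothetical.

```python
import math


def cluster_representatives(vectors, labels):
    """Map each cluster label to the index of the member closest to the cluster mean."""
    members = {}
    for index, label in enumerate(labels):
        members.setdefault(label, []).append(index)

    representatives = {}
    for label, indices in members.items():
        dim = len(vectors[indices[0]])
        # Component-wise average of all feature vectors in the cluster.
        mean = [sum(vectors[i][d] for i in indices) / len(indices) for d in range(dim)]
        # The representative is the member with the smallest Euclidean distance to the mean.
        representatives[label] = min(indices, key=lambda i: math.dist(vectors[i], mean))
    return representatives


if __name__ == "__main__":
    vectors = [[0, 0], [1, 0], [5, 0], [10, 10], [12, 10], [11, 10]]
    labels = [0, 0, 0, 1, 1, 1]
    print(cluster_representatives(vectors, labels))
```

The SAs at the returned indices are the ones the user would be asked to rate before the first ranking model is trained.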
  • 41. Serendipity as Bootstrapping 20 • User rates top-k SAs ranked by Serendipity • Users are able to see an ordered set of SAs since the beginning • Users rate SAs that are estimated to be interesting for them
  • 42. Serendipity vs Clustering Clustering: • PROS: selected SAs are representative of the vector space • CONS: rated SAs might not be interesting for the user Serendipity: • PROS: rated SAs are estimated to be interesting for a generic user • CONS: heuristic function, no representativeness 21
  • 45. Example: Rating of Most Serendipitous SAs (#0) 22 [Figure: SAs connecting Hillary Clinton, New York, Donald Trump, Bill Clinton, Democ. Party, Indepen. Politic. and United States Senate through the properties region, birthPlace, spouse, party and political party; ratings shown: 3, 5, 1; legend: Rating given by the user, Ideal rating for the user]
  • 47. Example: Ranking Learned with RankSVM (#0) 23 [Figure: SAs connecting Bernie Sanders, New York, Donald Trump, Hillary Clinton, Repub. Party and United States Senate through the properties birthPlace, other party, party and political party; ratings shown: 3, 6, 5; legend: Rating given by the user, Ideal rating for the user]
  • 50. Example: Rating on Sampled SAs (#1) 24 [Figure: SAs connecting Hillary Clinton, Democ. Party, United States Senate and Joe Biden through the properties political party, leader and party; ratings shown: 5, 1; legend: Rating given by the user, Ideal rating for the user]
  • 53. Example: Ranking Learned with RankSVM (#1) 25 [Figure: SAs connecting Hillary Clinton, Repub. Party, Donald Trump, Democ. Party and United States Senate through the properties other party, party and political party; ratings shown: 6, 6, 5; annotation on one SA: This SA was second in the previous ranking; legend: Rating given by the user, Ideal rating for the user]
  • 54. Features for RankSVM 26 SAs are represented in the space using different features divided into three main categories: • Topological Features: PageRank on SAs (Page&al, 1999), DBpedia PageRank (Thalhammer&al, 2016), HITS (Kleinberg&al, 1999) • Relevance Features: Relevance (Palmonari&al, 2015), Temporal Relevance (Bianchi&al, 2017) • Predicate-Based Features: Path Informativeness (Pirrò, 2015), Path Pattern Informativeness (Pirrò, 2015), Rarity (Aleman-Meza&al, 2005)
  • 56. Experiments: Objectives 28 Validate personalization hypothesis • Are different users interested in different SAs? Evaluate the performance • (Quick) improvement of the ranking quality with user ratings with more iterations of feedback • Comparison of different configurations and baseline algorithms
  • 57. Experiments: Settings 29 Gold standards: Ideal rankings collected by asking users to evaluate all SAs extracted from articles or pieces of articles Evaluation Settings: Contextual Exploration: ratings on the whole dataset Cross Validation: ratings on training data, ranking on test data Measure: quality of generated rankings vs ideal rankings (nDCG)
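
The slides name nDCG as the quality measure but do not state which variant was used. The sketch below assumes the common exponential-gain form, with gain 2^rating - 1 and a log2 position discount; the function names are hypothetical.

```python
import math


def dcg_at_k(ratings_in_ranked_order, k):
    """Discounted cumulative gain of the first k items of a ranking."""
    return sum(
        (2 ** rating - 1) / math.log2(position + 2)
        for position, rating in enumerate(ratings_in_ranked_order[:k])
    )


def ndcg_at_k(ratings_in_ranked_order, k):
    """DCG of the generated ranking divided by the DCG of the ideal ranking."""
    ideal = dcg_at_k(sorted(ratings_in_ranked_order, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(ratings_in_ranked_order, k) / ideal


if __name__ == "__main__":
    # User ratings of the SAs, listed in the order produced by the model.
    print(ndcg_at_k([3, 2, 1], k=3))
    print(ndcg_at_k([1, 2, 3], k=3))
```

The input is the list of gold-standard ratings arranged in the order the model ranked the SAs, so a ranking that matches the ideal order scores 1.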
  • 58. Experiments: Data 30 Two different datasets: LAFU (Large Articles, Few Users) • Complete articles (New York Times) • 3 articles, 2 users => 3 ideal rankings • Average number of SAs per article => 2600 • Rating from 1 to 3 (1 low interest, 3 high interest) SAMU (Small Articles, Many Users) • Small pieces of text extracted from articles (New York Times, The Guardian) • 5 articles, 14 users => 25 ideal rankings • Average number of SAs per article => 74 • Rating from 1 to 6 (1 low interest, 6 high interest. Scale is symmetric)
  • 59. 31 Experiments: Alternative Configurations and Baselines
      Algorithm | Bootstrapping | Active Sampling | Learning
      Serendipity AS | Serendipity | AUC-Based Sampling | RankSVM
      Serendipity PS | Serendipity | Pairwise Sampling | RankSVM
      Dirichlet AS | Dirichlet Clustering | AUC-Based Sampling | RankSVM
      Dirichlet PS | Dirichlet Clustering | Pairwise Sampling | RankSVM
      Gaussian AS | Gaussian Clustering | AUC-Based Sampling | RankSVM
      Gaussian PS | Gaussian Clustering | Pairwise Sampling | RankSVM
      Random Random | Random | Random | RankSVM
      Random | No Bootstrapping | No Active Learning | No Learning to Rank
      Serendipity | No Bootstrapping | No Active Learning | No Learning to Rank
  • 60. Results: Personalization Hypothesis 32 Inter-rater reliability measures are used to assess the level of agreement between users with respect to the same items (SAs). These measures are usually defined in a range [0, 1]: • 0 => complete disagreement between users • 1 => users give unanimous ratings Krippendorff's alpha 0.061 Kendall's W 0.26 Values are far from 1 => hypothesis validated
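
As an illustration of one of the two agreement measures, the sketch below computes Kendall's W from complete rankings. It assumes each rater assigns the ranks 1..n to the same n items without ties; ratings on a 1 to 6 scale produce ties and would need the tie-corrected form, which is not shown. The function name is hypothetical.

```python
def kendalls_w(rank_matrix):
    """Kendall's coefficient of concordance for m raters ranking n items.

    `rank_matrix[r][i]` is the rank (1..n, no ties) that rater r gives to item i.
    """
    m = len(rank_matrix)
    n = len(rank_matrix[0])
    # Total rank received by each item across all raters.
    totals = [sum(rater[i] for rater in rank_matrix) for i in range(n)]
    mean_total = m * (n + 1) / 2
    # Sum of squared deviations of the item totals from their mean.
    s = sum((total - mean_total) ** 2 for total in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


if __name__ == "__main__":
    print(kendalls_w([[1, 2, 3], [1, 2, 3], [1, 2, 3]]))  # full agreement
    print(kendalls_w([[1, 2, 3], [3, 2, 1]]))  # opposite rankings
```

Values near 0, such as the 0.26 reported on the slide, indicate that the raters order the items quite differently.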
  • 62. Results: Performance (nDCG@10) 33 No PS active sampling, no real time usage possible
  • 66. Results: Performance (nDCG@10) 36 Random Baseline Serendipity Baseline
  • 68. Conclusions and Future Work 38 Conclusions: 1. Quick optimization of personalized ranking function with Active Learning to Rank 2. Active Learning to Rank can be initialized with Serendipity (+performance, +interaction flow) Future Work: 1. Exploring new algorithms for the active learning to rank 2. Need to better understand how to design user interaction for the ALR model
  • 69. Thank You 39 Questions? Contacts: federico.bianchi@disco.unimib.it www.dacena.org ITIS Lab – Innovative Technologies for Interaction and Services Dipartimento di Informatica, Sistemistica e Comunicazione Università degli Studi di Milano-Bicocca
  • 70. References 40
      Joachims, T. (2002, July). Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 133-142). ACM.
      Giuseppe Pirrò. Explaining and suggesting relatedness in knowledge graphs. In ISWC, pages 622–639. Springer, 2015.
      Buyue Qian, Hongfei Li, Jun Wang, Xiang Wang, and Ian Davidson. Active learning to rank using pairwise supervision. In SIAM Int. Conf. Data Mining, pages 297–305. SIAM, 2013.
      Pinar Donmez and Jaime G Carbonell. Active sampling for rank learning via optimizing the area under the ROC curve. In ECIR, pages 78–89. Springer, 2009.
      Federico Bianchi, Matteo Palmonari, Marco Cremaschi, and Elisabetta Fersini. Actively learning to rank semantic associations for personalized contextual exploration of knowledge graphs. In ESWC, 2017.
      Matteo Palmonari, Giorgio Uboldi, Marco Cremaschi, Daniele Ciminieri, and Federico Bianchi. Dacena: Serendipitous news reading with data contexts. In ESWC, pages 133–137. Springer, 2015.
      Page, Lawrence, et al. The PageRank citation ranking: Bringing order to the web. Stanford InfoLab, 1999.
      Thalhammer, A., & Rettinger, A. (2016, May). PageRank on Wikipedia: towards general importance scores for entities. In International Semantic Web Conference (pp. 227-240). Springer International Publishing.
      Kleinberg, J. M., Kumar, R., Raghavan, P., Rajagopalan, S., & Tomkins, A. S. (1999, July). The web as a graph: measurements, models, and methods. In International Computing and Combinatorics Conference (pp. 1-17). Springer Berlin Heidelberg.
      Aleman-Meza, B., Halaschek-Weiner, C., Arpinar, I. B., Ramakrishnan, C., & Sheth, A. P. (2005). Ranking complex relationships on the semantic web. IEEE Internet computing, 9(3), 37-44.

Editor's Notes

  1. To generalize
  2. To generalize
  3. We defined an heuristic measure but we don’t know if this is the best mesasure for everybody and if a better measure exists
  4. Motivated by the fact that different users might be interested in different kind of SAs We thus want to learn a ranking funcion from explicit user rating, asking them few feedback as possible, since ranking an entire set of SAs might be tedious and boring Moreover we want a model that is able to quickly imporve and speed up the learning task
  5. Active sampling method proposed in different context More loops of ranking refinition!
  6. Active sampling method proposed in different context More loops of ranking refinition!
  7. Active sampling method proposed in different context More loops of ranking refinition!
  8. Active sampling method proposed in different context More loops of ranking refinition!
  9. Active sampling method proposed in different context More loops of ranking refinition!
  10. Active sampling method proposed in different context More loops of ranking refinition!
  11. Active sampling method proposed in different context More loops of ranking refinition!
  12. Active sampling method proposed in different context More loops of ranking refinition!
  13. The example is built over the complete set of ratings given by the user for an article
  14. The example is built over the complete set of ratings given by the user for an article
  15. The example is built over the complete set of ratings given by the user for an article
  16. Contextual: rating on the whole dataset and testing on the remaining part of the dataset -> replicate user behaviour on the application Cross Validation: evaluate the robustness of our model
  17. These are the configuration we tested, they can be divided in two major categories: the first one consists of the combination of the divverent bootstrapping methods and of active sampling. The bootstrapping has been tested with two different clustering algorithms that are Gaussian Mixture Model and the Dirichlet Gaussian Mixutre Model. The second category contains our baselines: the first one, namely Random Random, select random associations for both initialization and active sampling, is thus used to see if active learning is really useful in this context. The two remaining configuration do not use a learning to rank algorithm, so there’s no learning behind, and are used to see if ranking models can perform better than a simple heuristic, namely serendipity, and a random approach.
  18. The first result we tested was the validity of the personalization hypotesis within this context. We used Inter rater reliability measures that are usually used to assess the level of agreement between users. A 0 value can mean that there’s a complete disagreemeent between users while a 1 means that user give unanimous rates. Since value for both krippendorff’s apha and kendall’s W are distant for came to the conclusion that our hypotesis is valid
  19. These are the results that we obtained in five iterations of the loop I’ve shone you before. Results are computed using nDCG on the top ten results of the ranking list and are aggregated, so we took the mean value for each iteration with respect to each user. We can see that the algorithm that performs better for both the experiments is the one that uses the serendipity heuristic and the AUC based sampling: this is interesting for two reason: in our application we are able to provide an ordered set of probably interesting semantic associations since the first iteration, moreover is the sub-optimal algorithms that is able to gain the best results. This si again importatn because you can see that the graphs on the right lack of the second algorithm for sampling, this is because the algorithms wasn’t able to compute the results in a time that could be considered good for user interaction, thus making in unusable in a live setting. The last thing that is really interesting to notice is that active learning performs better then the baseline we have defined. The learning to rank approach feeded with random observations is always the least algorithms in the learning to rank group. The same goes for the last two baselines considered, those that did not use active learning.
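The evaluation described in these notes (nDCG on the top ten results, averaged over users at each iteration) can be sketched as follows. This is a minimal illustration under stated assumptions: the gain is taken to be the user's rating itself with a log2 position discount, and the `runs` layout and all function names are hypothetical; the talk does not specify which nDCG variant was used.

```python
import math


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the first k positions (linear gain)."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, k=10):
    """nDCG@k; ranked_gains are the user's true ratings in predicted order."""
    ideal = dcg_at_k(sorted(ranked_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0


def mean_ndcg_per_iteration(runs, k=10):
    """Average nDCG@k over users for each iteration.

    runs[user][t] is the list of that user's true ratings, ordered by the
    ranking the model produced at iteration t (assumed layout).
    """
    n_iterations = len(next(iter(runs.values())))
    return [
        sum(ndcg_at_k(per_user[t], k) for per_user in runs.values()) / len(runs)
        for t in range(n_iterations)
    ]


# Hypothetical ratings for two users over two iterations, not experimental data.
runs = {
    "user_a": [[1, 2, 3], [3, 2, 1]],
    "user_b": [[3, 2, 1], [3, 2, 1]],
}
print(mean_ndcg_per_iteration(runs))
```

Each value in the returned list corresponds to one point on a curve of the results chart: one mean nDCG@10 per iteration for a given configuration.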