The Semantic Web. ESWC 2017. Lecture Notes in Computer Science, vol 10249. Springer, Cham
Knowledge Graphs (KG) represent a large amount of Semantic Associations (SAs), i.e., chains of relations that may reveal interesting and unknown connections between different types of entities. Applications for the contextual exploration of KGs help users explore information extracted from a KG, including SAs, while they are reading an input text. Because of the large number of SAs that can be extracted from a text, a first challenge in these applications is to effectively determine which SAs are most interesting to the users, defining a suitable ranking function over SAs. However, since different users may have different interests, an additional challenge is to personalize this ranking function to match individual users’ preferences. In this paper we introduce a novel active learning to rank model to let a user rate small samples of SAs, which are used to iteratively learn a personalized ranking function. Experiments conducted with two data sets show that the approach is able to improve the quality of the ranking function with a limited number of user interactions.
Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs
1. Actively Learning to Rank Semantic Associations for Personalized Contextual Exploration of Knowledge Graphs
Federico Bianchi, Matteo Palmonari, Marco Cremaschi and Elisabetta Fersini
federico.bianchi@disco.unimib.it
ITIS Lab – Innovative Technologies for Interaction and Services
Dipartimento di Informatica, Sistemistica e Comunicazione
Università degli Studi di Milano-Bicocca
1-6-2017, Portorož, Slovenia
2. Outline
• Contextual Exploration of Knowledge Graphs
• Actively Learning to Rank Semantic Associations
• Experiments
• Conclusions and Future Work
4. Knowledge Graphs (KGs)
• Models used for knowledge representation using graphs
• DBpedia, YAGO, Google KG, …
• Nodes represent real-world entities
• Labelled edges represent relations between them
[Figure: Bernie Sanders and Hillary Clinton both linked to the Democratic Party]
KGs may contain interesting relations for users
5. Relational Knowledge in KGs and Semantic Associations
• KGs provide a vast amount of relational knowledge
• Semantic Associations (SAs):
• chains of relations between entities
• arbitrary length
• inverse properties included
Example: Bernie_Sanders —party→ Democratic_Party ←party— Hillary_Clinton
8. Contextual Exploration of KGs
Support a user who is doing a familiar task, e.g., reading a news article, in accessing content extracted from a KG, selected and pushed to them in a proactive fashion.
Who is Bernie Sanders?
What is his relation with Hillary Clinton?
[Figure: Bernie Sanders —party→ Democratic Party ←party— Hillary Clinton]
11. Entity Extraction
Bernie Sanders has urged his supporters to look beyond the Democratic presidential nomination in a speech that stopped short of fully endorsing Hillary Clinton but made clear he was no longer actively challenging her candidacy. In an anticlimactic speech that signalled the effective end of a 14-month campaign odyssey, the Vermont senator insisted his “political revolution continues” despite Clinton’s effective victory in the delegate race.
Entities extracted from the text: Bernie Sanders, Hillary Clinton, Democratic Party, Vermont…
Pipeline: Entity Extraction → SAs Retrieval
13. Retrieval of Semantic Associations
SPARQL query to the DBpedia endpoint.
SAs between all the extracted entities
• maximum number of hops = 2
Between (Bernie Sanders and Hillary Clinton):
[Figure: Bernie Sanders —party→ Democratic Party ←party— Hillary Clinton]
Pipeline: Entity Extraction → SAs Retrieval
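The retrieval step above can be sketched as a parameterized query against the endpoint. The query shape below (2-hop patterns joined with UNION to cover inverse directions, the direct 1-hop case omitted for brevity) and the helper name are illustrative assumptions, not the authors' exact query.

```python
# Illustrative sketch: build a SPARQL query retrieving SAs of at most
# 2 hops between two entities, covering inverse directions via UNION.
# The query shape is an assumption, not the paper's exact query.

def build_sa_query(entity_a: str, entity_b: str, limit: int = 100) -> str:
    """Return a SPARQL query for 2-hop paths between two entities."""
    return f"""
SELECT DISTINCT ?p1 ?mid ?p2 WHERE {{
  {{ <{entity_a}> ?p1 ?mid . ?mid ?p2 <{entity_b}> . }}  # a -> mid -> b
  UNION
  {{ <{entity_a}> ?p1 ?mid . <{entity_b}> ?p2 ?mid . }}  # a -> mid <- b
  UNION
  {{ ?mid ?p1 <{entity_a}> . ?mid ?p2 <{entity_b}> . }}  # a <- mid -> b
}} LIMIT {limit}
""".strip()

query = build_sa_query(
    "http://dbpedia.org/resource/Bernie_Sanders",
    "http://dbpedia.org/resource/Hillary_Clinton",
)
```

Sending the query to a live endpoint would additionally need an HTTP client; the sketch only shows how the 2-hop constraint translates into graph patterns.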
20. Information Overload in Contextual KG Exploration
Too many associations even from small pieces of text
• E.g., 40107 associations from an article with 942 words
• They do not fit in a single screen
• Users can explore only a limited number of associations (≤ 100)
Crucial issue for KG exploration: which SAs are the most interesting to show to users?
21. Ranking SAs by Estimated Interest: Serendipity
Heuristic measure: try to find those associations that are relevant and may be unexpected to users
Serendipity = relevance + unexpectedness
Serendipity(SA, Text) = α·relevance(SA, Text) + (1 − α)·rarity(SA)
relevance(SA, Text) = cos(abstracts(SA), Text), with TF-IDF weighting
rarity(SA) captures unexpectedness (Aleman-Meza & al., 2005)
The user can tune the weight α assigned to relevance vs. unexpectedness
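The serendipity score above can be sketched in a few lines. This is a minimal sketch: the tiny TF-IDF (with smoothed IDF so shared terms keep non-zero weight) stands in for the real weighting, and `rarity` is assumed to be precomputed in [0, 1]; the function names are illustrative.

```python
import math
from collections import Counter

def tfidf(texts):
    """Very small TF-IDF: one weighted term vector per text (smoothed IDF)."""
    docs = [Counter(t.lower().split()) for t in texts]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    return [{t: tf * (math.log((1 + n) / (1 + df[t])) + 1) for t, tf in d.items()}
            for d in docs]

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def serendipity(sa_abstracts: str, article: str, rarity: float, alpha: float = 0.5):
    """alpha * relevance(SA, Text) + (1 - alpha) * rarity(SA)."""
    sa_vec, text_vec = tfidf([sa_abstracts, article])
    return alpha * cosine(sa_vec, text_vec) + (1 - alpha) * rarity
```

With α = 1 the score reduces to pure relevance; with α = 0 it reduces to pure rarity, matching the tunable trade-off on the slide.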
27. Outline
• Contextual Exploration of Knowledge Graphs
• Actively Learning to Rank Semantic Associations
• Experiments
• Conclusions and Future Work
28. Personalized Exploration of KGs
What if different users are interested in different SAs?
1. Learn a ranking function starting from explicit ratings given by the users
2. Ask users for as few ratings as possible (rating too many SAs can become a tedious task)
3. Speed up learning by sampling the SAs that are estimated to be more informative for training the model
→ Definition of an active learning to rank model for personalized contextual exploration of KGs
30. Active Learning to Rank for SAs
Ranking with the RankSVM algorithm:
• derived from SVM (Support Vector Machines)
• well known and widely used in the literature
Loop: Ranking → refine ranking? → no: Final Ranking / yes: Active Sampling of SAs → User Rates SAs → Ranking
Two algorithms used to find meaningful SAs:
• Pairwise Sampling (PS) (Qian & al., 2013)
• AUC-Based Sampling (AS) (Donmez & al., 2009)
But ranking models need to be initialized with ranked SAs (cold-start problem)
Bootstrapping
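The core of the loop is pairwise learning to rank: user ratings on SA feature vectors are turned into pairwise preferences, and a linear scoring function is fit on the difference vectors. RankSVM solves this with an SVM; in the sketch below a simple perceptron-style update stands in for the SVM solver (an illustrative shortcut), but the pairwise transform is the same idea.

```python
# Sketch: ratings -> pairwise preferences -> linear scoring function.
# A perceptron-style update replaces the SVM solver used by RankSVM.

def train_pairwise(feats, ratings, epochs=50, lr=0.1):
    dim = len(feats[0])
    w = [0.0] * dim
    pairs = [(i, j) for i in range(len(feats)) for j in range(len(feats))
             if ratings[i] > ratings[j]]            # i preferred over j
    for _ in range(epochs):
        for i, j in pairs:
            diff = [a - b for a, b in zip(feats[i], feats[j])]
            score = sum(wk * dk for wk, dk in zip(w, diff))
            if score <= 0:                          # wrong order: update w
                w = [wk + lr * dk for wk, dk in zip(w, diff)]
    return w

def rank(feats, w):
    """Indices of SAs ordered by decreasing learned score."""
    scores = [sum(wk * fk for wk, fk in zip(w, f)) for f in feats]
    return sorted(range(len(feats)), key=lambda i: -scores[i])

# Toy example: 2 features per SA, ratings encode the user's preference.
feats = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
ratings = [3, 1, 2]
w = train_pairwise(feats, ratings)
order = rank(feats, w)   # recovers the rating order: [0, 2, 1]
```

New ratings gathered in each iteration of the loop simply add pairs, so the model can be retrained quickly after every round of feedback.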
37. Clustering as Bootstrapping
• Use clustering algorithms on the set of SAs
• For each cluster, select the SA that is closest to the cluster average
• The user rates all the SAs that represent the clusters
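The selection step of this bootstrapping can be sketched as follows. The clustering itself (Gaussian and Dirichlet mixtures in the experiments) is assumed to have been run already; here each cluster is just a list of SA feature vectors, and we pick the member nearest the centroid.

```python
import math

# Sketch: given clusters of SA feature vectors, pick from each cluster
# the SA closest to the cluster average (the cluster's representative).

def representatives(clusters):
    """Per cluster, return the index of the vector nearest the centroid."""
    reps = []
    for vectors in clusters:
        dim = len(vectors[0])
        centroid = [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]
        reps.append(min(range(len(vectors)),
                        key=lambda i: math.dist(vectors[i], centroid)))
    return reps

clusters = [
    [[0.0, 0.0], [1.0, 1.0], [0.4, 0.6]],   # centroid ~ (0.47, 0.53)
    [[5.0, 5.0], [6.0, 6.0]],               # tie on distance -> first member
]
reps = representatives(clusters)   # -> [2, 0]
```

The user then rates only these representatives, one per cluster, which keeps the initial rating effort proportional to the number of clusters rather than the number of SAs.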
41. Serendipity as Bootstrapping
• The user rates the top-k SAs ranked by Serendipity
• Users see an ordered set of SAs from the beginning
• Users rate SAs that are estimated to be interesting for them
42. Serendipity vs Clustering
Clustering:
• PROS: selected SAs are representative of the vector space
• CONS: rated SAs might not be interesting for the user
Serendipity:
• PROS: rated SAs are estimated to be interesting for a generic user
• CONS: heuristic function, no representativeness
43. Example: Rating of Most Serendipitous SAs (#0)
[Figure: three sampled SAs among Hillary Clinton, New York, Donald Trump, Bill Clinton, the Democratic Party, Independent Politician, and the United States Senate (properties: region, birthPlace, spouse, party, politicalParty); the rating given by the user is shown next to the ideal rating for the user: 3, 5, 1]
46. Example: Ranking Learned with RankSVM (#0)
[Figure: ranking learned from the initial ratings, over SAs among Bernie Sanders, New York, Donald Trump, Hillary Clinton, the Republican Party, and the United States Senate (properties: birthPlace, otherParty, party, politicalParty); the rating given by the user is shown next to the ideal rating for the user: 3, 6, 5]
48. Example: Rating on Sampled SAs (#1)
[Figure: two actively sampled SAs among Hillary Clinton, the Democratic Party, Joe Biden, and the United States Senate (properties: politicalParty, party, leader); the user rates them 5 and 1]
51. Example: Ranking Learned with RankSVM (#1)
[Figure: updated ranking over SAs among Hillary Clinton, Donald Trump, the Republican Party, the Democratic Party, and the United States Senate (properties: otherParty, party, politicalParty); ratings shown: 6, 6, 5; one of these SAs was second in the previous ranking]
54. Features for RankSVM
SAs are represented in the vector space using different features, divided into three main categories:
• Topological features: PageRank on SAs (Page & al., 1999), DBpedia PageRank (Thalhammer & al., 2016), HITS (Kleinberg & al., 1999)
• Relevance features: Relevance (Palmonari & al., 2015), Temporal Relevance (Bianchi & al., 2017)
• Predicate-based features: Path Informativeness (Pirrò, 2015), Path Pattern Informativeness (Pirrò, 2015), Rarity (Aleman-Meza & al., 2005)
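As one concrete example of a predicate-based feature, a rarity score can be derived from how infrequent an SA's predicate sequence is among all retrieved SAs. This is one plausible reading of rarity (Aleman-Meza & al., 2005), sketched for illustration, not the paper's exact formula.

```python
from collections import Counter

# Sketch of a predicate-based feature: the rarer an SA's sequence of
# predicates among all retrieved SAs, the higher its rarity score.
# Illustrative reading of rarity, not the exact published formula.

def rarity_scores(sas):
    """sas: list of predicate tuples, e.g. ('party', 'party')."""
    counts = Counter(sas)
    n = len(sas)
    return [1.0 - counts[sa] / n for sa in sas]

sas = [("party", "party"), ("party", "party"), ("spouse", "party")]
scores = rarity_scores(sas)   # the unique pattern gets the highest rarity
```

Each SA's final feature vector concatenates such predicate-based scores with the topological and relevance features listed above.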
55. Outline
• Contextual Exploration of Knowledge Graphs
• Actively Learning to Rank Semantic Associations
• Experiments
• Conclusions and Future Work
56. Experiments: Objectives
Validate the personalization hypothesis
• Are different users interested in different SAs?
Evaluate the performance
• (Quick) improvement of the ranking quality with user ratings over more iterations of feedback
• Comparison of different configurations and baseline algorithms
57. Experiments: Settings
Gold standards: ideal rankings collected by asking users to evaluate all SAs extracted from articles or pieces of articles
Evaluation settings:
• Contextual Exploration: ratings on the whole dataset
• Cross Validation: ratings on training data, ranking on test data
Measure: quality of generated rankings vs. ideal rankings (nDCG)
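The evaluation measure compares a generated ranking of rated SAs against the ideal ranking (ratings sorted in decreasing order). A minimal sketch, using the linear-gain DCG variant (the paper may use the exponential-gain variant):

```python
import math

# nDCG sketch: discounted cumulative gain of the generated ranking,
# normalized by the DCG of the ideal (sorted) ranking.

def dcg(ratings):
    return sum(r / math.log2(i + 2) for i, r in enumerate(ratings))

def ndcg(ranked_ratings, k=10):
    """ranked_ratings: user ratings in the order the system ranked the SAs."""
    ideal = sorted(ranked_ratings, reverse=True)
    denom = dcg(ideal[:k])
    return dcg(ranked_ratings[:k]) / denom if denom else 0.0

perfect = ndcg([5, 4, 3, 2, 1])   # already in ideal order -> 1.0
worse = ndcg([1, 2, 3, 4, 5])     # reversed order -> below 1.0
```

The experiments report nDCG on the top ten results, so `k=10` matches the setting described in the speaker notes.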
58. Experiments: Data
Two different datasets:
LAFU (Large Articles, Few Users)
• Complete articles (New York Times)
• 3 articles, 2 users => 3 ideal rankings
• Average number of SAs per article => 2600
• Ratings from 1 to 3 (1 low interest, 3 high interest)
SAMU (Small Articles, Many Users)
• Small pieces of text extracted from articles (New York Times, The Guardian)
• 5 articles, 14 users => 25 ideal rankings
• Average number of SAs per article => 74
• Ratings from 1 to 6 (1 low interest, 6 high interest; the scale is symmetric)
59. Experiments: Alternative Configurations and Baselines
Algorithm      | Bootstrapping        | Active Sampling    | Learning
Serendipity AS | Serendipity          | AUC-Based Sampling | RankSVM
Serendipity PS | Serendipity          | Pairwise Sampling  | RankSVM
Dirichlet AS   | Dirichlet Clustering | AUC-Based Sampling | RankSVM
Dirichlet PS   | Dirichlet Clustering | Pairwise Sampling  | RankSVM
Gaussian AS    | Gaussian Clustering  | AUC-Based Sampling | RankSVM
Gaussian PS    | Gaussian Clustering  | Pairwise Sampling  | RankSVM
Random Random  | Random               | Random             | RankSVM
Random         | No Bootstrapping     | No Active Learning | No Learning to Rank
Serendipity    | No Bootstrapping     | No Active Learning | No Learning to Rank
60. Results: Personalization Hypothesis
Inter-rater reliability measures assess the level of agreement between users with respect to the same items (SAs).
These measures are usually defined in the range [0, 1]:
• 0 => complete disagreement between users
• 1 => users give unanimous ratings
Krippendorff's alpha: 0.061
Kendall's W: 0.26
Values are far from 1 => hypothesis validated
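Kendall's W (the coefficient of concordance) can be sketched directly from its definition over the users' rankings of the same SAs; the sketch below omits the tie correction for brevity.

```python
# Kendall's W sketch: 0 = complete disagreement, 1 = unanimous rankings.
# W = 12*S / (m^2 * (n^3 - n)), with S the squared deviation of the
# per-item rank sums from their mean, m users, n items. No tie correction.

def kendalls_w(rankings):
    """rankings: one list of ranks (1..n) per user, all over the same items."""
    m, n = len(rankings), len(rankings[0])
    totals = [sum(user[i] for user in rankings) for i in range(n)]
    mean = sum(totals) / n
    s = sum((t - mean) ** 2 for t in totals)
    return 12 * s / (m * m * (n ** 3 - n))

agree = kendalls_w([[1, 2, 3], [1, 2, 3]])       # unanimous -> 1.0
disagree = kendalls_w([[1, 2, 3], [3, 2, 1]])    # opposite -> 0.0
```

The observed value of 0.26 thus indicates weak agreement between users, supporting the personalization hypothesis.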
67. Outline
• Contextual Exploration of Knowledge Graphs
• Actively Learning to Rank Semantic Associations
• Experiments
• Conclusions and Future Work
68. Conclusions and Future Work
Conclusions:
1. Quick optimization of personalized ranking function with Active Learning to Rank
2. Active Learning to Rank can be initialized with Serendipity (+performance, +interaction flow)
Future Work:
1. Exploring new algorithms for the active learning to rank
2. Need to better understand how to design user interaction for the ALR model
70. References
Joachims, T. (2002). Optimizing search engines using clickthrough data. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 133-142). ACM.
Pirrò, G. (2015). Explaining and suggesting relatedness in knowledge graphs. In ISWC (pp. 622-639). Springer.
Qian, B., Li, H., Wang, J., Wang, X., & Davidson, I. (2013). Active learning to rank using pairwise supervision. In SIAM Int. Conf. Data Mining (pp. 297-305). SIAM.
Donmez, P., & Carbonell, J. G. (2009). Active sampling for rank learning via optimizing the area under the ROC curve. In ECIR (pp. 78-89). Springer.
Bianchi, F., Palmonari, M., Cremaschi, M., & Fersini, E. (2017). Actively learning to rank semantic associations for personalized contextual exploration of knowledge graphs. In ESWC.
Palmonari, M., Uboldi, G., Cremaschi, M., Ciminieri, D., & Bianchi, F. (2015). DaCENA: Serendipitous news reading with data contexts. In ESWC (pp. 133-137). Springer.
Page, L., et al. (1999). The PageRank citation ranking: Bringing order to the web. Stanford InfoLab.
Thalhammer, A., & Rettinger, A. (2016). PageRank on Wikipedia: towards general importance scores for entities. In International Semantic Web Conference (pp. 227-240). Springer International Publishing.
Kleinberg, J. M., Kumar, R., Raghavan, P., Rajagopalan, S., & Tomkins, A. S. (1999). The web as a graph: measurements, models, and methods. In International Computing and Combinatorics Conference (pp. 1-17). Springer Berlin Heidelberg.
Aleman-Meza, B., Halaschek-Wiener, C., Arpinar, I. B., Ramakrishnan, C., & Sheth, A. P. (2005). Ranking complex relationships on the semantic web. IEEE Internet Computing, 9(3), 37-44.
Editor's Notes
We defined a heuristic measure, but we do not know whether it is the best measure for everybody, or whether a better measure exists.
Motivated by the fact that different users might be interested in different kinds of SAs, we want to learn a ranking function from explicit user ratings, asking for as little feedback as possible, since rating an entire set of SAs can be tedious and boring. We also want a model that is able to improve quickly, speeding up the learning task.
The active sampling methods were proposed in different contexts; each iteration adds another loop of ranking refinement.
The examples are built over the complete set of ratings given by the user for an article.
Contextual setting: rating on the whole dataset and testing on the remaining part of the dataset, replicating user behaviour in the application. Cross validation: evaluates the robustness of the model.
The tested configurations can be divided into two major categories. The first consists of combinations of the different bootstrapping methods with active sampling; bootstrapping was tested with two clustering algorithms, the Gaussian Mixture Model and the Dirichlet Gaussian Mixture Model. The second category contains the baselines: Random Random selects random associations for both initialization and active sampling, and is used to see whether active learning is really useful in this context; the two remaining configurations use no learning-to-rank algorithm, and are used to see whether ranking models can perform better than a simple heuristic (serendipity) and a random approach.
We first tested the validity of the personalization hypothesis using inter-rater reliability measures, which assess the level of agreement between users: a value of 0 means complete disagreement, while 1 means users give unanimous ratings. Since the values of both Krippendorff's alpha and Kendall's W are far from 1, we came to the conclusion that the hypothesis is valid.
The results were obtained over five iterations of the loop shown before, computed as nDCG on the top ten results of the ranking list and aggregated by taking the mean value for each iteration with respect to each user. The algorithm that performs best in both experiments is the one that uses the serendipity heuristic and AUC-based sampling. This is interesting for two reasons: in our application we are able to provide an ordered set of probably interesting semantic associations from the first iteration, and this heuristic bootstrapping is the one that gains the best results. Pairwise sampling is missing from some of the plots because it was not able to compute results in a time acceptable for user interaction, making it unusable in a live setting.
Finally, active learning performs better than the baselines we defined: the learning-to-rank approach fed with random observations is always the worst algorithm in the learning-to-rank group, and the same holds for the last two baselines, those that did not use active learning.