1. EC-Web 2014 – The 15th International Conference on Electronic Commerce and Web Technologies, September 1-4, 2014, Munich, Germany
A Linked Data Recommender System using a Neighborhood-based Graph Kernel
Vito Claudio Ostuni, Tommaso Di Noia, Roberto Mirizzi*, Eugenio Di Sciascio
{vitoclaudio.ostuni, tommaso.dinoia, eugenio.disciascio}@poliba.it, robertom@yahoo-inc.com
Polytechnic University of Bari - Bari (ITALY)
(*) Yahoo! Sunnyvale, CA (US)
2. Outline
Introduction and motivation
The proposed approach
Experimental Evaluation
Contributions and Conclusion
3. Recommender Systems
Help users deal with information and choice overload
4. Content-based RSs
A definition
CB-RSs try to recommend items similar(*) to those a given user has liked in the past
(*) similar from a content-based perspective
[P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. Recommender Systems Handbook.]
Model-based approach:
•represent each item as a feature vector built from its content description
•learn a predictive user model from past user preferences
[Figure: example movies (Heat, Argo, The Godfather, Righteous Kill) described by genre features (drama, action)]
6. Motivation
Traditional Content-based Recommender Systems:
•are based on keyword/attribute-based item representations
•rely on the quality of the content analyzer to extract expressive item features
•lack knowledge about the items
Use Linked Open Data to obtain knowledge about items and richer item representations
7. Linked Open Data
•Initiative for publishing and connecting data on the Web using Semantic Web technologies
•More than 30 billion RDF triples from hundreds of data sources
•Semantic Web done right [ http://www.w3.org/2008/Talks/0617-lod-tbl/#(3) ]
9. Graph-based Item Representation
[Figure: graph fragment with the items The Godfather and American Gangster linked by subject edges to the categories Mafia_films, Gangster_films, Films_about_organized_crime_in_the_United_States, Best_Picture_Academy_Award_winners, Best_Thriller_Empire_Award_winners and Films_shot_in_New_York_City]
12. Graph-based Item Representation
[Figure: graph fragment with the items The Godfather and American Gangster. Subject edges link the items to the categories Mafia_films, Gangster_films, Films_about_organized_crime_in_the_United_States, Best_Picture_Academy_Award_winners, Best_Thriller_Empire_Award_winners and Films_shot_in_New_York_City. Broader edges link categories to the more general categories Films_about_organized_crime, Films_about_organized_crime_by_country and Awards_for_best_film]
Exploit entity descriptions
13. h-hop Item Neighborhood Graph
[Figure: neighborhood graph of The Godfather, containing the entities reachable from the item through subject and broader edges: Mafia_films, Gangster_films, Best_Picture_Academy_Award_winners, Films_shot_in_New_York_City, Films_about_organized_crime and Awards_for_best_film]
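As a rough illustration of how an h-hop item neighborhood graph could be extracted, the sketch below runs a breadth-first traversal over a small list of triples. The triples, the choice to follow only outgoing edges, and the function name `h_hop_neighborhood` are assumptions made for this example; they are not taken from the slides or from DBpedia.

```python
from collections import deque

# Toy (subject, property, object) triples. The edge layout is an illustrative
# assumption, not a faithful copy of DBpedia or of the slide figure.
TRIPLES = [
    ("The_Godfather", "subject", "Mafia_films"),
    ("The_Godfather", "subject", "Gangster_films"),
    ("The_Godfather", "subject", "Best_Picture_Academy_Award_winners"),
    ("Mafia_films", "broader", "Films_about_organized_crime"),
    ("Gangster_films", "broader", "Films_about_organized_crime"),
    ("Best_Picture_Academy_Award_winners", "broader", "Awards_for_best_film"),
    ("Films_about_organized_crime", "broader", "Films_about_organized_crime_by_country"),
]


def h_hop_neighborhood(item, triples, h):
    """Return the edges reachable from `item` within h hops.

    Each returned edge is (hop, subject, property, object), where hop is the
    distance at which the edge is traversed. Only outgoing edges are followed
    (an assumption of this sketch).
    """
    outgoing = {}
    for s, p, o in triples:
        outgoing.setdefault(s, []).append((p, o))

    edges = []
    visited = {item}
    frontier = deque([(item, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == h:
            continue  # do not expand nodes that already sit at the hop limit
        for p, o in outgoing.get(node, []):
            edges.append((depth + 1, node, p, o))
            if o not in visited:
                visited.add(o)
                frontier.append((o, depth + 1))
    return edges


two_hop = h_hop_neighborhood("The_Godfather", TRIPLES, 2)
entities = {o for _hop, _s, _p, o in two_hop}
```

With `h=2` the traversal stops before `Films_about_organized_crime_by_country`, which would only be reached at the third hop in this toy graph.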
14. Challenges
•learn the user model starting from semantic graph-based item representations (h-hop Item Neighborhood Graph)
•exploit the knowledge associated with the items
15. Proposed Approach
•define an appropriate kernel on graph-based item representations
•use kernel methods for learning the user model
Challenges addressed:
•learn the user model starting from semantic graph-based item representations (h-hop Item Neighborhood Graph)
•exploit the knowledge associated with the items
16. Kernel Methods
Work by embedding data in a vector space and looking for linear patterns in that space
$x \mapsto \phi(x)$
[Figure: the map $\phi$ takes each point $x$ in the input space to the point $\phi(x)$ in the feature space]
[Kernel Methods for General Pattern Analysis. Nello Cristianini. http://www.kernel-methods.net/tutorials/KMtalk.pdf]
We can work in the new space F by specifying an inner product function between points in it
$k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$
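To make the inner-product idea concrete, here is a small sketch using a standard textbook kernel, the degree-2 homogeneous polynomial kernel on 2-D points. This kernel is chosen only for illustration and is not the kernel proposed in the talk; the function names are hypothetical. The sketch checks numerically that evaluating the kernel in the input space gives the same value as the inner product of the explicitly mapped points.

```python
import math


def phi(x):
    """Explicit feature map of the degree-2 homogeneous polynomial kernel (2-D input)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def dot(u, v):
    """Plain inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, evaluated in the input space without building phi."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))   # inner product in the feature space
implicit = poly2_kernel(x, y)    # same quantity, computed from the raw inputs
```

Both routes give the same number up to floating-point rounding, which is what lets a kernel method work in the feature space while only ever evaluating `k` on pairs of inputs.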
17. h-hop Item Neighborhood Graph Kernel
Explicit computation of the feature map
Each feature weight expresses the importance of an entity in the item neighborhood graph
18. h-hop Item Neighborhood Graph Kernel
Explicit computation of the feature map
•number of edges involving entity e_m at l hops from item i
•frequency of the entity in the item neighborhood graph
•proportional factor that takes into account the hop at which the entity appears
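The annotations above describe each feature weight in words. One way to write this down is sketched below; the notation is an assumption of this sketch rather than a copy of the slide formulas. Here $c_l(i, e_m)$ stands for the number of edges involving entity $e_m$ at $l$ hops from item $i$, $\alpha_l$ for the hop-dependent proportional factor, and $t$ for the number of entities used as features.

```latex
\phi(i) = \bigl( w_{i,e_1}, \dots, w_{i,e_t} \bigr), \qquad
w_{i,e_m} = \sum_{l=1}^{h} \alpha_l \, c_l(i, e_m), \qquad
k(i, j) = \langle \phi(i), \phi(j) \rangle
```

Under this reading, the kernel between two items is the inner product of their explicitly computed weight vectors, in line with the general definition $k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$.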
20. Weights computation example
[Figure: neighborhood graph of item i with h=2, containing the entities e1, e2, e4 and e5 connected through the properties p2 and p3]
An entity can be informative about the item even if it is not directly related to it
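The sketch below shows one possible weight computation of this kind. Everything specific in it is assumed for illustration: the edge lists, the alpha values, the rule that an edge "involves" both of its endpoints, and the weighting form (a sum over hops of a hop factor times an edge count). It is not a reproduction of the example on the slide.

```python
from collections import Counter

# Hypothetical 2-hop neighborhood graphs, one per item.
# Each edge is (hop, source, property, target); the layout is assumed.
EDGES_I = [
    (1, "i", "p3", "e1"),
    (1, "i", "p2", "e2"),
    (2, "e1", "p3", "e4"),
    (2, "e2", "p3", "e4"),
    (2, "e2", "p3", "e5"),
]
EDGES_J = [
    (1, "j", "p3", "e1"),
    (2, "e1", "p3", "e4"),
]

# Assumed hop factors: edges at the first hop count twice as much as at the second.
ALPHAS = {1: 1.0, 2: 0.5}


def entity_weights(edges, item, alphas):
    """w[e] = sum over hops l of alphas[l] * (number of hop-l edges touching e)."""
    counts = Counter()  # (hop, entity) -> number of edges touching the entity
    for hop, source, _prop, target in edges:
        for node in {source, target}:
            if node != item:  # the item itself is not used as a feature
                counts[(hop, node)] += 1
    weights = Counter()
    for (hop, entity), c in counts.items():
        weights[entity] += alphas[hop] * c
    return dict(weights)


def kernel(w_a, w_b):
    """Inner product of two sparse weight vectors stored as dicts."""
    return sum(w * w_b[e] for e, w in w_a.items() if e in w_b)


w_i = entity_weights(EDGES_I, "i", ALPHAS)
w_j = entity_weights(EDGES_J, "j", ALPHAS)
k_ij = kernel(w_i, w_j)
```

In this toy graph `e4` is never directly linked to `i`, yet it still receives a non-zero weight because two second-hop edges reach it, which mirrors the remark that an entity can be informative without being directly related to the item.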
21. Experimental Settings
Trained an SVM regression model for each user
Accuracy Evaluation: Precision, Recall, MRR (Rated Test Items protocol)
Novelty Evaluation: Entropy-based Novelty (All Items protocol) [the lower the better]
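For reference, the accuracy metrics named above can be computed from a ranked recommendation list and a set of relevant items roughly as follows. This is a minimal sketch using the common textbook definitions; the function names and the toy data are hypothetical, and the candidate list fed to these functions depends on the evaluation protocol, which is not modeled here.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(per_user):
    """Average reciprocal rank over (ranked, relevant) pairs, one pair per user."""
    return sum(reciprocal_rank(r, rel) for r, rel in per_user) / len(per_user)


# Toy data for two hypothetical users.
ranked_u1, relevant_u1 = ["m3", "m1", "m7", "m2"], {"m1", "m2", "m9"}
ranked_u2, relevant_u2 = ["m5", "m6"], {"m5"}

p_at_2 = precision_at_k(ranked_u1, relevant_u1, 2)
r_at_4 = recall_at_k(ranked_u1, relevant_u1, 4)
mrr = mean_reciprocal_rank([(ranked_u1, relevant_u1), (ranked_u2, relevant_u2)])
```

Precision divides by `k` even when fewer than `k` items are ranked, which is one common convention; dividing by the number of items actually returned is another.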
22. Dataset
Subset of Movielens mapped to DBpedia
6,040 users
3,148 movies
Mappings of various recsys datasets to DBpedia:
http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/
Three different train/test splits: 20/80, 40/60, 80/20
23. Kernel calibration: impact of the alpha parameters (i)
[Figure: Prec@10 for the 20/80, 40/60 and 80/20 splits as the alpha setting varies (x-axis ticks 0.25, 1, 2, 5, 10, 20; y-axis from 0.5 to 0.75)]
24. Kernel calibration: impact of the alpha parameters (ii)
[Figure: EBN@10 for the 20/80, 40/60 and 80/20 splits as the alpha setting varies (x-axis ticks 0.25, 1, 2, 5, 10, 20; y-axis from 0 to 1.2)]
25. Comparative approaches
•NB: 1-hop item neighborhood + Naive Bayes classifier
•VSM: 1-hop item neighborhood, Vector Space Model (tf-idf) + SVM regression
•WK: 2-hop item neighborhood, walk-based kernel + SVM regression
26. Comparison with other approaches (i)
[Figure: bar chart of Prec@10 on the 20/80, 40/60 and 80/20 splits for NK-bestPrec, NK-bestEntr, NB, VSM and WK (y-axis from 0 to 0.8)]
27. Comparison with other approaches (ii)
[Figure: bar chart of EBN@10 on the 20/80, 40/60 and 80/20 splits for NK-bestPrec, NK-bestEntr, NB, VSM and WK (y-axis from 0 to 1.8)]
28. Contributions
A linked data RS based on kernel methods
Exploitation of semantic graph-based item descriptions from the Web of Data
Effective Item Neighborhood Graph Kernel
Combination of kernel methods and LOD-based item descriptions for model-based content-based recommendations
Future Work:
Evaluation of further kernel functions on graphs
Evaluation of different kernel methods
29. Q & A