EC-Web 2014 – The 15th International Conference on Electronic Commerce and Web Technologies, September 1-4, 2014, Munich, Germany
A Linked Data Recommender System using a Neighborhood-based Graph Kernel 
Vito Claudio Ostuni, Tommaso Di Noia, Roberto Mirizzi*, Eugenio Di Sciascio 
{vitoclaudio.ostuni, tommaso.dinoia, eugenio.disciascio}@poliba.it, robertom@yahoo-inc.com 
Polytechnic University of Bari - Bari (ITALY) 
(*) Yahoo!, Sunnyvale, CA (US)
Outline 
Introduction and motivation 
The proposed approach 
Experimental Evaluation 
Contributions and Conclusion
Recommender Systems 
Help users deal with information/choice overload
Content-based RSs
A definition
CB-RSs try to recommend items similar* to those a given user has liked in the past
(*) similar from a content-based perspective
[P. Lops, M. de Gemmis, G. Semeraro. Content-based Recommender Systems: State of the Art and Trends. Recommender Systems Handbook.]
Model-based approach:
•Feature vector about item content description
•Learn a predictive user model from past user preferences
[Figure: example movies (Heat, Argo, The Godfather, Righteous Kill) with genre features (drama, action).]
Motivation 
Traditional Content-based Recommender Systems: 
•are based on keyword/attribute-based item representations
•rely on the quality of the content analyzer to extract expressive item features
•lack knowledge about the items
Idea: use Linked Open Data to obtain knowledge about the items and richer item representations
Linked Open Data 
•Initiative for publishing and connecting data on the Web using Semantic Web technologies
•More than 30 billion RDF triples from hundreds of data sources
•"Semantic Web done right" [ http://www.w3.org/2008/Talks/0617-lod-tbl/#(3) ]
Graph-based Item Representation
[Figure: DBpedia graph around the movies The Godfather and American Gangster. The movies are linked through subject edges to the categories Mafia_films, Gangster_films, Films_about_organized_crime_in_the_United_States, Best_Picture_Academy_Award_winners, Best_Thriller_Empire_Award_winners and Films_shot_in_New_York_City. Broader edges connect categories to the more general categories Films_about_organized_crime, Films_about_organized_crime_by_country and Awards_for_best_film.]
Exploit entity descriptions
h-hop Item Neighborhood Graph
[Figure: neighborhood graph of The Godfather, containing the entities Mafia_films, Gangster_films, Films_about_organized_crime, Best_Picture_Academy_Award_winners, Awards_for_best_film and Films_shot_in_New_York_City, connected by three subject edges and three broader edges.]
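A minimal sketch of how an h-hop item neighborhood graph could be collected from a set of triples by breadth-first expansion. The triples reuse the labels from the slide, but which node links to which is an assumption made only for illustration, `Crime_films` is a hypothetical extra node, and `h_hop_neighborhood` is not a name from the paper. Only outgoing edges are followed, which is also an assumption.

```python
from collections import defaultdict

# Hypothetical triples (source, property, target); the edge layout is assumed.
TRIPLES = [
    ("The_Godfather", "subject", "Mafia_films"),
    ("The_Godfather", "subject", "Best_Picture_Academy_Award_winners"),
    ("The_Godfather", "subject", "Films_shot_in_New_York_City"),
    ("Mafia_films", "broader", "Films_about_organized_crime"),
    ("Mafia_films", "broader", "Gangster_films"),
    ("Best_Picture_Academy_Award_winners", "broader", "Awards_for_best_film"),
    # Hypothetical third-hop edge, used to show that h=2 cuts it off.
    ("Films_about_organized_crime", "broader", "Crime_films"),
]


def h_hop_neighborhood(triples, item, h):
    """Return {hop: [edges]} for all edges reachable from `item` within h hops."""
    outgoing = defaultdict(list)
    for source, prop, target in triples:
        outgoing[source].append((prop, target))

    edges_by_hop = {}
    frontier = {item}
    visited = {item}
    for hop in range(1, h + 1):
        hop_edges = []
        next_frontier = set()
        for node in sorted(frontier):
            for prop, target in outgoing[node]:
                hop_edges.append((node, prop, target))
                if target not in visited:
                    next_frontier.add(target)
        visited |= next_frontier
        edges_by_hop[hop] = hop_edges
        frontier = next_frontier
    return edges_by_hop


graph = h_hop_neighborhood(TRIPLES, "The_Godfather", 2)
for hop, edges in graph.items():
    print(hop, edges)
```

With `h=2` the expansion stops before the hypothetical `Crime_films` node, so the result contains only the subject edges at hop 1 and the broader edges at hop 2.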
Challenges 
•learn the user model starting from semantic graph-based item representations (h-hop Item Neighborhood Graph) 
•exploit the knowledge associated with the items
Proposed Approach 
•define an appropriate kernel on graph-based item representations 
•use kernel methods for learning the user model 
Kernel Methods 
Work by embedding data in a vector space and looking for linear patterns in that space:
$x \to \phi(x)$, where $\phi$ maps the input space to the feature space $F$
We can work in the new space $F$ by specifying an inner product function between points in it:
$k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$
[Kernel Methods for General Pattern Analysis. Nello Cristianini. http://www.kernel-methods.net/tutorials/KMtalk.pdf]
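To make the inner-product view concrete, here is a small sketch using a textbook example that is not taken from the slides: the homogeneous degree-2 polynomial kernel on two-dimensional inputs, whose feature map can be written out explicitly. The function names are illustrative.

```python
import math


def phi(x):
    """Explicit feature map of the degree-2 polynomial kernel on R^2."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def poly_kernel(x, z):
    """Same quantity computed in the input space: k(x, z) = (x . z)^2."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # inner product in the feature space
implicit = poly_kernel(x, z)     # kernel evaluated in the input space
print(explicit, implicit)
```

Both routes give the same value up to floating-point rounding. The approach in this talk takes the explicit route: the feature map is computed directly and the kernel is its inner product.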
h-hop Item Neighborhood Graph Kernel
Explicit computation of the feature map: each feature is the importance of an entity e_m in the neighborhood graph of item i
The entity importance combines:
•the frequency of the entity in the item neighborhood graph (# edges involving e_m at l hops from i)
•a proportional factor taking into account at which hop the entity appears
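The slide's formula did not survive extraction. One way to read the annotations above is as a weighted sum over hops, w(i, e_m) = sum over l of alpha_l * c_l(i, e_m), where c_l counts the edges involving e_m at hop l and alpha_l is the per-hop factor. That exact form, the alpha values, the toy edges and the function names below are all assumptions for illustration, not the authors' definition.

```python
from collections import Counter


def entity_weights(item, edges_by_hop, alphas):
    """Assumed weighting: sum over hops of alpha_l * (# edges involving the entity).

    edges_by_hop: {hop: [(source, property, target), ...]}
    alphas:       {hop: per-hop factor}
    """
    weights = Counter()
    for hop, edges in edges_by_hop.items():
        for source, _prop, target in edges:
            # An edge is taken to "involve" both endpoints; the item itself
            # is not treated as a feature.
            for entity in {source, target} - {item}:
                weights[entity] += alphas[hop]
    return dict(weights)


def kernel(weights_i, weights_j):
    """Inner product of two sparse entity-weight vectors."""
    shared = weights_i.keys() & weights_j.keys()
    return sum(weights_i[e] * weights_j[e] for e in shared)


# Hypothetical 2-hop neighborhoods; the edge layout is assumed.
edges_i = {
    1: [("i", "p3", "e1"), ("i", "p2", "e2")],
    2: [("e1", "p3", "e4"), ("e2", "p3", "e4"), ("e2", "p3", "e5")],
}
edges_j = {
    1: [("j", "p2", "e2")],
    2: [("e2", "p3", "e4")],
}
alphas = {1: 1.0, 2: 0.5}

w_i = entity_weights("i", edges_i, alphas)
w_j = entity_weights("j", edges_j, alphas)
print(w_i, w_j, kernel(w_i, w_j))
```

In this toy setting `e4` is never directly linked to the item, yet it receives a non-zero weight from the two hop-2 edges that reach it.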
Weights computation example
[Figure: neighborhood graph of item i with h=2, containing the entities e1, e2, e4, e5 and edges labeled p2 and p3.]
An entity can be informative about the item even if it is not directly related to it
Experimental Settings 
Trained an SVM regression model for each user
Accuracy Evaluation: Precision, Recall, MRR (Rated Test Items protocol)
Novelty Evaluation: Entropy-based Novelty (All Items protocol) [the lower the better]
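The accuracy metrics named above can be sketched as follows, assuming their standard definitions over a ranked recommendation list and a set of relevant test items. How relevance is decided (for example a rating threshold) is not stated on the slide, so the item ids and relevant sets below are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    return sum(1 for item in ranked[:k] if item in relevant) / len(relevant)


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none is found."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(per_user):
    """Average reciprocal rank over (ranked, relevant) pairs, one per user."""
    return sum(reciprocal_rank(r, rel) for r, rel in per_user) / len(per_user)


# Hypothetical rankings for two users.
user_a = (["m3", "m1", "m7", "m2"], {"m1", "m2", "m9"})
user_b = (["m5", "m6"], {"m5"})

p = precision_at_k(*user_a, k=2)
r = recall_at_k(*user_a, k=4)
mrr = mean_reciprocal_rank([user_a, user_b])
print(p, r, mrr)
```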
Dataset 
Subset of MovieLens mapped to DBpedia
6,040 users 
3,148 movies 
Mappings of various recsys datasets to DBpedia 
http://sisinflab.poliba.it/semanticweb/lod/recsys/datasets/ 
Three different train/test splits: 20/80, 40/60, 80/20
Kernel calibration – impact of alpha params. (i)
[Figure: Prec@10 for the 20/80, 40/60 and 80/20 splits; axis ticks 0.5 to 0.75 and 0.25, 1, 2, 5, 10, 20.]
Kernel calibration – impact of alpha params. (ii)
[Figure: EBN@10 for the 20/80, 40/60 and 80/20 splits; axis ticks 0 to 1.2 and 0.25, 1, 2, 5, 10, 20.]
Comparative approaches 
•NB: 1-hop item neighborhood + Naive Bayes classifier
•VSM: 1-hop item neighborhood, Vector Space Model (tf-idf) + SVM regression
•WK: 2-hop item neighborhood, walk-based kernel + SVM regression
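As a rough illustration of the VSM baseline, the sketch below builds tf-idf vectors over the 1-hop entities of each item, using the common tf * log(N / df) weighting. The slide does not say which tf-idf variant was used, so the weighting, the item-to-entity mapping and the function name are assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(item_entities):
    """Map each item to a sparse {entity: tf * log(N / df)} vector."""
    n_items = len(item_entities)
    document_frequency = Counter()
    for entities in item_entities.values():
        document_frequency.update(set(entities))

    vectors = {}
    for item, entities in item_entities.items():
        term_frequency = Counter(entities)
        vectors[item] = {
            entity: count * math.log(n_items / document_frequency[entity])
            for entity, count in term_frequency.items()
        }
    return vectors


# Hypothetical 1-hop neighborhoods (item -> directly linked entities).
ITEM_ENTITIES = {
    "The_Godfather": ["Mafia_films", "Films_shot_in_New_York_City"],
    "American_Gangster": ["Mafia_films", "Best_Thriller_Empire_Award_winners"],
    "Heat": ["Mafia_films"],
}

vectors = tfidf_vectors(ITEM_ENTITIES)
print(vectors["The_Godfather"])
```

An entity shared by every item, such as `Mafia_films` in this toy mapping, gets weight zero, while entities specific to one item get the largest weight.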
Comparison with other approaches (i)
[Figure: Prec@10 on the 20/80, 40/60 and 80/20 splits for NK-bestPrec, NK-bestEntr, NB, VSM and WK; axis from 0 to 0.8.]
Comparison with other approaches (ii)
[Figure: EBN@10 on the 20/80, 40/60 and 80/20 splits for NK-bestPrec, NK-bestEntr, NB, VSM and WK; axis from 0 to 1.8.]
Contributions 
A linked data RS based on kernel methods 
Exploitation of semantic graph-based item descriptions from the Web of Data 
Effective Item Neighborhood Graph Kernel 
Combination of kernel methods and LOD-based item descriptions for model-based content-based recommendations
Future Work: 
Evaluation of further kernel functions on graphs 
Evaluation of different kernel methods
Q & A
