Movie Recommendation with DBpedia - IIR 2012

Movie Recommendation with DBpedia
Roberto Mirizzi, Tommaso Di Noia, Azzurra Ragone, Vito Claudio Ostuni, Eugenio Di Sciascio

3rd Italian Information Retrieval Workshop (IIR 2012) - Bari

January 26, 2012

In this paper we present MORE (acronym of MORE than MOvie REcommendation), a Facebook application that semantically recommends movies to the user leveraging the knowledge within Linked Data and the information elicited from her profile. MORE exploits the power of social knowledge bases (e.g. DBpedia) to detect semantic similarities among movies. These similarities are computed by a Semantic version of the classical Vector Space Model (sVSM), applied to semantic datasets. Precision and recall experiments prove the validity of our approach for movie recommendation. MORE is freely available as a Facebook application.

Movie Recommendation with DBpedia - IIR 2012

  1. MOVIE RECOMMENDATION WITH DBPEDIA. Roberto Mirizzi, Tommaso Di Noia, Azzurra Ragone, Vito Claudio Ostuni, Eugenio Di Sciascio (mirizzi@deemail.poliba.it, t.dinoia@poliba.it, azzurra.ragone@exprivia.it, ostuni@deemail.poliba.it, disciascio@poliba.it). Politecnico di Bari, Via Orabona, 4, 70125 Bari (ITALY). 3rd Italian Information Retrieval Workshop (IIR 2012) – Bari, January 26, 2012
  2. 2. Outline DBpedia: a nucleus for a Web of Open Data  Social knowledge bases for similarity detection Semantic Vector Space Model  Vector Space Model adapted to RDF graphs MORE: More than Movie Recommendation  Content-based recommendation in action Evaluation  Precision and Recall experiments with MovieLens Conclusion 3rd Italian Information Retrieval Workshop (IIR 2012) – Bari January 26, 2012
  3. 3. What is Linked Data? Linked Data is about using the Web to connect related data that wasnt previously linked, or using the Web to lower the barriers to linking data currently linked using other methods. More specifically, Wikipedia defines Linked Data as “a term used to describe a recommended best practice for exposing, sharing, and connecting pieces of data, information, and knowledge on the Semantic Web using URIs and RDF.” [www.linkeddata.org]3rd Italian Information Retrieval Workshop (IIR 2012) – Bari January 26, 2012
  4. 4. DBpedia: a Nucleus for a Web of Data (i) 3rd Italian Information Retrieval Workshop (IIR 2012) – Bari January 26, 2012
  5. 5. DBpedia: a Nucleus for a Web of Data (ii) The DBpedia knowledge base currently describes more than 3.64 million things, highly interconnected in the RDF graph. Let’s use all this knowledge to build smarter content-based recommender systems 3rd Italian Information Retrieval Workshop (IIR 2012) – Bari January 26, 2012
  6. Social KBs for similarity detection. [Figure: graph of Ocean’s Eleven and Ocean’s Twelve and the resources linked to them: George Clooney, Brad Pitt, Catherine Zeta-Jones, Steven Soderbergh, Crime, 2000s crime films, American criminal comedy films, Crime films.]
  7. Semantic Vector Space Model (i). Quick recap on the Vector Space Model: it is an algebraic model for representing both text documents and queries as vectors of index-term weights $w_{t,d}$ that are positive and non-binary. [http://en.wikipedia.org/wiki/File:Vector_space_model.jpg]
     $$\vec{v}_d = [w_{1,d}, w_{2,d}, \ldots, w_{N,d}]^T, \qquad w_{t,d} = tf_{t,d} \cdot idf_t$$
     $$tf_{t,d} = \frac{n_{t,d}}{\sum_k n_{k,d}}, \qquad idf_t = \log \frac{|D|}{|\{d \in D : t \in d\}|}$$
     $$sim(d_j, q) = \frac{\vec{d}_j \cdot \vec{q}}{\|\vec{d}_j\| \, \|\vec{q}\|} = \frac{\sum_{i=1}^{N} w_{i,j} \, w_{i,q}}{\sqrt{\sum_{i=1}^{N} w_{i,j}^2} \, \sqrt{\sum_{i=1}^{N} w_{i,q}^2}}$$
  8. Semantic Vector Space Model (ii). Vector Space Model applied to RDF graphs. Each resource (movie) is expressed as a tensor in a multi-dimensional space where each dimension corresponds to a specific property of the considered datasets (e.g., starring, director, subject/broader, genre, …). [Figure: the RDF graph around Ocean’s Eleven and Ocean’s Twelve (George Clooney, Brad Pitt, Catherine Zeta-Jones, Steven Soderbergh, Crime, Crime films, 2000s crime films, American criminal…), sliced by property: starring, director, genre, subject/broader.]
  9. Semantic Vector Space Model (iii). The STARRING slice is a matrix whose rows are the movies Ocean’s Eleven [o11] (13 actors) and Ocean’s Twelve [o12] (15 actors), and whose columns are the actors George Clooney [gc] (38 movies), Catherine Zeta-Jones [czj] (22 movies), and Brad Pitt [bp] (35 movies).
     $$w_{actor_x, movie_y} = tf_{actor_x, movie_y} \cdot idf_{actor_x}$$
     $$sim_{starring}(o_{12}, o_{11}) = \frac{w_{gc,o_{12}} \cdot w_{gc,o_{11}} + w_{czj,o_{12}} \cdot w_{czj,o_{11}} + w_{bp,o_{12}} \cdot w_{bp,o_{11}}}{\sqrt{w_{gc,o_{12}}^2 + w_{czj,o_{12}}^2 + w_{bp,o_{12}}^2} \cdot \sqrt{w_{gc,o_{11}}^2 + w_{bp,o_{11}}^2}}$$
  10. Semantic Vector Space Model (iv).
      $$w_{gc,o_{12}} = tf_{gc,o_{12}} \cdot idf_{gc} = \frac{1}{15} \cdot \log \frac{49184}{38} = 0.207$$
      $$w_{gc,o_{11}} = tf_{gc,o_{11}} \cdot idf_{gc} = \frac{1}{13} \cdot \log \frac{49184}{38} = 0.239$$
      $$w_{czj,o_{12}} = tf_{czj,o_{12}} \cdot idf_{czj} = \frac{1}{15} \cdot \log \frac{49184}{22} = 0.223$$
      $$w_{czj,o_{11}} = tf_{czj,o_{11}} \cdot idf_{czj} = 0 \cdot \log \frac{49184}{22} = 0$$
      $$w_{bp,o_{12}} = tf_{bp,o_{12}} \cdot idf_{bp} = \frac{1}{15} \cdot \log \frac{49184}{35} = 0.210$$
      $$w_{bp,o_{11}} = tf_{bp,o_{11}} \cdot idf_{bp} = \frac{1}{13} \cdot \log \frac{49184}{35} = 0.242$$
      The per-property similarities are then combined into the overall similarity:
      $$\alpha_{starring} \cdot sim_{starring}(o_{12}, o_{11}) + \alpha_{genre} \cdot sim_{genre}(o_{12}, o_{11}) + \alpha_{subject} \cdot sim_{subject}(o_{12}, o_{11}) + \ldots = sim(o_{12}, o_{11})$$
  11. MORE: More than Movie Recommendation. MORE is a Facebook application that semantically recommends movies to the user leveraging the knowledge within DBpedia. MORE supports the user in exploratory browsing tasks by guiding their search through a semantic knowledge space. Similarities between movies are computed by a Semantic version of the classical Vector Space Model (sVSM), applied to semantic datasets. http://apps.facebook.com/movie-recommendation/
  12. Semantic Content-based Recommender. Given a user profile, defined as $profile(u) = \{ m_j \mid u \text{ likes } m_j \}$, we compute a similarity between $m_i$ and the information encoded in $profile(u)$:
      $$r(u, m_i) = \frac{\sum_{m_j \in profile(u)} \frac{1}{P} \sum_{p} \alpha_p \cdot sim_p(m_j, m_i)}{|profile(u)|}$$
      If this similarity is greater than or equal to 0.5, we suggest the movie $m_i$ to the user $u$.
  13. Training the system. In order to identify the best possible values for the coefficients $\alpha_p$ (i.e., the weights associated with each property), we train the system via a genetic algorithm, adopting an N-fold cross-validation approach (with N = 5) on the 100k MovieLens dataset. At the end we obtain a set $A_p = \{\alpha_p^1, \ldots, \alpha_p^5\}$ of 5 different values for each $\alpha_p$. Then, we evaluate the performance with standard precision and recall tests, when $\alpha_p$ is one of the following: $\min(A_p)$, $\max(A_p)$, $avg(A_p)$, $median(A_p)$, $lowestError(A_p)$.
  14. Evaluation: Precision & Recall.
      $$P@N = \frac{|Rec@N \cap TestSet|}{N}, \qquad R@N = \frac{|Rec@N \cap TestSet|}{|TestSet|}, \qquad N = 3, 4, 5, 6, 7$$
      The figure shows high values of Precision and Recall. The best values are obtained by choosing, for the coefficients $\alpha_p$, the values in $A_p$ with the lowest misclassification error. We also evaluated the importance of the subject/broader property. The information carried by this property is peculiar to ontological datasets. As shown in the figure, performance decreases drastically if we do not consider this property.
  15. Conclusion & Future directions. The huge amount of data available on DBpedia can be successfully exploited to build content-based recommender systems. We have presented MORE, a Facebook application that leverages the knowledge within DBpedia to produce movie recommendations by means of a semantic version of the classical vector space model (sVSM). Evaluation against historical datasets and high values of precision and recall prove the validity of our approach. We are currently working on: testing the approach with different domains; improving the recommendation with a hybrid approach (content-based and collaborative filtering). We acknowledge partial support of HP IRP 2011, Grant CW267313.
  16. Q? A!
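
The tf-idf weights on the "Semantic Vector Space Model (iv)" slide and the starring similarity on the "(iii)" slide can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the logarithm is taken to be base 10 (the choice that matches the slide's figures), 49184 is taken to be the total number of movies in the dataset, and the cast is restricted to the three actors shown on the slides. All names (`weight`, `cosine`, `sim_starring`, the dictionaries) are hypothetical.

```python
import math

# Figures read off the slides; the interpretation of 49184 as the total
# number of movies is an assumption.
TOTAL_MOVIES = 49184
MOVIES_PER_ACTOR = {"gc": 38, "czj": 22, "bp": 35}
ACTORS_PER_MOVIE = {"o11": 13, "o12": 15}
# Cast restricted to the three actors that appear on the slides.
CAST = {"o11": {"gc", "bp"}, "o12": {"gc", "czj", "bp"}}


def weight(actor, movie):
    """tf-idf weight of an actor for a movie (base-10 log, an assumption)."""
    if actor not in CAST[movie]:
        return 0.0  # tf = 0 when the actor does not star in the movie
    tf = 1.0 / ACTORS_PER_MOVIE[movie]
    idf = math.log10(TOTAL_MOVIES / MOVIES_PER_ACTOR[actor])
    return tf * idf


def cosine(u, v):
    """Cosine similarity of two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def sim_starring(movie_a, movie_b):
    """Similarity of two movies along the starring property only."""
    actors = sorted(MOVIES_PER_ACTOR)
    vec_a = [weight(actor, movie_a) for actor in actors]
    vec_b = [weight(actor, movie_b) for actor in actors]
    return cosine(vec_a, vec_b)


if __name__ == "__main__":
    for movie in ("o12", "o11"):
        for actor in ("gc", "czj", "bp"):
            print(f"w[{actor},{movie}] = {weight(actor, movie):.3f}")
    print(f"sim_starring(o12, o11) = {sim_starring('o12', 'o11'):.3f}")
```

Because only three of the 13 and 15 cast members are included, the similarity printed here illustrates the computation rather than the value the full system would produce.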

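
The recommendation rule on the "Semantic Content-based Recommender" slide and the P@N and R@N measures on the evaluation slide can be sketched as follows. This is an illustration under assumptions rather than the code behind MORE: P is taken to be the number of properties, the per-property similarities are supplied as precomputed lookup tables, and every movie identifier, similarity value, and coefficient in the usage section is made up.

```python
def recommendation_score(profile, candidate, sims, alphas):
    """r(u, m_i): average over liked movies of (1/P) * sum_p alpha_p * sim_p.

    profile   -- list of movies the user likes
    candidate -- movie m_i being scored
    sims      -- sims[p][(m_j, m_i)] is the similarity along property p
    alphas    -- alphas[p] is the weight of property p
    """
    if not profile:
        return 0.0
    num_properties = len(alphas)  # P, assumed to be the number of properties
    total = 0.0
    for liked in profile:
        weighted = sum(alphas[p] * sims[p][(liked, candidate)] for p in alphas)
        total += weighted / num_properties
    return total / len(profile)


def precision_recall_at_n(recommended, test_set, n):
    """P@N and R@N for a ranked recommendation list against a test set."""
    hits = len(set(recommended[:n]) & set(test_set))
    precision = hits / n if n > 0 else 0.0
    recall = hits / len(test_set) if test_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    alphas = {"starring": 1.0, "genre": 1.0}
    sims = {
        "starring": {("a", "c"): 0.9, ("b", "c"): 0.5},
        "genre": {("a", "c"): 0.7, ("b", "c"): 0.3},
    }
    score = recommendation_score(["a", "b"], "c", sims, alphas)
    print("recommend c:", score >= 0.5)  # threshold taken from the slide

    ranked = ["m1", "m2", "m3", "m4", "m5"]
    liked_in_test = {"m2", "m5", "m9"}
    for n in (3, 5):
        print(n, precision_recall_at_n(ranked, liked_in_test, n))
```

Keeping the similarities in lookup tables separates the scoring rule from how each per-property similarity is computed, so the same function works whichever properties are enabled.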