Enhanced Semantic TV-Shows Representation for Personalized Electronic Program Guides

Enhanced Semantic TV-Shows Representation for Personalized Electronic Program Guides - UMAP 2012 Conference - User Modeling Personalization and Adaptation - Montreal (Canada) - Joint Work University of Bari and Philips Research, presented at the Industrial Track of the Conference


Transcript

  • 1. UMAP 2012 - Industrial Track, Montréal (Canada), 19.07.2012. Enhanced Semantic TV-Show Representation for Personalized Electronic Program Guides. Cataldo Musto, Fedelucio Narducci, Pasquale Lops, Giovanni Semeraro, Marco de Gemmis (University of Bari, Aldo Moro); Mauro Barbieri, Jan Korst, Verus Pronk and Ramon Clout (Philips Research, Eindhoven, The Netherlands)
  • 2. exponential growth of available TV assets (slide footer: C. Musto, F. Narducci, G. Semeraro, P. Lops, M. de Gemmis, M. Barbieri, J. Korst, V. Pronk, R. Clout. Enhanced Semantic TV-Shows Representation for Personalized Electronic Program Guides - UMAP 2012 - 19.07.12)
  • 3. Some stats: 4 hours watched every day out of 3000 hours of broadcast TV shows (ratio 0.013%). Source: Nielsen Survey, 2011
  • 4. Information Overload
  • 5. what TV shows should I watch?
  • 6. industrial scenario: how does Philips cope with the overload of TV shows?
  • 7. solution: personalization.
  • 8. recommender systems
  • 9. content-based recommenders: key concepts
      • Each item (TV show) has to be described through a set of features
          • Description of TV shows, plot of the movie and so on.
      • Each user is described through the features that occur in TV shows she watched (liked) in the past
      • Recommendations are provided by calculating the overlap between the textual description of the TV show and the features stored in the user profile
  • 10. content-based recommenders: example (TV shows recommendations). user profile: ♥ basketball, ♥ football; recommendations: nba (basketball), documentary
  • 11. content-based recommenders: example (TV shows recommendations). user profile: ♥ basketball, ♥ football; recommendations: nba (basketball), documentary (an X mark is added on this slide)
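The overlap idea from slide 9 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: features are treated as plain keyword sets, Jaccard overlap stands in for whatever matching score the real system uses, and the function names, profile and show titles are hypothetical.

```python
def overlap_score(profile_features, show_features):
    """Jaccard overlap between two feature sets (0.0 when both are empty)."""
    profile, show = set(profile_features), set(show_features)
    union = profile | show
    return len(profile & show) / len(union) if union else 0.0


def recommend(profile_features, shows, top_n=3):
    """Rank shows by overlap with the profile and drop shows with no overlap."""
    scored = [(overlap_score(profile_features, feats), title)
              for title, feats in shows.items()]
    # Highest score first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for score, title in scored[:top_n] if score > 0]


# Hypothetical profile built from shows the user liked in the past.
profile = {"basketball", "football", "sports"}
shows = {
    "NBA Tonight": {"nba", "basketball", "sports"},
    "Deep Sea Documentary": {"documentary", "ocean", "nature"},
}
print(recommend(profile, shows))
```

With these toy sets only the basketball show shares features with the profile, so the documentary is not recommended.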
  • 13. personal channels: concept (‘in vitro’ experiments). Idea: combining boolean filters to filter TV shows and recommenders to rank them.
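A minimal Python sketch of the "filter, then rank" idea behind personal channels. It assumes each show is a dictionary and that a relevance score is already available; the field names, the filter and the scores are hypothetical and do not describe the actual Watchmi or Aprico.tv implementation.

```python
def personal_channel(shows, boolean_filter, score):
    """Keep the shows that pass the boolean filter, then rank them by score."""
    candidates = [show for show in shows if boolean_filter(show)]
    return sorted(candidates, key=score, reverse=True)


# Hypothetical catalogue; "relevance" stands in for a recommender's output.
shows = [
    {"title": "MotoGP Highlights", "type": "sports", "relevance": 0.9},
    {"title": "Evening News", "type": "news", "relevance": 0.7},
    {"title": "Local Football", "type": "sports", "relevance": 0.4},
]

channel = personal_channel(
    shows,
    boolean_filter=lambda show: show["type"] == "sports",
    score=lambda show: show["relevance"],
)
print([show["title"] for show in channel])
```

The filter decides which shows are eligible for the channel at all; the recommender only orders the shows that remain.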
  • 14. Watchmi plug-in developed by Aprico.tv (‘in vitro’ experiments)
  • 15. problem: descriptions of TV shows are often too short or not meaningful enough to feed a content-based recommendation algorithm
  • 16. solution: feature generation techniques based on open knowledge sources
  • 18. explicit semantic analysis
      • Explicit Semantic Analysis (ESA) (Gabrilovich and Markovitch, 2006)
      • Goals
          • To introduce a methodology for representing the knowledge stored in Wikipedia
          • To define a relationship between terms in natural language and Wikipedia articles
      • Insights
          • ESA provides a vector-space representation for each term
          • Terms are represented as rows in a matrix (called ESA matrix) where each column is a Wikipedia concept (article)
  • 19. ESA representation: term/document matrix with terms t1 to t4 as rows and articles a1 to a9 as columns; a ✔ marks a term/article pair (t1: 4 marks, t2: 4, t3: 3, t4: 4)
  • 20. ESA representation: term/Wikipedia articles matrix; the same matrix, with column a3 labelled MotoGP
  • 21. ESA representation. Every Wikipedia article is a concept. Each concept is represented through the TF-IDF scores of the terms that occur in the article. Example, concept MotoGP: superbike (0.92), grand prix (0.76), valentino rossi (0.59)
  • 22. ESA representation: term/Wikipedia articles matrix. Columns: Politics, MotoGP, Basketball, M. Biaggi, V. Rossi. Rows: Superbike (3 ✔), t2 (1 ✔), t3 (1 ✔), t4 (3 ✔)
  • 23. ESA representation. Each term can be defined upon the Wikipedia concepts it occurs in. Example, term Superbike: MotoGP (0.92), Max Biaggi (0.63), Bridgestone (0.43). “The semantics of a term is the vector of its associations with Wikipedia articles”; the whole vector is called Semantic Interpretation Vector
  • 24. ESA representation: semantics of text fragments, calculated as the centroid vector of the semantic interpretation vectors that compose the fragment
      • mouse: Mouse (rodent) [0.91], Mouse (computing) [0.89], Mickey Mouse [0.81], John Steinbeck [0.17]
      • button: Button [0.93], Dick Button [0.84], Mouse (computing) [0.81], Game Controller [0.32]
      • mouse button: Mouse (computing) [0.85], Mouse (rodent) [0.46], IBM PS/2 [0.35], Drag-and-drop [0.32]
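The ESA slides above can be made concrete with a small Python sketch. It assumes the ESA matrix is stored as a dictionary from term to its semantic interpretation vector (concept to weight); the weights are a subset of the "mouse" and "button" values shown on slide 24, while the data layout and the function names are illustrative assumptions.

```python
from collections import defaultdict

# One row of the ESA matrix per term: {Wikipedia concept: weight}.
# Weights are a subset of the slide-24 example; the layout is an assumption.
ESA_MATRIX = {
    "mouse": {"Mouse (rodent)": 0.91, "Mouse (computing)": 0.89, "Mickey Mouse": 0.81},
    "button": {"Button": 0.93, "Dick Button": 0.84, "Mouse (computing)": 0.81},
}


def interpretation_vector(term):
    """Semantic interpretation vector of a term (empty if the term is unknown)."""
    return ESA_MATRIX.get(term.lower(), {})


def fragment_vector(text):
    """Centroid of the interpretation vectors of the known terms in `text`."""
    vectors = [interpretation_vector(token) for token in text.split()]
    vectors = [vec for vec in vectors if vec]
    if not vectors:
        return {}
    totals = defaultdict(float)
    for vec in vectors:
        for concept, weight in vec.items():
            totals[concept] += weight
    return {concept: weight / len(vectors) for concept, weight in totals.items()}


centroid = fragment_vector("mouse button")
top_concept = max(centroid, key=centroid.get)
print(top_concept, round(centroid[top_concept], 2))
```

Averaging the two rows pushes "Mouse (computing)" to the top because it is the only concept both terms share, which is how the centroid resolves the ambiguity of "mouse".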
  • 25. ESA has already been adopted for text classification, information retrieval and semantic relatedness computation. Research question: how can we exploit ESA for performing feature generation in the scenario of EPG personalization?
  • 26. From BOW to eBOW. Given a description of a TV show, we exploit ESA to obtain an enhanced representation. The original set of features is enriched with the set of Wikipedia articles most related to the TV show
  • 27. From BOW to eBOW: algorithm
      • The centroid vector of the whole description of the TV show is calculated (illustration: BOW centroid vector with Concept 1 [0.85], Concept 47 [0.46], Concept 50 [0.35], ..., Concept n [0.32])
      • The n most related Wikipedia concepts are extracted
      • Concepts are added to the original BOW to obtain an enhanced BOW (e-BOW)
  • 28. From BOW to eBOW: example. TV show: "Rad an Rad - Die besten Duelle der MotoGP" (Wheel to wheel - The best duels in the MotoGP). Wikipedia articles: Großer Preis von Italien (Motorrad), Großer Preis von Malaysia (Motorrad), Großer Preis von Tschechien (Motorrad), Scuderia Ferrari, Valentino Rossi, Motorrad-WM-Saison 2005, Motorrad-WM-Saison 2006, Max Biaggi, Großer Preis der USA (Motorrad), Motorrad-WM-Saison 2008, Rad (Heraldik), Loris Capirossi, Shin’ya Nakano, MotoGP
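The three steps of the eBOW algorithm can be sketched as follows. The sketch assumes the centroid of the show description has already been computed as a concept-to-weight dictionary; the function name, the sample tokens and the weights are hypothetical and only loosely modelled on the MotoGP example.

```python
def enhance_bow(bow, centroid, n):
    """Return the original BOW plus the n highest-weighted centroid concepts."""
    # Sort concepts by descending weight; ties are broken alphabetically.
    ranked = sorted(centroid.items(), key=lambda item: (-item[1], item[0]))
    top_concepts = [concept for concept, _ in ranked[:n]]
    return list(bow) + [c for c in top_concepts if c not in bow]


# Hypothetical tokens from a show description and a hypothetical centroid.
bow = ["rad", "duelle", "motogp"]
centroid = {
    "Valentino Rossi": 0.85,
    "Max Biaggi": 0.46,
    "Scuderia Ferrari": 0.35,
    "Rad (Heraldik)": 0.32,
}

ebow = enhance_bow(bow, centroid, n=2)
print(ebow)
```

The parameter n corresponds to the number of added concepts compared in the experiments (+20, +40, +60); a small n is used here only to keep the output readable.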
  • 29. what about the advantages?
  • 30. example: BOW representation. user profile / tv show: motogp, sports, 2012 Superbike, motorbike, Italian Grand Prix, ..., competition
  • 31. example: BOW representation. user profile / tv show: motogp, sports, 2012 Superbike, motorbike, Italian Grand Prix, ..., competition. X No matching!
  • 32. example: eBOW representation. user profile / tv show: motogp, superbike, 2012 Superbike, sports, motorbike, Italian Grand Prix, formula 1, ..., competition
  • 33. example: eBOW representation. user profile / tv show: motogp, superbike, 2012 Superbike, sports, motorbike, Italian Grand Prix, formula 1, ..., competition. ✔ Matching!
  • 34. ESA advantages: knowledge is fluid.
  • 35. ESA advantages: knowledge is fluid. It is necessary to exploit open and always updated knowledge sources
  • 36. example, concept ‘American Politics’. Enrichment by year: 2000: Clinton; 2005: Bush; 2011: Obama
  • 37. (counter)example, concept ‘Italian Politics’. Enrichment by year: 2000: Berlusconi; 2005: Berlusconi; 2011: Berlusconi
  • 38. experiments.
  • 39. design of the experiments: task
      • retrieval task
      • Given a set of program types and a repository of TV shows
      • We want to retrieve the shows that belong to a specific program type (e.g. Movie)
  • 40. dataset: Aprico.tv data
      • Dataset
          • 47 German-language channels provided by Axel Springer
          • 133k TV shows, 17 program types
          • Textual features: title, synopsis, description, program type
      • Explicit Semantic Analysis
          • Dump: October, 2010
          • 814,013 terms (rows) and 484,218 articles (columns)
  • 41. design of the experiments: learning methods
      • Two state-of-the-art learning methods have been compared
      • Random Indexing
          • Vector Space Model (VSM)-based representation
          • Incremental approach to compress the representation in an effective way
          • Both TV shows and user profile are points in a vector space
      • Logistic Regression
          • Supervised learning method, state of the art for text classification
          • Each TV show is classified as relevant or not relevant for the user, according to the user profile
          • TV shows can be ranked according to their probability scores
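To give a feel for the Random Indexing bullets, here is a small document-based sketch in pure Python. It is not the configuration used in the experiments: the dimensionality, the number of non-zero entries, the seeding scheme and the toy corpus are all assumptions made for illustration. Each document receives a sparse random index vector, each term accumulates the index vectors of the documents it occurs in, and shows and profiles are compared by cosine similarity in the reduced space.

```python
import math
import random

DIM = 50  # assumed size of the reduced vector space


def index_vector(doc_id, dim=DIM, nonzero=4):
    """Sparse ternary random vector for a document, deterministic per doc_id."""
    rng = random.Random(f"ri:{doc_id}")
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def train_term_vectors(corpus, dim=DIM):
    """Incrementally add each document's index vector to the terms it contains."""
    term_vecs = {}
    for doc_id, terms in corpus.items():
        idx = index_vector(doc_id, dim)
        for term in set(terms):
            vec = term_vecs.setdefault(term, [0] * dim)
            for i, x in enumerate(idx):
                vec[i] += x
    return term_vecs


def text_vector(terms, term_vecs, dim=DIM):
    """Represent a show description or a user profile as a sum of term vectors."""
    vec = [0] * dim
    for term in terms:
        for i, x in enumerate(term_vecs.get(term, ())):
            vec[i] += x
    return vec


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical two-show corpus and a profile made of terms the user liked.
corpus = {
    "show-1": ["motogp", "rossi", "race"],
    "show-2": ["basketball", "nba", "playoffs"],
}
term_vecs = train_term_vectors(corpus)
profile = text_vector(["motogp", "race"], term_vecs)
for show_id, terms in corpus.items():
    print(show_id, round(cosine(profile, text_vector(terms, term_vecs)), 3))
```

Because the vectors are built by simple addition, new shows can be folded in without recomputing the whole space, which is the incremental property mentioned on the slide.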
  • 42. design of the experiments: research questions
      1. Which learning method can provide the best recommendations?
      2. Does enriching the BOWs with ESA improve the accuracy of the suggestions?
  • 43. experiment 1: results. [Chart: precision (y-axis from 50 to 100) of Logistic Regression vs. Random Indexing at P@5%, P@10%, P@25%, P@50%, P@75% and P@100%]
  • 44. experiment 2: results. [Chart: precision (y-axis from 74 to 95) of BOW, eBOW (+20), eBOW (+40) and eBOW (+60) at P@5%, P@10%, P@25%, P@50%, P@75% and P@100%]
  • 45. experiment 2: results. [Same chart as slide 44.] Differences between BOW and eBOW (+40, +60) are statistically significant (Mann-Whitney Test, p < 0.005)
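The charts report precision at several cut-offs written as P@5% through P@100%. The slides do not define the metric, so the sketch below rests on an assumption: P@k% is read as the precision over the top k percent of the ranked list. The function name and the toy ranking are hypothetical.

```python
import math


def precision_at_percent(ranked_labels, percent):
    """Precision over the top `percent`% of a ranked list of 0/1 relevance labels."""
    if not ranked_labels:
        return 0.0
    # Always evaluate at least one item, rounding the cut-off up.
    k = max(1, math.ceil(len(ranked_labels) * percent / 100))
    return sum(ranked_labels[:k]) / k


# Hypothetical ranking of 10 shows: 1 = belongs to the target program type.
ranked = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
for pct in (10, 50, 100):
    print(f"P@{pct}% = {precision_at_percent(ranked, pct):.2f}")
```

Under this reading, P@100% is simply the share of relevant shows in the whole list, which is why precision typically falls as the cut-off grows.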
  • 46. Recap
      • Content-based personalization techniques for Electronic Program Guides
          • Joint work: Philips Research - Aprico.tv - University of Bari
      • Feature generation to enrich textual descriptions of TV shows
          • Exploitation of ESA: Explicit Semantic Analysis
      • Introducing eBOW for content representation
          • BOW + Wikipedia concepts related to the textual description
  • 47. Conclusions
      • Logistic Regression can provide good accuracy in retrieving related TV shows
          • Almost 90% precision.
      • Feature generation techniques based on Wikipedia can improve the precision of a content-based recommendation approach
          • eBOW representation outperforms the classical BOW representation
          • Good results: 94% precision
  • 48. questions?