Distributional Models vs. Linked Data: leveraging crowdsourcing to personalize music playlists

Italian Information Retrieval 2013 - Workshop (http://iir2013.isti.cnr.it) - Distributional Models vs. Linked Data: leveraging crowdsourcing to personalize music playlists

  1. IIR 2013 - 4th Italian Information Retrieval Workshop, Pisa (Italy), 17.01.2013. Cataldo Musto, Fedelucio Narducci, Giovanni Semeraro, Pasquale Lops, Marco de Gemmis. Distributional models vs. Linked Data: exploiting crowdsourcing to personalize music playlists
  2. exponential growth of the available music. C. Musto, F. Narducci, G. Semeraro, P. Lops, M. de Gemmis. Distributional models vs. Linked Data: exploiting crowdsourcing to personalize music playlists - IIR 2013 - 17.01.13
  3. Some stats: 28,000,000 songs available on iTunes Store (*); around 31,000 hours of music; a typical user spends 1.5 hours per day listening to music = 56 years to listen to the whole iTunes Library. (*) http://www.digitalmusicnews.com/permalink/2012/120425itunes
  4. Information Overload
  5. what music should I listen to?
  6. solution: personalization.
  7. solution: personalized music playlists
  8. Is this something new? No.
  9. Amazon.com Recommendations
  10. Genius @iTunes Recommendations
  11. Recommendations
  12. All the state-of-the-art platforms share an important drawback.
  13. training is a bottleneck.
  14. need for explicit information about user interests.
  15. social media provide information about user preferences
  16. example: user preferences in music from Facebook
  17. Our contribution: Play.me
  18. Play.me: personalized music playlists • Goal • To provide users with personalized music playlists • Insights • Extraction of explicit user preferences from Facebook • Playlist creation by enriching explicit user preferences • New artists are added to those explicitly extracted from Facebook • Comparison of two enrichment techniques
  19. Play.me architecture
  20. Play.me architecture
  21. Play.me pre-processing • Crawling from Last.fm • Public API • Content-based features • Name of the artist + Social tags • Noise processing • Information locally stored
  22. Play.me pre-processing: Sigur Rós tag cloud from Last.fm
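
The slides do not say what the "noise processing" step does to the crawled social tags. A minimal sketch of one plausible cleanup, assuming tags arrive as (tag, count) pairs: normalize case, hyphens, and whitespace so that spelling variants merge, then drop rarely used tags. The function name `clean_tags`, the `min_count` threshold, and the sample tags are all hypothetical.

```python
from collections import Counter


def clean_tags(raw_tags, min_count=2):
    """Merge spelling variants of social tags and drop rare ones.

    raw_tags: iterable of (tag, count) pairs (assumed input shape).
    """
    counts = Counter()
    for tag, count in raw_tags:
        # Lowercase, treat hyphens as spaces, collapse repeated whitespace.
        normalized = " ".join(tag.lower().replace("-", " ").split())
        if normalized:
            counts[normalized] += count
    # Keep only tags used at least min_count times (hypothetical threshold).
    return {tag: c for tag, c in counts.items() if c >= min_count}


# Hypothetical raw tags for one artist.
raw = [("Post-Rock", 100), ("post rock", 40), ("  Ambient ", 60),
       ("seen live", 1), ("", 5)]
cleaned = clean_tags(raw)
print(cleaned)
```

With this toy input, the two spellings of "post rock" merge into one tag, while the empty tag and the single-use tag are discarded.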
  23. Play.me architecture
  24. Play.me data extraction from Facebook
  25. Play.me data extraction from Facebook: explicit preferences
  26. Play.me data extraction from Facebook: implicit preferences
  27. Play.me architecture
  28. Play.me enrichment • Rationale • Given a set of explicit preferences extracted from Facebook • Play.me enriches this set • Extraction of artists similar to those the user explicitly likes
  29. Play.me enrichment example: Coldplay (extracted from Facebook) → enrichment → radiohead, red hot chili peppers, kings of leon
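
The enrichment step can be sketched as a small function: for each artist the user explicitly likes, add up to n similar artists not already in the set. This is an illustration only, not the Play.me implementation; `enrich`, `lookup`, and the `SIMILAR` table are hypothetical names, and the similarity source is left pluggable so that either technique compared in the talk could fill it.

```python
def enrich(preferences, similar_artists, n=1):
    """Return the explicit preferences followed by up to n new artists
    for each liked artist.

    similar_artists: callable mapping an artist name to a ranked list of
    similar artist names (assumed interface).
    """
    enriched = list(preferences)
    seen = set(preferences)
    for artist in preferences:
        added = 0
        for candidate in similar_artists(artist):
            if added == n:
                break
            if candidate not in seen:
                enriched.append(candidate)
                seen.add(candidate)
                added += 1
    return enriched


# Toy similarity table reusing the example from the slide.
SIMILAR = {"coldplay": ["radiohead", "red hot chili peppers", "kings of leon"]}


def lookup(artist):
    return SIMILAR.get(artist, [])


print(enrich(["coldplay"], lookup, n=3))
```

Here n plays the same role as in the evaluation: the number of artists added for each extracted artist.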
  30. Play.me architecture
  31. Play.me playlist: Most popular songs of the artists extracted from Last.fm (as well as those added through the enrichment) are proposed to the user.
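
A rough sketch of the playlist step, assuming each artist comes with a list of (title, play count) pairs obtained beforehand. The function `build_playlist`, the `tracks_per_artist` parameter, and the placeholder tracks are hypothetical; the slides only state that the most popular songs of each artist are proposed.

```python
def build_playlist(artists, top_tracks, tracks_per_artist=2):
    """Pick the most played tracks of each artist, in artist order.

    top_tracks: dict mapping artist -> list of (title, play_count) pairs
    (assumed data shape).
    """
    playlist = []
    for artist in artists:
        ranked = sorted(top_tracks.get(artist, []),
                        key=lambda track: track[1], reverse=True)
        playlist.extend((artist, title) for title, _ in ranked[:tracks_per_artist])
    return playlist


# Placeholder titles and play counts, for illustration only.
tracks = {
    "coldplay": [("track a", 10), ("track b", 30), ("track c", 20)],
    "radiohead": [("track d", 5)],
}
playlist = build_playlist(["coldplay", "radiohead"], tracks)
print(playlist)
```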
  32. let’s go deeper
  33. Play.me enrichment • Comparison of two approaches • Content-based strategy • Distributional Models • Linked Data
  34. Play.me enrichment based on Distributional Models • Content-based strategy • Each artist is modeled through a set of tags • Each artist is represented as a point in a semantic geometrical space • Distributional Models • Similarity calculations to extract the most similar artists.
  35. distributional models: “meaning is its use” (L. Wittgenstein, Austrian philosopher)
  36. distributional models, insight: by analyzing a large corpus of textual data it is possible to infer information about the usage (about the meaning) of the terms. example
  37. distributional models: term/context matrix (WordSpace) c1 c2 c3 c4 c5 c6 c7 c8 c9 | t1 ✔ ✔ ✔ ✔ | t2 ✔ ✔ ✔ ✔ | t3 ✔ ✔ ✔ | t4 ✔ ✔ ✔ ✔
  38. distributional models: beer vs. glass: good overlap c1 c2 c3 c4 c5 c6 c7 c8 c9 | t1 ✔ ✔ ✔ ✔ | t2 ✔ ✔ ✔ ✔ | t3 ✔ ✔ ✔ | t4 ✔ ✔ ✔ ✔
  39. distributional models: beer vs. spoon: no overlap c1 c2 c3 c4 c5 c6 c7 c8 c9 | t1 ✔ ✔ ✔ ✔ | t2 ✔ ✔ ✔ ✔ | t3 ✔ ✔ ✔ | t4 ✔ ✔ ✔ ✔
  40. distributional models: rock vs. post rock = good overlap c1 c2 c3 c4 c5 c6 | rock ✔ ✔ ✔ | post rock ✔ ✔ | jazz ✔ | classical ✔ ✔ ✔
  41. distributional models: rock vs. classical = no overlap c1 c2 c3 c4 c5 c6 | rock ✔ ✔ ✔ | post rock ✔ ✔ | jazz ✔ | classical ✔ ✔ ✔
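
The overlap idea behind the term/context matrix can be sketched with plain sets. The transcript preserves how many contexts each tag occurs in but not which ones, so the column assignments below are assumptions chosen to match the claims on the slides (rock and post rock overlap, rock and classical do not).

```python
# Contexts in which each tag occurs. The number of contexts per tag follows
# the slide; the specific columns (c1..c6) are assumed for illustration.
contexts = {
    "rock": {"c1", "c2", "c3"},
    "post rock": {"c1", "c2"},
    "jazz": {"c4"},
    "classical": {"c4", "c5", "c6"},
}


def overlap(term_a, term_b):
    """Number of contexts shared by two terms."""
    return len(contexts[term_a] & contexts[term_b])


print(overlap("rock", "post rock"))
print(overlap("rock", "classical"))
```

Terms that share many contexts are treated as close in meaning; terms with disjoint contexts are treated as unrelated.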
  42. representation of documents (*) can be inferred by combining the representation of the terms (**) occurring in the document. (*) documents = artists (**) terms = tags
  43. distributional models: term/context matrix (DocSpace) c1 c2 c3 c4 c5 c6 c7 c8 c9 | t2 ✔ ✔ ✔ ✔ | t3 ✔ ✔ ✔ | d1 ✔ ✔ ✔ ✔ ✔
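
The slides say a document (artist) vector is obtained by "combining" the vectors of its terms (tags) without naming the operator. A minimal sketch assuming component-wise summation, which is one common choice; the vector values are made up so that t2 covers four contexts, t3 covers three, and the resulting d1 covers five, as in the DocSpace slide.

```python
def combine(term_vectors, terms):
    """Build a document vector by summing the vectors of its terms.

    Summation is an assumption; the slides only say the term
    representations are combined.
    """
    dims = len(next(iter(term_vectors.values())))
    doc = [0] * dims
    for term in terms:
        for i, value in enumerate(term_vectors[term]):
            doc[i] += value
    return doc


# Assumed context positions over c1..c9.
term_vectors = {
    "t2": [1, 1, 0, 0, 1, 0, 1, 0, 0],
    "t3": [0, 1, 0, 0, 1, 0, 0, 1, 0],
}
d1 = combine(term_vectors, ["t2", "t3"])
print(d1)
```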
  44. Play.me enrichment based on Distributional Models: Coldplay, Radiohead, Kings of Leon, Lady Gaga
  45. Play.me enrichment based on Distributional Models • input: vector space representation • output: artists with the highest cosine similarity: radiohead, the killers, kings of leon
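
Ranking artists by cosine similarity in the vector space can be sketched as follows. The four-dimensional vectors are toy values invented for illustration, not data from the talk, and `most_similar` is a hypothetical helper name.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(target, vectors, n=3):
    """Names of the n artists whose vectors are closest to the target's."""
    scores = [(cosine(vectors[target], vec), name)
              for name, vec in vectors.items() if name != target]
    scores.sort(reverse=True)
    return [name for _, name in scores[:n]]


# Toy artist vectors (assumed values).
vectors = {
    "coldplay": [3, 2, 0, 0],
    "radiohead": [3, 1, 0, 0],
    "kings of leon": [2, 2, 1, 0],
    "lady gaga": [0, 0, 1, 3],
}
print(most_similar("coldplay", vectors, n=2))
```

Cosine similarity compares the direction of two vectors rather than their length, so an artist with many tags is not favored over one with few.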
  46. Linked Open Data Cloud
  47. Linked Open Data Cloud: Structured (RDF) representation of the information stored in Wikipedia.
  48. Play.me enrichment based on Linked Data: Coldplay play Alternative Rock
  49. Play.me RDF triple: Relationships are explicitly encoded in RDF.
  50. Play.me enrichment based on Linked Data • Linked Open Data Cloud • Each artist is mapped to a DBpedia node • unique URI • Relationships between artists (nodes) are explicitly encoded • e.g. genre, artist category, etc. • Use of SPARQL to extract artists (nodes) that share the same features
  51. Play.me enrichment based on Linked Data • input: SPARQL query • output: artists sharing the same properties: radiohead, the smiths, the verve
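
The slides do not show the SPARQL query itself. The sketch below gives one plausible shape for a "same genre" query, assuming the DBpedia ontology property `dbo:genre`, and mirrors the same logic over a handful of in-memory triples so it can run without a network connection. The function names, the choice of property, and the toy triples are assumptions.

```python
def same_genre_query(artist_uri, limit=3):
    """Build a SPARQL query for artists sharing a genre with artist_uri.

    Using dbo:genre as the shared property is an assumption; the slides
    mention genre and artist category only as examples.
    """
    return f"""
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?artist WHERE {{
  <{artist_uri}> dbo:genre ?genre .
  ?artist dbo:genre ?genre .
  FILTER (?artist != <{artist_uri}>)
}} LIMIT {limit}
""".strip()


def share_property(artist, prop, triples):
    """Same idea over (subject, predicate, object) tuples held in memory."""
    values = {o for s, p, o in triples if s == artist and p == prop}
    return sorted({s for s, p, o in triples
                   if p == prop and o in values and s != artist})


# Toy triples for illustration only.
TRIPLES = [
    ("Coldplay", "genre", "Alternative_rock"),
    ("Radiohead", "genre", "Alternative_rock"),
    ("The_Verve", "genre", "Alternative_rock"),
    ("Lady_Gaga", "genre", "Pop_music"),
]

query = same_genre_query("http://dbpedia.org/resource/Coldplay")
print(query)
print(share_property("Coldplay", "genre", TRIPLES))
```

Unlike the distributional approach, nothing is ranked here: every artist matching the pattern is an equally valid answer, so the order of results depends on the endpoint.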
  52. recap, enrichment process • input: artist (coldplay) • output: similar artists • Linked Data: the smiths, radiohead, the verve • Distributional Models: kings of leon, radiohead
  53. experimental evaluation.
  54. experimental design • Experiment • Which enrichment technique provides users with the best playlists?
  55. experimental design, settings • 30 users • Heterogeneous musical knowledge • Last.fm crawl: 228,878 artists • Extraction & Recommendation step • 325 artists extracted • 11 per user, on average
  56. experimental setup: Given a playlist, each user can freely express her own feedback (like/dislike) on the proposed tracks.
  57. experimental setup: Experiment repeated three times (one run with Linked Data enrichment, another one with Distributional Models, one with a simple baseline based on popularity).
  58. experimental setup: Users were unaware of the adopted configuration.
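
The like/dislike feedback can be turned into a per-playlist score. The results slides refer to precision without defining it; the sketch below assumes it is the fraction of proposed tracks the user liked. The function name and the sample feedback are hypothetical.

```python
def precision(feedback):
    """Fraction of proposed tracks marked "like" (assumed definition)."""
    if not feedback:
        return 0.0
    likes = sum(1 for item in feedback if item == "like")
    return likes / len(feedback)


# Hypothetical feedback from one user on a four-track playlist.
score = precision(["like", "like", "dislike", "like"])
print(score)
```

Averaging this score over all users for each of the three runs would give one figure per configuration.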
  59. experimental design, results (chart): Linked Data, Distributional Models, and Baseline (Popularity) compared at n=1, n=2, n=3, where n = number of artists added for each extracted artist. Values as extracted from the chart: 76,3 80 75,2 73,75 69,7 67,5 65,9 64,6 61,25 63,2 58 58 58 55
  60. experimental design, results (same chart): distributional models outperform linked data
  61. experimental design, results (same chart): precision in distributional models drops more rapidly
  62. experimental design, results (same chart): good results for baseline, as well (poor music knowledge?)
  63. conclusions.
  64. both enrichment techniques outperform the baseline
  65. distributional models outperform linked data
  66. future research.
  67. merging different enrichment techniques
  68. evaluation with user-based metrics (serendipity, novelty, unexpectedness)
  69. modeling context.
  70. questions?
