More Like This: Machine Learning Approaches to Music Similarity

The slides from my dissertation talk. Thesis available at http://cseweb.ucsd.edu/~bmcfee/papers/bmcfee_dissertation.pdf

Published in: Technology

More Like This: Machine Learning Approaches to Music Similarity

  1. 1. More Like This: Machine Learning Approaches to Music Similarity. Brian McFee, Computer Science & Engineering, University of California, San Diego
  2. 2. Music discovery in days of yore...
  3. 3. Music discovery 2.0: the present• ~20 million songs available• Discovery is still largely human-powered
  4. 4. A Google for music?
  5. 5. A Google for music?• Standard text search can work with meta-data• Can we predict meta-data from audio? ⁃ [Turnbull, 2008], [Barrington, 2011]
  6. 6. Query by example• Natural, user-friendly alternative to text search
  7. 7. Query by example• Natural, user-friendly alternative to text search
  8. 8. Query by example• Natural, user-friendly alternative to text search
  9. 9. This talk• Learning algorithms for QBE, geared toward music discovery• We'll look at two consumption models: active browsing (search & ranking) and passive listening (playlist generation)• Evaluation derived from user behavior
  10. 10. Learning similarity
  11. 11. Defining similarity: semantics? Song similarity = tag similarity?
  12. 12. Defining similarity: semantics?• Drawbacks: - Choosing, weighting vocabulary is surprisingly difficult - Hard to maintain quality at scale
  13. 13. Defining similarity: human judgements? [M. & Lanckriet, 2009, 2011]• Which is more similar?
  14. 14. Defining similarity: human judgements? [M. & Lanckriet, 2009, 2011]• Which is more similar?• Drawbacks: ambiguity, subjectivity, scale
  15. 15. Collaborative filter similarity• Collect listening histories for (lots of!) users• Song similarity = portion of users in common
  16. 16. Collaborative filter similarity• Collaborative filters perform well... - ... for tagging [Kim, Tomasik, & Turnbull, 2009] - ... and playlisting [Barrington, Oda, & Lanckriet, 2009] - ... and recommendation (Yahoo, Last.fm, iTunes...)• Implicit feedback requires no additional effort from users• ... but fails on unpopular items: the cold start problem!
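A minimal sketch of the collaborative filter similarity from the two slides above, assuming that "portion of users in common" is read as the Jaccard index over listener sets. The song names and user ids are hypothetical. The last case shows the cold start problem: a song with no listening history has zero similarity to everything.

```python
def jaccard_similarity(users_a, users_b):
    """Share of listeners two songs have in common (Jaccard index)."""
    union = users_a | users_b
    if not union:
        # Neither song has any listeners, so there is nothing to compare.
        return 0.0
    return len(users_a & users_b) / len(union)


# Hypothetical listening histories: song -> set of user ids.
listeners = {
    "song_a": {1, 2, 3},
    "song_b": {2, 3, 4},
    "new_song": set(),  # no plays yet: the cold start case
}

sim_ab = jaccard_similarity(listeners["song_a"], listeners["song_b"])
sim_new = jaccard_similarity(listeners["song_a"], listeners["new_song"])
```

Any set-overlap measure could be substituted here; Jaccard is only one reasonable choice.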
  17. 17. Learning from a collaborative filter [M., Barrington, & Lanckriet, 2010, 2012] 1. 2. 3.
  18. 18. Learning from a collaborative filter [M., Barrington, & Lanckriet, 2010, 2012] 1. 2. 3.
  19. 19. Learning from a collaborative filter [M., Barrington, & Lanckriet, 2010, 2012] 1. 2. 3.
  20. 20. Metric learning to rank• The goal: Rankings in audio space = Rankings in CF space
  21. 21. Metric learning to rank [M. & Lanckriet, 2010]• The goal: Ranking by (learned) distance = Target rankings
  22. 22. Metric learning to rank [M. & Lanckriet, 2010]• The goal: Ranking by (learned) distance = Target rankings• Optimize a linear transformation for ranking
  23. 23. Structure prediction: nearest neighbors• Setup: a database of items and target rankings• A PSD matrix transforms features• Order by distance from the query
  24. 24. Structure prediction: nearest neighbors• Setup: a database of items and target rankings• A PSD matrix transforms features• Order by distance from the query• A feature map encodes each (query, ranking) pair
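The nearest-neighbor setup on this slide can be sketched as follows: items are ordered by a distance that a PSD matrix defines around the query. The names `W`, `database`, and the toy feature vectors are assumptions for illustration, and the matrices are hand-picked rather than learned.

```python
def learned_distance_sq(q, x, W):
    """Squared distance (q - x)^T W (q - x) for a PSD matrix W."""
    diff = [qi - xi for qi, xi in zip(q, x)]
    n = len(diff)
    return sum(diff[i] * W[i][j] * diff[j] for i in range(n) for j in range(n))


def rank_by_distance(q, database, W):
    """Return database keys ordered from nearest to farthest from q."""
    return sorted(database, key=lambda name: learned_distance_sq(q, database[name], W))


# Hypothetical 2-D features for two songs and a query at the origin.
database = {"song_a": [1.0, 0.0], "song_b": [0.0, 2.0]}
query = [0.0, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]   # plain Euclidean distance
reweighted = [[9.0, 0.0], [0.0, 1.0]]  # stretches the first feature

euclidean_order = rank_by_distance(query, database, identity)
reweighted_order = rank_by_distance(query, database, reweighted)
```

Changing the matrix flips the ranking of the two songs, which is the degree of freedom that metric learning to rank optimizes.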
  25. 25. Metric learning to rank (MLR)• Score for target ranking > Score for any other ranking + Prediction error• Supported losses Δ: AUC, KNN, MAP, MRR, NDCG, Prec@k
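The margin condition on this slide can be written out as below. The notation is an assumption borrowed from the cited MLR paper [M. & Lanckriet, 2010], not recovered from the slide itself: $W$ is the PSD matrix, $\psi$ the feature map over (query, ranking) pairs, $y_q$ the target ranking for query $q$, $\Delta$ the ranking loss, and $\xi_q$ a slack variable.

```latex
\begin{align*}
\|q - x\|_W^2 &= (q - x)^\top W (q - x), \qquad W \succeq 0 \\
\langle W, \psi(q, y_q) \rangle_F &\ge \langle W, \psi(q, y) \rangle_F + \Delta(y_q, y) - \xi_q
  \qquad \forall\, y \ne y_q
\end{align*}
```

The first line is the learned distance used to order the database. The second says that the target ranking must outscore every other ranking by a margin equal to that ranking's error.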
  26. 26. MLR solver• Cutting-plane algorithm based on 1-slack Structural SVM [Joachims, et al. 2009]• Repeat until convergence: constraint generation, then semi-definite programming (DP)
  27. 27. MLR solver• Cutting-plane algorithm based on 1-slack Structural SVM [Joachims, et al. 2009]• Repeat until convergence: constraint generation, then semi-definite programming (DP)• Sequence of QPs
  28. 28. MLR solver• Cutting-plane algorithm based on 1-slack Structural SVM [Joachims, et al. 2009]• Repeat until convergence: constraint generation, then semi-definite programming (DP)• Sequence of QPs• Multiple kernel extensions: [Galleguillos, M., Belongie, & Lanckriet 2011]
  29. 29. Audio pipeline• Audio signal
  30. 30. Audio pipeline• Audio signal• 1. Feature extraction: bag of ΔMFCCs
  31. 31. Audio pipeline• Audio signal• 1. Feature extraction: bag of ΔMFCCs• 2. Vector quantization: codeword hist.
  32. 32. Audio pipeline• Audio signal• 1. Feature extraction: bag of ΔMFCCs• 2. Vector quantization: codeword hist.• 3. Probability product kernel: PPK
  33. 33. Audio pipeline• Audio signal → PPK → features → MLR• CF similarity → supervision → MLR
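Steps 2 and 3 of the audio pipeline can be sketched as below, under two assumptions that the slides do not state: each frame is assigned to its single nearest codeword (hard quantization), and the probability product kernel is taken in its Bhattacharyya form, the sum of square roots of products of histogram bins. The codebook and frames are toy values standing in for ΔMFCC vectors.

```python
import math


def nearest_codeword(frame, codebook):
    """Index of the codeword closest to the frame (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((f - c) ** 2 for f, c in zip(frame, codebook[k])),
    )


def codeword_histogram(frames, codebook):
    """Normalized histogram of codeword assignments for one song."""
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_codeword(frame, codebook)] += 1
    return [c / len(frames) for c in counts]


def probability_product_kernel(p, q):
    """PPK with exponent 1/2 (an assumption): sum of sqrt(p_i * q_i)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Toy codebook with two codewords, and two hypothetical songs.
codebook = [[0.0, 0.0], [10.0, 10.0]]
song_a = [[0.0, 1.0], [1.0, 0.0], [9.0, 9.0], [10.0, 11.0]]
song_b = [[0.0, 0.0], [1.0, 1.0]]

hist_a = codeword_histogram(song_a, codebook)
hist_b = codeword_histogram(song_b, codebook)
k_ab = probability_product_kernel(hist_a, hist_b)
k_aa = probability_product_kernel(hist_a, hist_a)
```

The resulting kernel values are what the final slide of the pipeline passes to MLR as features, with CF similarity supplying the supervision.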
  34. 34. Evaluation: CAL10K• Last.fm collaborative filter [Celma, 2008] - 360K users, 186K artists• CAL10K songs [Tingle, Turnbull, & Kim, 2010] - 5.4K songs, 2K artists (after CF matching)
  35. 35. Evaluation: CAL10K• Last.fm collaborative filter [Celma, 2008] - 360K users, 186K artists• CAL10K songs [Tingle, Turnbull, & Kim, 2010] - 5.4K songs, 2K artists (after CF matching)• Evaluation: - Split artists into train/val/test - Target rankings: top-10 most similar train artists
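The artist-level split on this slide keeps all songs by one artist in the same partition, so that test queries come from unseen artists. A minimal sketch follows; the 60/20/20 fractions, the seed, and the toy catalogue are assumptions, since the slide gives no split sizes.

```python
import random


def split_by_artist(song_to_artist, fractions=(0.6, 0.2, 0.2), seed=0):
    """Partition songs into train/val/test with no artist shared across splits."""
    artists = sorted(set(song_to_artist.values()))
    random.Random(seed).shuffle(artists)
    n_train = int(fractions[0] * len(artists))
    n_val = int(fractions[1] * len(artists))
    groups = {
        "train": set(artists[:n_train]),
        "val": set(artists[n_train:n_train + n_val]),
        "test": set(artists[n_train + n_val:]),
    }
    return {
        name: sorted(s for s, a in song_to_artist.items() if a in members)
        for name, members in groups.items()
    }


# Hypothetical catalogue: 10 artists with 2 songs each.
songs = {f"song_{i}": f"artist_{i // 2}" for i in range(20)}
splits = split_by_artist(songs)
```

Splitting by song instead would let the same artist appear in both training and test data, which tends to inflate retrieval scores.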
  36. 36. Evaluation: comparison• Gaussian mixture models + KL divergence - 8-component, diagonal covariance GMM per song• Auto-tags: predict 149 semantic tags from audio [Turnbull, 2008]• [Our method] VQ+MLR: 1024 codewords• Expert tags: 1053 tags from Pandora [Tingle, et al., 2009]
  37. 37. Similarity learning: results• Bar chart of AUC (axis 0.65 to 0.95) for: GMM (KL); Auto-tags; Auto-tags + MLR; Audio VQ; Audio VQ + MLR; Expert tags (cos); Expert tags + MLR
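The results above are reported as AUC. For a single query, one common definition is the fraction of (relevant, irrelevant) item pairs in which the relevant item is ranked higher. The sketch below assumes that definition; the item names are hypothetical.

```python
def ranking_auc(ranking, relevant):
    """Fraction of (relevant, irrelevant) pairs ordered correctly."""
    n_rel = sum(1 for item in ranking if item in relevant)
    n_irr = len(ranking) - n_rel
    if n_rel == 0 or n_irr == 0:
        raise ValueError("AUC needs at least one relevant and one irrelevant item")
    correct_pairs = 0
    irrelevant_seen = 0
    for item in ranking:
        if item in relevant:
            # Every irrelevant item not yet seen is ranked below this one.
            correct_pairs += n_irr - irrelevant_seen
        else:
            irrelevant_seen += 1
    return correct_pairs / (n_rel * n_irr)


auc_mixed = ranking_auc(["a", "b", "c", "d"], relevant={"a", "c"})
auc_perfect = ranking_auc(["a", "c", "b", "d"], relevant={"a", "c"})
```

A random ordering scores about 0.5 on average and a perfect one scores 1.0.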
  38. 38. Example playlists• The Ramones - Go Mental; Def Leppard - Promises; The Buzzcocks - Harmony In My Head; Los Lonely Boys - Roses; Wolfmother - Colossal; Judas Priest - Diamonds and Rust (live)
  39. 39. Example playlists• The Ramones - Go Mental; Def Leppard - Promises; The Buzzcocks - Harmony In My Head; Los Lonely Boys - Roses; Wolfmother - Colossal; Judas Priest - Diamonds and Rust (live)• MLR: The Buzzcocks - Harmony In My Head; Mötley Crüe - Same Ol' Situation; The Offspring - Gotta Get Away; The Misfits - Skulls; AC/DC - Who Made Who (live)
  40. 40. Example playlists• Fats Waller - Winter Weather; Dizzy Gillespie - She's Funny That Way; Enrique Morente - Solea; Chet Atkins - In the Mood; Rachmaninov - Piano Concerto #4; Eluvium - Radio Ballet
  41. 41. Example playlists• Fats Waller - Winter Weather; Dizzy Gillespie - She's Funny That Way; Enrique Morente - Solea; Chet Atkins - In the Mood; Rachmaninov - Piano Concerto #4; Eluvium - Radio Ballet• Chet Atkins - In the Mood; Charlie Parker - What Is This Thing Called Love?; Bud Powell - Oblivion; Bob Wills & His Texas Playboys - Lyla Lou; Bob Wills & His Texas Playboys - Sittin' On Top of the World
  42. 42. Scaling up: fast retrieval [M. & Lanckriet, 2011]• Audio similarity search for a million songs?• Idea: Index data with spatial trees• 100-NN search over 900K songs: - Brute force: 2.4s - 50% recall: 0.14s (17x speedup) - 20% recall: 0.02s (120x speedup)
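The trade between recall and speed on this slide can be illustrated with a simple spatial tree. The sketch below is an assumption-laden stand-in, not the indexing scheme from the cited paper: it splits on the median of the widest dimension and searches only the one leaf the query falls into, then measures recall against brute force on random 2-D points.

```python
import random


def build_tree(points, ids, leaf_size=50):
    """Recursively split ids at the median of the widest dimension."""
    if len(ids) <= leaf_size:
        return {"leaf": list(ids)}
    dims = range(len(points[ids[0]]))
    dim = max(dims, key=lambda d: max(points[i][d] for i in ids) - min(points[i][d] for i in ids))
    ordered = sorted(ids, key=lambda i: points[i][dim])
    mid = len(ordered) // 2
    return {
        "dim": dim,
        "threshold": points[ordered[mid]][dim],
        "left": build_tree(points, ordered[:mid], leaf_size),
        "right": build_tree(points, ordered[mid:], leaf_size),
    }


def leaf_candidates(tree, query):
    """Follow the splits down to the single leaf that contains the query."""
    while "leaf" not in tree:
        side = "left" if query[tree["dim"]] < tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]


def nearest(points, candidates, query, k):
    """Exact k-nearest neighbors among the candidate ids."""
    return sorted(
        candidates,
        key=lambda i: sum((a - b) ** 2 for a, b in zip(points[i], query)),
    )[:k]


rng = random.Random(0)
points = [[rng.random(), rng.random()] for _ in range(1000)]
tree = build_tree(points, list(range(len(points))), leaf_size=50)

query = [0.5, 0.5]
exact = nearest(points, range(len(points)), query, k=10)
candidates = leaf_candidates(tree, query)
approx = nearest(points, candidates, query, k=10)
recall = len(set(exact) & set(approx)) / len(exact)
```

Searching one leaf scans at most `leaf_size` points instead of all 1000, at the cost of missing neighbors that fall just across a split. Larger leaves raise recall and lower the speedup.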
  43. 43. Similarity learning: summary• Collaborative filters provide user-centric music similarity• CF similarity can be approximated by audio features• Audio search can be done quickly at large-scale
  44. 44. Playlist generation
  45. 45. Playlist generation• Goal: generate a "good" song sequence - Music auto-pilot (given context)• Many existing algorithms, but no standard evaluation• What makes one algorithm better than another?
  46. 46. Playlist evaluation 1: Human survey• Idea: generate playlists, ask for opinions• Impractical at large-scale: - Huge search space - User taste, expertise can be problematic - Slow, expensive• Does not facilitate rapid evaluation and optimization
  47. 47. Playlist evaluation 2: Information retrieval• Idea: - Define "good" and "bad" playlists - Predict the next song, measure accuracy• But what makes a bad playlist?• Do users agree on good/bad?
  48. 48. A generative approach [M. & Lanckriet, 2011b]• Playlist algorithm = distribution over playlists• Don't evaluate synthetic playlists• Do evaluate the likelihood of generating real playlists
  49. 49. The playlist collection: AOTM-2011• Art of the Mix - 13 years of playlists - ~210K playlist segments - ~100K songs from MSD• Top 25 playlist categories: - Genre: Punk, Hip-hop, Reggae... - Context: Road trip, Break-up, Sleep... - Other: Mixed genre, Alternating DJ...
  50. 50. A simple playlist model 1. Start with a set of songs
  51. 51. A simple playlist model 2. Select a subset (e.g., jazz songs)
  52. 52. A simple playlist model 3. Select a song
  53. 53. A simple playlist model 4. Select a new subset
  54. 54. A simple playlist model 4. Select a new subset
  55. 55. A simple playlist model 5. Select a new song
  56. 56. A simple playlist model 6. Repeat...
  57. 57. A simple playlist model 6. Repeat...
  58. 58. Connecting the dots...• Random walk on a hypergraph - Vertices = songs - Edges = subsets• Edges derived from: - Audio clusters, tags, lyrics, era, popularity, CF - or combinations/intersections• Goal: optimize edge weights from example playlists
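One step of the random walk described above can be sketched as follows. The sketch assumes a specific reading of the model: from the current song, pick one of the edges (subsets) containing it with probability proportional to its weight, then pick uniformly among the other songs in that edge. Edge names, songs, and weights are hypothetical.

```python
def transition_probs(current, edges, weights):
    """Distribution over the next song for one step of the hypergraph walk."""
    # Only edges that contain the current song and offer another song to move to.
    active = {
        name: weights[name]
        for name, members in edges.items()
        if current in members and len(members) > 1
    }
    total = sum(active.values())
    if total == 0:
        return {}
    probs = {}
    for name, weight in active.items():
        others = edges[name] - {current}
        for song in others:
            # P(edge | current) * P(song | edge)
            probs[song] = probs.get(song, 0.0) + (weight / total) / len(others)
    return probs


# Hypothetical hypergraph: vertices are songs, edges are subsets.
edges = {"jazz": {"a", "b", "c"}, "1960s": {"a", "d"}}
weights = {"jazz": 1.0, "1960s": 3.0}

next_from_a = transition_probs("a", edges, weights)
```

Raising the weight of an edge makes songs in that subset more likely to follow one another, which is the quantity the model learns from example playlists.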
  59. 59. Playlist model• exp. prior → edge weights → transitions → playlists
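One way to read the chain on this slide (exponential prior, edge weights, transitions, playlists) is as a maximum a posteriori estimate of the edge weights. The symbols below are assumptions for illustration: $w_e \ge 0$ is the weight of edge $e$, $\lambda$ the rate of the exponential prior, and $s$ a training playlist of length $T_s$ from the collection $\mathcal{S}$.

```latex
\begin{align*}
p(w) &\propto \exp\Big(-\lambda \sum_{e} w_e\Big), \qquad w_e \ge 0 \\
P(s \mid w) &= P(s_1 \mid w) \prod_{t=1}^{T_s - 1} P(s_{t+1} \mid s_t, w) \\
\hat{w} &= \operatorname*{argmax}_{w \ge 0} \; \sum_{s \in \mathcal{S}} \log P(s \mid w) \;-\; \lambda \sum_{e} w_e
\end{align*}
```

Under this reading the prior acts as a sparsity-encouraging penalty on the weights, and the likelihood term factors into one transition per adjacent pair of songs.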
  60. 60. Playlist generation: evaluation• Setup: - Split playlist collection into train/test - Learn edge weights on training playlists - Evaluate average likelihood of test playlists• Train per category, or all together• Compare against uniform shuffle baseline
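The evaluation setup above can be sketched as an average log-likelihood comparison. Two assumptions are made here: the uniform shuffle baseline draws every song with probability 1/n at each step, and the gain is reported relative to the magnitude of the baseline log-likelihood. The four-song model is a toy example, not a learned one.

```python
import math


def log_likelihood(playlist, first_prob, transition_prob):
    """Log-probability of a playlist under a first-order generative model."""
    total = math.log(first_prob(playlist[0]))
    for prev, nxt in zip(playlist, playlist[1:]):
        # The model must give every observed transition nonzero probability.
        total += math.log(transition_prob(prev, nxt))
    return total


def relative_gain(playlists, model, baseline):
    """Average log-likelihood gain of model over baseline, as a fraction."""
    ll_model = sum(log_likelihood(p, *model) for p in playlists) / len(playlists)
    ll_base = sum(log_likelihood(p, *baseline) for p in playlists) / len(playlists)
    return (ll_model - ll_base) / abs(ll_base)


songs = ["a", "b", "c", "d"]
successor = {"a": "b", "b": "c", "c": "d", "d": "a"}


def uniform_first(song):
    return 1.0 / len(songs)


def uniform_transition(prev, nxt):
    return 1.0 / len(songs)


def toy_transition(prev, nxt):
    # Half the mass goes to the successor; the rest is shared by the other three songs.
    return 0.5 if successor[prev] == nxt else 0.5 / 3


test_playlists = [["a", "b", "c"]]
gain = relative_gain(
    test_playlists,
    model=(uniform_first, toy_transition),
    baseline=(uniform_first, uniform_transition),
)
```

No playlists are generated or judged by people here: the score depends only on how much probability the model assigns to held-out playlists that real users made.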
  61. 61. Random walk results• Chart: log-likelihood gain over random shuffle (axis 0% to 25%), global model vs. category-specific models• Categories: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues
  62. 62. Stationary model results• Chart: log-likelihood gain over random shuffle (axis -15% to 20%), global model vs. category-specific models• Categories: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues
  63. 63. Example playlists• Rhythm & Blues: [70s & soul] Lyn Collins - Think; [Audio #14 & funk] Isaac Hayes - No Name Bar; [DECADE 1965 & soul] Michael Jackson - My Girl• Electronic music: [Audio #11 & downtempo] Everything But The Girl - Blame; [DECADE 1990 & trip-hop] Massive Attack - Spying Glass; [Audio #11 & electronica] Björk - Hunter
  64. 64. Playlist generation summary• Generative approach simplifies evaluation• AOTM-2011 collection facilitates learning and evaluation• Robust, efficient and transparent feature integration
  65. 65. The future
  66. 66. Directions for future work• Audio features: coding, dynamics and rhythm• Playlist models: mixtures, long-range interactions• UI models: interactive, context-aware, diversity
  67. 67. Personalized recommendation [M., Bertin-Mahieux, Ellis, & Lanckriet, 2012]• The Million Song Dataset Challenge• Listening histories for 1.1M users, 380K songs• Task: personalized song recommendation
  68. 68. Conclusion• MLR can optimize distance metrics for ranking, QBE retrieval• Audio similarity can approximate a collaborative filter• Generative playlist model integrates data, models dynamics• User-centric evaluation makes it all possible
  69. 69. Thanks!
  70. 70. Metric partial order feature • Score is large when distances match ranking
  71. 71. Playlist weights: 6390 edges• Playlist categories: ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single Artist, Romantic, Road Trip, Punk, Depression, Break Up, Narrative, Hip-hop, Sleep, Electronic music, Dance-house, Rhythm and Blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Reggae, Blues• Feature groups: Audio, CF, Era, Familiarity, Lyrics, Tags, Uniform• Audio & CF: k-means (16/64/256)• Lyrics: LDA (k=32, top-1/3/5)• Era: year, decade, decade+5• Tags: Last.fm top-10• Familiarity: high/med/low• Conjunctions
