More Like This: Machine Learning Approaches to Music Similarity
The slides from my dissertation talk. Thesis available at http://cseweb.ucsd.edu/~bmcfee/papers/bmcfee_dissertation.pdf

Presentation Transcript

    • More Like This: Machine Learning Approaches to Music Similarity. Brian McFee, Computer Science & Engineering, University of California, San Diego
    • Music discovery in days of yore...
    • Music discovery 2.0: the present • ~20 million songs available • Discovery is still largely human-powered
    • A Google for music?
    • A Google for music?• Standard text search can work with meta-data• Can we predict meta-data from audio? ⁃ [Turnbull, 2008], [Barrington, 2011]
    • Query by example • Natural, user-friendly alternative to text search
    • This talk • Learning algorithms for QBE, geared toward music discovery • We'll look at two consumption models: active browsing (search & ranking) and passive listening (playlist generation) • Evaluation derived from user behavior
    • Learning similarity
    • Defining similarity: semantics? Song similarity = tag similarity?
    • Defining similarity: semantics?• Drawbacks: - Choosing and weighting a vocabulary is surprisingly difficult - Hard to maintain quality at scale
    • Defining similarity: human judgements? [M. & Lanckriet, 2009, 2011]• Which is more similar?• Drawbacks: ambiguity, subjectivity, scale
    • Collaborative filter similarity• Collect listening histories for (lots of!) users• Song similarity = fraction of users in common
    • Collaborative filter similarity• Collaborative filters perform well... - ... for tagging [Kim, Tomasik, & Turnbull, 2009] - ... and playlisting [Barrington, Oda, & Lanckriet, 2009] - ... and recommendation (Yahoo, Last.fm, iTunes...)• Implicit feedback requires no additional effort from users• ... but fails on unpopular items: the cold start problem!
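
A minimal sketch of the "fraction of users in common" notion from the slides above, assuming a toy dict mapping each song to the set of users who listened to it; the data layout and the Jaccard-style formula are illustrative assumptions, not the talk's exact definition:

```python
# Hypothetical illustration: song similarity from listening histories.
# listeners maps song -> set of user ids (toy data, not from the talk).
listeners = {
    "song_a": {1, 2, 3, 5},
    "song_b": {2, 3, 5, 8},
    "song_c": {7, 9},
}

def cf_similarity(s1, s2):
    """Fraction of users in common (Jaccard index over listener sets)."""
    a, b = listeners[s1], listeners[s2]
    if not a or not b:
        return 0.0  # cold start: no listeners, no similarity signal
    return len(a & b) / len(a | b)

print(cf_similarity("song_a", "song_b"))  # 0.6
print(cf_similarity("song_a", "song_c"))  # 0.0 -- the cold-start failure mode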
    • Learning from a collaborative filter [M., Barrington, & Lanckriet, 2010, 2012] [three-step diagram]
    • Metric learning to rank • The goal: rankings in audio space = rankings in CF space
    • Metric learning to rank [M. & Lanckriet, 2010] • The goal: ranking by (learned) distance = target rankings • Optimize a linear transformation for ranking
    • Structure prediction: nearest neighbors • Setup: a database of songs and target rankings • A PSD matrix W transforms the features • Order songs by distance from the query q • A feature map ψ encodes each (query, ranking) pair
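
A small illustration of ordering a database by a learned Mahalanobis distance, as in the setup above; the random W below is only a placeholder for a learned metric, and the feature dimensions are made up:

```python
import numpy as np

def mahalanobis_rank(q, X, W):
    """Order database rows by d(q, x) = (q - x)^T W (q - x), with W PSD."""
    diffs = X - q                          # (n, d) differences from the query
    dists = np.einsum("nd,de,ne->n", diffs, W, diffs)
    return np.argsort(dists)               # song indices, nearest first

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))              # toy song features
q = rng.normal(size=8)                     # query song
L = rng.normal(size=(8, 8))
W = L @ L.T                                # random PSD stand-in for a learned W
print(mahalanobis_rank(q, X, W)[:5])       # top-5 neighbors under W
```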
    • Metric learning to rank (MLR) • Margin constraint: score of the target ranking ≥ score of any other ranking + prediction error Δ • Supported losses Δ: AUC, KNN, MAP, MRR, NDCG, Prec@k
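
Written out, the margin constraint above takes the standard structural-SVM form; this is a reconstruction from the slide's wording, with the slack notation ξ_q assumed (the 1-slack solver on the next slide aggregates these into a single shared slack):

```latex
\forall q,\ \forall y \neq y_q^*:\qquad
\langle W, \psi(q, y_q^*) \rangle \;\ge\;
\langle W, \psi(q, y) \rangle + \Delta(y_q^*, y) - \xi_q,
\qquad W \succeq 0,\ \xi_q \ge 0
```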
    • MLR solver • Cutting-plane algorithm based on 1-slack structural SVM [Joachims et al., 2009] • Repeat until convergence: constraint generation → semi-definite programming (a sequence of QPs) • Multiple kernel extensions: [Galleguillos, M., Belongie, & Lanckriet, 2011]
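
A heavily simplified schematic of the cutting-plane loop: the AUC-style separation oracle and the inner solver below (plain projected subgradient with eigenvalue clipping, standing in for the SDP/QP machinery) illustrate the control flow only, and are not the actual MLR solver:

```python
import numpy as np

def psd_project(W):
    """Project a symmetric matrix onto the PSD cone by eigenvalue clipping."""
    vals, vecs = np.linalg.eigh((W + W.T) / 2)
    return (vecs * np.clip(vals, 0, None)) @ vecs.T

def pair_feature(q, x_rel, x_irr):
    """<W, feature> = d(q, x_irr) - d(q, x_rel); positive when W ranks rel first."""
    dr, di = q - x_rel, q - x_irr
    return np.outer(di, di) - np.outer(dr, dr)

def cutting_plane_mlr(queries, rel, irr, X, dim, rounds=20, lr=0.1, C=1.0):
    """Toy cutting-plane loop: add the most violated aggregate constraint,
    then re-fit W on the restricted problem (projected subgradient)."""
    W = np.eye(dim)
    constraints = []                        # list of (aggregate feature, margin)
    for _ in range(rounds):
        # --- constraint generation (AUC-style oracle over violated pairs) ---
        agg, margin, n = np.zeros((dim, dim)), 0.0, 0
        for qi in queries:
            for r in rel[qi]:
                for j in irr[qi]:
                    f = pair_feature(X[qi], X[r], X[j])
                    n += 1
                    if np.sum(W * f) < 1.0:  # margin violation
                        agg += f
                        margin += 1.0
        constraints.append((agg / max(n, 1), margin / max(n, 1)))
        # --- inner solve: regularized hinge objective, PSD projection ---
        for _ in range(50):
            grad = W.copy()                  # gradient of (1/2)||W||^2
            for f, m in constraints:
                if np.sum(W * f) < m:        # active constraint -> hinge term
                    grad -= C * f
            W = psd_project(W - lr * grad)
    return W
```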
    • Audio pipeline: audio signal → 1. Feature extraction (bag of ΔMFCCs) → 2. Vector quantization (codeword histogram) → 3. Probability product kernel (PPK)
    • Audio pipeline: features → PPK → MLR, with CF similarity as supervision
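
An end-to-end sketch of this pipeline under stated assumptions: librosa for the MFCC features, MiniBatchKMeans for the codebook, and a Bhattacharyya-form probability product kernel between codeword histograms. Stacking MFCCs with their deltas, the 1024-word codebook size, and the file names are illustrative choices, not the talk's exact recipe:

```python
import numpy as np
import librosa
from sklearn.cluster import MiniBatchKMeans

def delta_mfccs(path, n_mfcc=13):
    """1. Feature extraction: bag of MFCC + delta frames for one song."""
    y, sr = librosa.load(path)
    m = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.vstack([m, librosa.feature.delta(m)]).T  # (frames, 2*n_mfcc)

def codeword_histogram(frames, codebook):
    """2. Vector quantization: normalized histogram of codeword counts."""
    counts = np.bincount(codebook.predict(frames),
                         minlength=codebook.n_clusters)
    return counts / counts.sum()

def ppk(h1, h2):
    """3. Probability product kernel (rho = 1/2, i.e. Bhattacharyya)."""
    return float(np.sum(np.sqrt(h1 * h2)))

# Illustrative usage with hypothetical audio files:
paths = ["song_a.mp3", "song_b.mp3"]
bags = [delta_mfccs(p) for p in paths]
codebook = MiniBatchKMeans(n_clusters=1024).fit(np.vstack(bags))
hists = [codeword_histogram(b, codebook) for b in bags]
print(ppk(hists[0], hists[1]))  # kernel value fed into MLR downstream
```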
    • Evaluation: CAL10K • Last.fm collaborative filter [Celma, 2008]: 360K users, 186K artists • CAL10K songs [Tingle, Turnbull, & Kim, 2010]: 5.4K songs, 2K artists (after CF matching) • Evaluation: split artists into train/val/test; target rankings = top-10 most similar training artists
    • Evaluation: comparison • Gaussian mixture models + KL divergence: one 8-component, diagonal-covariance GMM per song • Auto-tags: predict 149 semantic tags from audio [Turnbull, 2008] • [Our method] VQ+MLR: 1024 codewords • Expert tags: 1053 tags from Pandora [Tingle et al., 2010]
    • Similarity learning: results [bar chart of AUC, roughly 0.65–0.95, comparing GMM (KL), Auto-tags, Auto-tags + MLR, Audio VQ, Audio VQ + MLR, Expert tags (cos), Expert tags + MLR]
    • Example playlists • The Ramones - Go Mental: Def Leppard - Promises; The Buzzcocks - Harmony In My Head; Los Lonely Boys - Roses; Wolfmother - Colossal; Judas Priest - Diamonds and Rust (live) • MLR: The Buzzcocks - Harmony In My Head; Mötley Crüe - Same Ol' Situation; The Offspring - Gotta Get Away; The Misfits - Skulls; AC/DC - Who Made Who (live)
    • Example playlists • Fats Waller - Winter Weather: Dizzy Gillespie - She's Funny That Way; Enrique Morente - Solea; Chet Atkins - In the Mood; Rachmaninov - Piano Concerto #4; Eluvium - Radio Ballet • MLR: Chet Atkins - In the Mood; Charlie Parker - What Is This Thing Called Love?; Bud Powell - Oblivion; Bob Wills & His Texas Playboys - Lyla Lou; Bob Wills & His Texas Playboys - Sittin' On Top of the World
    • Scaling up: fast retrieval [M. & Lanckriet, 2011] • Audio similarity search for a million songs? • Idea: index the data with spatial trees • 100-NN search over 900K songs: brute force 2.4s; 50% recall 0.14s (17× speedup); 20% recall 0.02s (120× speedup)
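
A minimal illustration of the spatial-tree idea using scikit-learn's BallTree as a stand-in (the talk's own tree construction may differ, and BallTree performs exact search; the recall-for-speed trade-off on the slide comes from approximate, pruned variants). Data and dimensions here are random placeholders:

```python
import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 32))   # toy stand-in for a large song database
tree = BallTree(X)                   # build the spatial index once, up front

q = rng.normal(size=(1, 32))         # query song's features
dist, idx = tree.query(q, k=100)     # 100-NN lookup without brute force
print(idx[0][:10])                   # indices of the ten nearest songs
```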
    • Similarity learning: summary• Collaborative filters provide user-centric music similarity• CF similarity can be approximated by audio features• Audio search can be done quickly at large scale
    • Playlist generation
    • Playlist generation• Goal: generate a "good" song sequence - Music auto-pilot (given context)• Many existing algorithms, but no standard evaluation• What makes one algorithm better than another?
    • Playlist evaluation 1: Human survey• Idea: generate playlists, ask for opinions• Impractical at large scale: - Huge search space - User taste, expertise can be problematic - Slow, expensive• Does not facilitate rapid evaluation and optimization
    • Playlist evaluation 2: Information retrieval• Idea: - Define "good" and "bad" playlists - Predict the next song, measure accuracy• But what makes a bad playlist?• Do users agree on good/bad?
    • A generative approach [M. & Lanckriet, 2011b]• Playlist algorithm = distribution over playlists• Don't evaluate synthetic playlists• Do evaluate the likelihood of generating real playlists
    • The playlist collection: AOTM-2011• Art of the Mix - 13 years of playlists - ~210K playlist segments - ~100K songs from MSD• Top 25 playlist categories: - Genre: Punk, Hip-hop, Reggae... - Context: Road trip, Break-up, Sleep... - Other: Mixed genre, Alternating DJ...
    • A simple playlist model: 1. Start with a set of songs 2. Select a subset (e.g., jazz songs) 3. Select a song 4. Select a new subset 5. Select a new song 6. Repeat...
    • Connecting the dots...• Random walk on a hypergraph - Vertices = songs - Edges = subsets• Edges derived from: - Audio clusters, tags, lyrics, era, popularity, CF - or combinations/intersections• Goal: optimize edge weights from example playlists
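
A toy sketch of this random walk, assuming each hyperedge is a weighted song subset: sampling alternates between picking an edge containing the current song (in proportion to its weight) and picking a new song uniformly within that edge. Edge names, contents, and weights below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hyperedges: song subsets from tags, era, audio clusters, ...
edges = {
    "tag:jazz":   {"so_what", "oblivion", "in_the_mood"},
    "era:1950s":  {"so_what", "oblivion"},
    "audio:calm": {"in_the_mood", "radio_ballet"},
}
weights = {"tag:jazz": 2.0, "era:1950s": 1.0, "audio:calm": 0.5}

def walk(start, steps):
    """Random walk: weighted edge choice, then uniform song within the edge."""
    playlist, song = [start], start
    for _ in range(steps):
        incident = [e for e in edges if song in edges[e]]
        w = np.array([weights[e] for e in incident])
        e = rng.choice(incident, p=w / w.sum())   # 4. select a new subset
        song = rng.choice(sorted(edges[e] - {song}))  # 5. select a new song
        playlist.append(song)
    return playlist

print(walk("so_what", 5))
```

Learning then reduces to adjusting the edge weights so that walks like this assign high probability to real, human-made playlists.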
    • Playlist model: an exponential prior on the edge weights, combined with the transition likelihood of the training playlists
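
One plausible reading of the model slide, reconstructed from the hypergraph random-walk description above; the notation (edge weights w, songs s_t, hyperedges e, rate λ) is assumed, not taken from the slide:

```latex
P(s_{t+1} \mid s_t; w) \;=\;
\sum_{e \,\ni\, s_t,\, s_{t+1}}
\frac{w_e}{\sum_{e' \ni s_t} w_{e'}} \cdot \frac{1}{|e| - 1},
\qquad w \sim \mathrm{Exp}(\lambda),
\qquad
\max_{w \ge 0}\ \log p(w) + \sum_{\text{playlists}} \sum_{t} \log P(s_{t+1} \mid s_t; w)
```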
    • Playlist generation: evaluation• Setup: - Split playlist collection into train/test - Learn edge weights on training playlists - Evaluate average likelihood of test playlists• Train per category, or all together• Compare against uniform shuffle baseline
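
A small sketch of the evaluation as described on this slide: average per-transition log-likelihood of held-out playlists under the model, reported as a gain over the uniform-shuffle baseline. The `transition_prob` callable is assumed to come from a fitted model like the walk above:

```python
import numpy as np

def avg_loglik(playlists, transition_prob):
    """Mean log-likelihood per transition over a playlist collection."""
    lls = [np.log(transition_prob(a, b))
           for pl in playlists for a, b in zip(pl, pl[1:])]
    return float(np.mean(lls))

def gain_over_shuffle(playlists, transition_prob, n_songs):
    """Relative log-likelihood gain vs. a uniform shuffle over n_songs."""
    model = avg_loglik(playlists, transition_prob)
    uniform = np.log(1.0 / n_songs)   # shuffle baseline, per transition
    return (model - uniform) / abs(uniform)
```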
    • Random walk results [bar chart: log-likelihood gain over random shuffle, 0%–25%, global vs. category-specific models across categories ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single artist, Romantic, Road trip, Punk, Depression, Break up, Narrative, Hip-hop, Sleep, Electronic, Dance-house, R&B, Country, Cover songs, Hardcore, Rock, Jazz, Folk, Reggae, Blues]
    • Stationary model results [bar chart: log-likelihood gain over random shuffle, -15% to 20%, same categories and global vs. category-specific comparison]
    • Example playlists, with the hyperedge used at each step • Rhythm & Blues: Lyn Collins - Think [70s & soul]; Isaac Hayes - No Name Bar [Audio #14 & funk]; Michael Jackson - My Girl [DECADE 1965 & soul] • Electronic music: Everything But The Girl - Blame [Audio #11 & downtempo]; Massive Attack - Spying Glass [DECADE 1990 & trip-hop]; Björk - Hunter [Audio #11 & electronica]
    • Playlist generation summary• Generative approach simplifies evaluation• AOTM-2011 collection facilitates learning and evaluation• Robust, efficient and transparent feature integration
    • The future
    • Directions for future work• Audio features: coding, dynamics and rhythm• Playlist models: mixtures, long-range interactions• UI models: interactive, context-aware, diversity
    • Personalized recommendation [M., Bertin-Mahieux, Ellis, & Lanckriet, 2012]• The Million Song Dataset Challenge• Listening histories for 1.1M users, 380K songs• Task: personalized song recommendation
    • Conclusion• MLR can optimize distance metrics for ranking, QBE retrieval• Audio similarity can approximate a collaborative filter• Generative playlist model integrates data, models dynamics• User-centric evaluation makes it all possible
    • Thanks!
    • Metric partial-order feature ψ • The score ⟨W, ψ(q, y)⟩ is large when distances under W match the ranking y
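
A reconstruction of the partial-order feature this backup slide refers to, assuming the standard MLR formulation (relevant set X_q^+, irrelevant set X_q^-, and y_ij = ±1 according to whether the ranking y places i before j); the exact signs and normalization on the slide may differ:

```latex
\psi(q, y) \;=\;
\sum_{i \in \mathcal{X}_q^+} \sum_{j \in \mathcal{X}_q^-}
y_{ij}\,
\frac{\phi(q, x_i) - \phi(q, x_j)}{|\mathcal{X}_q^+|\,|\mathcal{X}_q^-|},
\qquad
\phi(q, x) \;=\; -(q - x)(q - x)^{\mathsf{T}}
```

Since ⟨W, φ(q, x)⟩ = -d_W(q, x), the score ⟨W, ψ(q, y)⟩ averages y_ij (d_W(q, x_j) - d_W(q, x_i)), which is large exactly when relevant songs sit closer to the query than irrelevant ones in the order y prescribes.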
    • Playlist weights: 6390 edges [heatmap of learned edge weights, categories (ALL, Mixed, Theme, Rock-pop, Alternating DJ, Indie, Single Artist, Romantic, Road Trip, Punk, Depression, Break Up, Narrative, Hip-hop, Sleep, Electronic music, Dance-house, Rhythm and Blues, Country, Cover, Hardcore, Rock, Jazz, Folk, Reggae, Blues) × edge sources (Audio, CF, Era, Familiarity, Lyrics, Tags, Uniform)] • Audio & CF: k-means (16/64/256) • Lyrics: LDA (k=32, top-1/3/5) • Era: year, decade, decade+5 • Tags: Last.fm top-10 • Familiarity: high/med/low • Conjunctions