Improving perceptual tempo estimation with crowd-sourced annotations
Mark Levy, 26 October 2011
Tempo EstimationTerminology: tempo = beats per minute = bpm
Tempo EstimationUse crowd-sourcing: quantify influence of metrical ambiguity  on tempo perception improve evaluation im...
Perceived TempoMetrical ambiguity: listeners don’t agree about bpm typically in two camps perceived values differ by fa...
Perceived Tempo            Metrical ambiguity:listeners                                  listeners                     bpm...
Machine-Estimated TempoAlso affected by metrical ambiguity: makes estimation difficult natural to see multiple bpm value...
Crowd SourcingWeb-based questionnaire: capture label choices capture bpm from mean tapping interval capture comparative...
Crowd Sourcing
Crowd Sourcing Music:  over 4000 songs  30-second clips• rock, country, pop, soul, funk and rnb, jazz,   latin, reggae, ...
ResponseFirst week (reported/released): 4k tracks annotated by 2k listeners 20k labels and bpm estimatesTo date: 6k tra...
Analysis: ambiguityWhen people tap to a song at different bpm do they really disagree about whether it’s  slow or fast?In...
Analysis: ambiguitySubset of slow/fast songs: labelled by at least five listeners majority label “slow” or “fast”
Analysis: ambiguitybpm vs speed labelall estimates for slow/fast songs
Analysis: ambiguitybpm vs speed label            people can tap slowly to fast songsall estimates for slow/fast songs
Analysis: ambiguityLabels for fast songs from slow-tappers
Analysis: ambiguityQuantify disagreement over labels: model conflict, extremity of tempo conflict coefficient           ...
Analysis: ambiguityDistribution of conflict coefficient C            C > 0 means slow and fastall songs with at least five...
Analysis: ambiguitySubset of metrically ambiguous songs: at least 30% of listeners tap at half/twice the  majority estima...
Evaluation metricsMIREX: capture metrical ambiguity replicate human disagreementAmbiguity considered unhelpful: automat...
Evaluation metricsApplication-oriented : compare with majority* human estimate    (*median in most popular bin)   catego...
Analysis: evaluationSources: BPM List (DJ kit, human-moderated)    Donny Brusca, 7th edition, 2011   EchoNest/MSD (close...
Analysis: machine vs human    80%    70%    60%    50%                                               BPM List    40%      ...
Analysis: controlled testControlled comparison: exploit experience from website A/B testing use this to improve algorith...
Analysis: controlled testWhen visitor arrives at the page: choose a source S at random choose a bpm value at random cho...
Analysis: controlled testNull Hypothesis: there will be presentation effects listeners will attend to subtle differences...
Analysis: controlled test     100%     90%     80%     70%     60%     50%                                 different     4...
Analysis: improving estimatesAdjust bpm based on class: imagine an accurate slow/fast classifier       Hockmann and Fujin...
Analysis: adjusted vs human    80%    70%    60%    50%                                               BPM List    40%     ...
ConclusionsCrowd sourcing: gather thousands of data points in a few  days, half a million over time humans agree over sl...
Thanks!mark@last.fm      @gamboviolhttp://mir-in-action.blogspot.comhttp://playground.last.fm/demo/speedohttp://users.last...

Crowd sourcing for tempo estimation

Slides for the presentation at ISMIR 2011 of the paper "Improving perceptual tempo estimation with crowd-sourced annotations".


  1. Improving perceptual tempo estimation with crowd-sourced annotations
     Mark Levy, 26 October 2011
  2. Tempo Estimation
     Terminology: tempo = beats per minute = bpm
  3. Tempo Estimation
     Use crowd-sourcing:
     • quantify influence of metrical ambiguity on tempo perception
     • improve evaluation
     • improve algorithms
  4. Perceived Tempo
     Metrical ambiguity:
     • listeners don’t agree about bpm
     • typically in two camps
     • perceived values differ by factor of 2 or 3
     McKinney and Moelants:
     • 24-40 subjects
     • released experimental data
  5. Perceived Tempo
     Metrical ambiguity:
     [Two plots with axes "listeners" and "bpm"; McKinney and Moelants, 2004]
  6. Machine-Estimated Tempo
     Also affected by metrical ambiguity:
     • makes estimation difficult
     • natural to see multiple bpm values
     • estimated values often out by factor of 2 or 3 (“octave error”)
  7. Crowd Sourcing
     Web-based questionnaire:
     • capture label choices
     • capture bpm from mean tapping interval
     • capture comparative judgements
  8. Crowd Sourcing
  9. Crowd Sourcing
     Music:
     • over 4000 songs
     • 30-second clips
     • rock, country, pop, soul, funk and rnb, jazz, latin, reggae, disco, rap, punk, electronic, trance, industrial, house, folk, ...
     • recent releases back to 60s
  10. Response
      First week (reported/released):
      • 4k tracks annotated by 2k listeners
      • 20k labels and bpm estimates
      To date:
      • 6k tracks annotated by 27k listeners
      • 200k labels and bpm estimates
  11. Analysis: ambiguity
      When people tap to a song at different bpm, do they really disagree about whether it’s slow or fast?
      Investigation:
      • inspect labels from people who tap differently
      • quantify disagreement for ambiguous songs
  12. Analysis: ambiguity
      Subset of slow/fast songs:
      • labelled by at least five listeners
      • majority label “slow” or “fast”
  13. Analysis: ambiguity
      [Plot: bpm vs speed label, all estimates for slow/fast songs]
  14. Analysis: ambiguity
      [Plot: bpm vs speed label, all estimates for slow/fast songs]
      People can tap slowly to fast songs
  15. Analysis: ambiguity
      Labels for fast songs from slow-tappers
  16. Analysis: ambiguity
      Quantify disagreement over labels:
      • model conflict, extremity of tempo
      • conflict coefficient
        $$C = \frac{\min(L_s, L_f)}{\max(L_s, L_f)} \cdot \frac{L_s + L_f}{L}$$
      $L_s$, $L_f$, $L$: number of slow, fast, all labels for a song
  17. Analysis: ambiguity
      [Plot: distribution of conflict coefficient C, all songs with at least five labels]
      C > 0 means slow and fast
  18. Analysis: ambiguity
      Subset of metrically ambiguous songs:
      • at least 30% of listeners tap at half/twice the majority estimate
      Compared to the rest:
      • no significant difference in C
  19. Evaluation metrics
      MIREX:
      • capture metrical ambiguity
      • replicate human disagreement
      Ambiguity considered unhelpful:
      • automatic playlisting
      • DJ tools, production tools
      • jogging
  20. Evaluation metrics
      Application-oriented:
      • compare with majority* human estimate (*median in most popular bin)
      • categorise machine estimates:
        • same as humans
        • twice as fast
        • twice as slow
        • three times as fast
        • and so on
        • unrelated to humans
  21. Analysis: evaluation
      Sources:
      • BPM List (DJ kit, human-moderated): Donny Brusca, 7th edition, 2011
      • EchoNest/MSD (closed-source algorithm): maybe Jehan et al,?
      • VAMP (open-source algorithm): Davies and Landone, 2007-
  22. Analysis: machine vs human
      [Bar chart: percentage of estimates (0% to 80%) in the categories x2, same, /2, unrelated, other, for BPM List, VAMP and EchoNest]
  23. Analysis: controlled test
      Controlled comparison:
      • exploit experience from website A/B testing
      • use this to improve algorithm iteratively
      Result is independent of any quality metric
  24. Analysis: controlled test
      When visitor arrives at the page:
      • choose a source S at random
      • choose a bpm value at random
      • choose two songs given that value by S
      • display them together
      Then ask which sounds faster!
  25. Analysis: controlled test
      Null Hypothesis:
      • there will be presentation effects
      • listeners will attend to subtle differences
      but these effects are independent of the source of bpm estimates if the quality of the sources is the same
  26. Analysis: controlled test
      [Bar chart: percentage (0% to 100%) of "different" vs "same" for BPM List, VAMP and EchoNest]
  27. Analysis: improving estimates
      Adjust bpm based on class:
      • imagine an accurate slow/fast classifier (Hockman and Fujinaga, 2010)
      • adjust as follows:
          bpm := bpm/2 if slow and bpm > 100
          bpm := bpm*2 if fast and bpm < 100
          otherwise don’t adjust
      • simulation: accept majority human label
  28. Analysis: adjusted vs human
      [Bar chart: percentage of estimates (0% to 80%) in the categories x2, same, /2, unrelated, other, for BPM List, VAMP and EchoNest]
  29. Conclusions
      Crowd sourcing:
      • gather thousands of data points in a few days, half a million over time
      • humans agree over slow/fast labels, even when they tap at different bpm
      Improving machine estimates:
      • use controlled testing
      • exploit a slow/fast classifier
  30. Thanks!
      mark@last.fm
      @gamboviol
      http://mir-in-action.blogspot.com
      http://playground.last.fm/demo/speedo
      http://users.last.fm/~mark/speedo.tgz
      We are looking for interns/research fellows!
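
Slide 7 mentions capturing bpm from the mean tapping interval. A minimal sketch of that conversion follows, assuming tap times are recorded in seconds; the function name `bpm_from_taps` and the input format are illustrative, not taken from the slides.

```python
from statistics import mean


def bpm_from_taps(tap_times):
    """Estimate bpm from tap timestamps given in seconds (assumed format)."""
    if len(tap_times) < 2:
        raise ValueError("need at least two taps to measure an interval")
    ordered = sorted(tap_times)
    # Intervals between consecutive taps, in seconds.
    intervals = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # 60 seconds per minute divided by the mean seconds per beat.
    return 60.0 / mean(intervals)


print(bpm_from_taps([0.0, 0.5, 1.0, 1.5]))
```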
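
The conflict coefficient on slide 16 is garbled in this transcript. The sketch below assumes it reads C = min(Ls, Lf) / max(Ls, Lf) × (Ls + Lf) / L, where the first factor models conflict and the second models how many of a song's labels are at the slow/fast extremes. Returning 0 for a song with no slow or fast labels is an added assumption, made to avoid dividing by zero.

```python
def conflict_coefficient(n_slow, n_fast, n_total):
    """Conflict coefficient C for one song, under the assumed formula.

    n_slow, n_fast, n_total: number of slow, fast and all labels for the song.
    """
    if n_total <= 0:
        raise ValueError("a song needs at least one label")
    if n_slow < 0 or n_fast < 0 or n_slow + n_fast > n_total:
        raise ValueError("label counts are inconsistent")
    larger = max(n_slow, n_fast)
    if larger == 0:
        # Assumption: no slow or fast labels means no conflict.
        return 0.0
    conflict = min(n_slow, n_fast) / larger
    extremity = (n_slow + n_fast) / n_total
    return conflict * extremity


print(conflict_coefficient(2, 4, 8))
```

Under this reading, C is positive only when a song has both slow and fast labels, which matches the note on slide 17.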
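
The application-oriented metric on slide 20 can be sketched roughly as below. The slides do not state a bin width or a matching tolerance, so `bin_width=10.0` and `tolerance=0.04` are hypothetical values chosen only for illustration, as are the function names and category labels.

```python
from collections import Counter
from statistics import median


def majority_bpm(estimates, bin_width=10.0):
    """Median of the human estimates that fall in the most popular bpm bin."""
    if not estimates:
        raise ValueError("no human estimates")
    bin_counts = Counter(int(e // bin_width) for e in estimates)
    top_bin, _ = bin_counts.most_common(1)[0]
    return median(e for e in estimates if int(e // bin_width) == top_bin)


# Candidate relationships between machine and human tempo.
RATIOS = [("same", 1.0), ("x2", 2.0), ("/2", 0.5), ("x3", 3.0), ("/3", 1.0 / 3.0)]


def categorise(machine_bpm, human_bpm, tolerance=0.04):
    """Label a machine estimate by its ratio to the majority human estimate."""
    ratio = machine_bpm / human_bpm
    for label, factor in RATIOS:
        # Relative tolerance around each candidate ratio (hypothetical value).
        if abs(ratio - factor) <= tolerance * factor:
            return label
    return "unrelated"


human = majority_bpm([120, 121, 122, 60, 61])
print(human, categorise(241, human))
```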
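
The trial selection for the controlled test on slide 24 might look something like this sketch. The data layout (`{source: {song_id: bpm}}`), the rounding of bpm to whole numbers so that two songs can share a value, and the name `pick_trial` are all assumptions; the slides do not describe how the site implemented it.

```python
import random


def pick_trial(estimates, rng):
    """Pick (source, bpm, song_a, song_b) for one visitor.

    estimates: {source_name: {song_id: bpm}} (assumed layout).
    """
    # Choose a source S at random.
    source = rng.choice(sorted(estimates))
    # Group this source's songs by rounded bpm (rounding is an assumption).
    songs_by_bpm = {}
    for song, bpm in estimates[source].items():
        songs_by_bpm.setdefault(round(bpm), []).append(song)
    # Only bpm values shared by at least two songs can form a pair.
    candidates = sorted(b for b, songs in songs_by_bpm.items() if len(songs) >= 2)
    if not candidates:
        raise ValueError(f"{source} has no bpm value shared by two songs")
    # Choose a bpm value at random, then two songs given that value by S.
    bpm = rng.choice(candidates)
    song_a, song_b = rng.sample(songs_by_bpm[bpm], 2)
    return source, bpm, song_a, song_b


demo = {"VAMP": {"a": 120.2, "b": 119.8, "c": 90.0}}
print(pick_trial(demo, random.Random(0)))
```

The listener would then be shown the two songs together and asked which sounds faster.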
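
The adjustment rule on slide 27 translates directly into code. In this sketch the label is assumed to be a plain string; in the simulation described on the slide it would be the majority human label rather than the output of a trained classifier.

```python
def adjust_bpm(bpm, label):
    """Halve or double a machine bpm estimate based on a slow/fast label."""
    if label == "slow" and bpm > 100:
        return bpm / 2
    if label == "fast" and bpm < 100:
        return bpm * 2
    # Otherwise don't adjust (including any label other than slow/fast).
    return bpm


print(adjust_bpm(140, "slow"), adjust_bpm(70, "fast"))
```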