Crowd sourcing for tempo estimation
Slides for the presentation at ISMIR 2011 of the paper "Improving perceptual tempo estimation with crowd-sourced annotations".



Crowd sourcing for tempo estimation Presentation Transcript

  • 1. Improving perceptual tempo estimation with crowd-sourced annotations. Mark Levy, 26 October 2011.
  • 2. Tempo Estimation. Terminology: tempo = beats per minute = bpm.
  • 3. Tempo Estimation. Use crowd-sourcing to: quantify the influence of metrical ambiguity on tempo perception; improve evaluation; improve algorithms.
  • 4. Perceived Tempo. Metrical ambiguity: listeners don't agree about bpm; they typically fall into two camps; perceived values differ by a factor of 2 or 3. McKinney and Moelants: 24-40 subjects; released experimental data.
  • 5. Perceived Tempo. Metrical ambiguity: [histograms of number of listeners vs bpm, from McKinney and Moelants, 2004]
  • 6. Machine-Estimated Tempo. Also affected by metrical ambiguity: it makes estimation difficult; it is natural for an algorithm to see multiple bpm values; estimated values are often out by a factor of 2 or 3 ("octave error").
  • 7. Crowd Sourcing. Web-based questionnaire: capture label choices; capture bpm from the mean tapping interval; capture comparative judgements.
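Converting a listener's taps into a bpm estimate, as the questionnaire does from the mean tapping interval, can be sketched as follows. This is an illustrative reconstruction; the function name and exact handling of edge cases are assumptions, not the original implementation.

```python
def bpm_from_taps(tap_times_sec):
    """Estimate bpm as 60 / mean inter-tap interval (in seconds)."""
    if len(tap_times_sec) < 2:
        raise ValueError("need at least two taps to form an interval")
    # successive differences between tap timestamps
    intervals = [b - a for a, b in zip(tap_times_sec, tap_times_sec[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval

# Taps every 0.5 s correspond to 120 bpm:
print(bpm_from_taps([0.0, 0.5, 1.0, 1.5, 2.0]))  # 120.0
```

A listener tapping at half this rate (one tap per second) would yield 60 bpm, which is exactly the factor-of-2 ambiguity discussed on the earlier slides.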
  • 8. Crowd Sourcing
  • 9. Crowd Sourcing. Music: over 4,000 songs; 30-second clips; rock, country, pop, soul, funk and rnb, jazz, latin, reggae, disco, rap, punk, electronic, trance, industrial, house, folk, ...; recent releases back to the 60s.
  • 10. Response. First week (as reported/released): 4k tracks annotated by 2k listeners; 20k labels and bpm estimates. To date: 6k tracks annotated by 27k listeners; 200k labels and bpm estimates.
  • 11. Analysis: ambiguity. When people tap to a song at different bpm, do they really disagree about whether it's slow or fast? Investigation: inspect labels from people who tap differently; quantify disagreement for ambiguous songs.
  • 12. Analysis: ambiguity. Subset of slow/fast songs: labelled by at least five listeners; majority label "slow" or "fast".
  • 13. Analysis: ambiguity. [Plot: bpm vs speed label; all estimates for slow/fast songs]
  • 14. Analysis: ambiguity. [Plot: bpm vs speed label, showing that people can tap slowly to fast songs; all estimates for slow/fast songs]
  • 15. Analysis: ambiguity. [Plot: labels for fast songs from slow-tappers]
  • 16. Analysis: ambiguity. Quantify disagreement over labels, modelling both the conflict and the extremity of the tempo conflict. Conflict coefficient: C = min(Ls, Lf)/max(Ls, Lf) × (Ls + Lf)/L, where Ls, Lf and L are the number of slow, fast and all labels for a song.
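The conflict coefficient can be computed directly from the label counts. Note the formula here is a reconstruction of the garbled slide layout (the ratio min(Ls, Lf)/max(Ls, Lf) scaled by the fraction of slow/fast labels (Ls + Lf)/L), so treat the exact form as an assumption; the function name is illustrative.

```python
def conflict_coefficient(n_slow, n_fast, n_total):
    """C = min(Ls, Lf)/max(Ls, Lf) * (Ls + Lf)/L (assumed reading of the slide).

    C = 0 when all slow/fast labels agree; C grows towards 1 as the
    slow and fast camps approach equal size and dominate the labels.
    """
    if max(n_slow, n_fast) == 0:
        return 0.0  # no slow/fast labels at all: no conflict
    balance = min(n_slow, n_fast) / max(n_slow, n_fast)
    coverage = (n_slow + n_fast) / n_total
    return balance * coverage

print(conflict_coefficient(3, 3, 6))  # 1.0  (maximal conflict)
print(conflict_coefficient(0, 5, 5))  # 0.0  (unanimous)
```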
  • 17. Analysis: ambiguity. Distribution of the conflict coefficient C; C > 0 means a song receives both slow and fast labels. [Plot: all songs with at least five labels]
  • 18. Analysis: ambiguity. Subset of metrically ambiguous songs: at least 30% of listeners tap at half/twice the majority estimate. Compared to the rest: no significant difference in C.
  • 19. Evaluation metrics. MIREX: capture metrical ambiguity; replicate human disagreement. But ambiguity is considered unhelpful in applications: automatic playlisting; DJ tools, production tools; jogging.
  • 20. Evaluation metrics. Application-oriented: compare with the majority* human estimate (*median in the most popular bin); categorise machine estimates as: same as humans; twice as fast; twice as slow; three times as fast; and so on; unrelated to humans.
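The categorisation step above can be sketched as a comparison of the machine estimate against simple multiples of the majority human estimate. The set of ratios and the 4% tolerance are illustrative assumptions; the slides do not state the exact matching threshold.

```python
def categorise_estimate(machine_bpm, human_bpm, tol=0.04):
    """Bucket a machine bpm estimate relative to the majority human estimate.

    tol is the relative tolerance for calling two bpm values equal
    (4% here is an assumed value, not from the original paper).
    """
    ratios = {"same": 1.0, "x2": 2.0, "/2": 0.5, "x3": 3.0, "/3": 1.0 / 3.0}
    for name, r in ratios.items():
        target = r * human_bpm
        if abs(machine_bpm - target) <= tol * target:
            return name
    return "unrelated"

print(categorise_estimate(240, 120))  # x2  (octave error)
print(categorise_estimate(100, 137))  # unrelated
```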
  • 21. Analysis: evaluation. Sources: BPM List (DJ kit, human-moderated), Donny Brusca, 7th edition, 2011; EchoNest/MSD (closed-source algorithm), maybe Jehan et al.?; VAMP (open-source algorithm), Davies and Landone, 2007-.
  • 22. Analysis: machine vs human. [Bar chart: proportion of estimates per category (x2, same, /2, unrelated, other) for BPM List, VAMP and EchoNest; y-axis 0-80%]
  • 23. Analysis: controlled test. Controlled comparison: exploit experience from website A/B testing; use this to improve the algorithm iteratively. The result is independent of any quality metric.
  • 24. Analysis: controlled test. When a visitor arrives at the page: choose a source S at random; choose a bpm value at random; choose two songs given that value by S; display them together. Then ask which sounds faster!
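The pairing procedure on this slide can be sketched as follows. The data layout (a nested dict of sources to bpm values to song ids) and the function name are illustrative assumptions; this only shows the random-selection logic, not the original website code.

```python
import random

def make_test_pair(estimates_by_source, rng=random):
    """Pick a source S, a bpm value, then two songs that S gave that bpm.

    estimates_by_source: {source_name: {bpm_value: [song_ids]}}
    Returns (source, bpm, (song_a, song_b)).
    """
    source = rng.choice(sorted(estimates_by_source))
    # only bpm values with at least two songs can yield a comparison pair
    candidates = {bpm: songs
                  for bpm, songs in estimates_by_source[source].items()
                  if len(songs) >= 2}
    bpm = rng.choice(sorted(candidates))
    song_a, song_b = rng.sample(candidates[bpm], 2)
    return source, bpm, (song_a, song_b)
```

Because both clips share one estimated bpm from one source, the "which sounds faster?" question directly probes that source's consistency, as the null hypothesis on the next slide assumes.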
  • 25. Analysis: controlled test. Null hypothesis: there will be presentation effects, and listeners will attend to subtle differences, but these effects are independent of the source of the bpm estimates if the quality of the sources is the same.
  • 26. Analysis: controlled test. [Bar chart: proportion of "different" vs "same" judgements for BPM List, VAMP and EchoNest; y-axis 0-100%]
  • 27. Analysis: improving estimates. Adjust bpm based on class: imagine an accurate slow/fast classifier (Hockmann and Fujinaga, 2010) and adjust as follows: bpm := bpm/2 if slow and bpm > 100; bpm := bpm*2 if fast and bpm < 100; otherwise don't adjust. Simulation: accept the majority human label.
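The adjustment rule from this slide is small enough to write out directly; this is a minimal sketch of exactly the stated rule (the function name is illustrative).

```python
def adjust_bpm(bpm, label):
    """Halve or double a machine bpm estimate using a slow/fast label.

    Rule from the slide: slow songs should not be above 100 bpm,
    fast songs should not be below it; otherwise leave the estimate alone.
    """
    if label == "slow" and bpm > 100:
        return bpm / 2
    if label == "fast" and bpm < 100:
        return bpm * 2
    return bpm

print(adjust_bpm(160, "slow"))  # 80.0  (octave error corrected downwards)
print(adjust_bpm(70, "fast"))   # 140   (octave error corrected upwards)
print(adjust_bpm(90, "slow"))   # 90    (already consistent, unchanged)
```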
  • 28. Analysis: adjusted vs human. [Bar chart: proportion of adjusted estimates per category (x2, same, /2, unrelated, other) for BPM List, VAMP and EchoNest; y-axis 0-80%]
  • 29. Conclusions. Crowd sourcing: gather thousands of data points in a few days, half a million over time; humans agree over slow/fast labels, even when they tap at different bpm. Improving machine estimates: use controlled testing; exploit a slow/fast classifier.
  • 30. Thanks! mark@last.fm, @gamboviol, http://mir-in-action.blogspot.com, http://playground.last.fm/demo/speedo, http://users.last.fm/~mark/speedo.tgz. We are looking for interns/research fellows!