Beat tracking by dynamic programming

Published in: Education
1. Beat Tracking by Dynamic Programming
   • 이장우
2. Beat Tracking
   • Deriving from a music audio signal a sequence of beat instants that might correspond to when a human listener would tap his foot
   • Two constraints:
      • Correspond to moments in the audio where a beat is indicated
      • Reflect a locally constant inter-beat interval
3. Beat Tracking by Dynamic Programming
   • The two constraints map neatly onto the two constraints optimized in dynamic programming:
      • The local match
      • The transition cost
4. The Beat Tracking System
   • Estimates a global tempo
   • Uses the tempo to construct a transition cost function
   • Uses dynamic programming to find the best-scoring set of beat times
5. Accuracy
   • 55.2% on the MIREX-06 training data
   • The impact of the assumption of a fixed target tempo
   • Typically able to track tempo changes in a range of ±10% of the target tempo
6. Objective Function to Get Onsets & Rhythmic Pattern
   • Assume that we have a constant target tempo
   • {t_i} = the sequence of N beat instants found by the tracker
   • O(t) = an “onset strength envelope” derived from the audio
   • α = a weighting to balance the importance of the two terms
   • F(Δt, τ_p) = a function that measures the consistency between an inter-beat interval Δt and the ideal beat spacing τ_p defined by the target tempo
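The equation itself did not survive in the slide text. One form consistent with the definitions above, writing Δt for the inter-beat interval and τ_p for the ideal beat spacing, sums the onset strength at every beat and adds the weighted consistency of every interval. The squared-log shape of F shown here is an assumption taken from the published paper of the same title, not from the slide:

```latex
C(\{t_i\}) = \sum_{i=1}^{N} O(t_i) + \alpha \sum_{i=2}^{N} F(t_i - t_{i-1}, \tau_p),
\qquad
F(\Delta t, \tau) = -\left( \log \frac{\Delta t}{\tau} \right)^{2}
```

With this F, an interval equal to the ideal spacing costs nothing, and intervals that are too long or too short by the same ratio are penalized equally.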
7. Objective Function to Get Onsets & Rhythmic Pattern
   • Thanks to dynamic programming, we can effectively search the entire exponentially-sized set of all possible beat-time sequences in a linear-time operation
8. With Matlab, it can be simple
9. Two loops (forward calculation and backtrace) in about 10 lines of code
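The forward calculation and backtrace can be sketched in Python rather than Matlab. This is a minimal illustration, not the presenter's code: the name `track_beats`, the search window of half to twice the target period, the squared-log transition cost, and the rule that a sequence may start at any frame with zero accumulated score are all assumptions.

```python
import math


def track_beats(onset, period, alpha):
    """Return beat frame indices that maximize onset strength plus tempo consistency.

    onset  -- onset strength envelope, one value per frame
    period -- target inter-beat interval in frames (from the tempo estimate)
    alpha  -- weight of the tempo-consistency term
    """
    n = len(onset)
    if n == 0:
        return []
    cumscore = list(onset)   # best total score of any beat sequence ending at t
    backlink = [-1] * n      # previous beat on that best sequence (-1 = none)

    # Assumed search window: the previous beat lies 0.5 to 2 periods back.
    nearest = max(1, round(period / 2))
    farthest = max(nearest, round(2 * period))
    # Assumed transition cost: squared log-ratio of actual to ideal interval.
    txcost = {d: -alpha * math.log(d / period) ** 2
              for d in range(nearest, farthest + 1)}

    # Loop 1: forward calculation of the best score ending at each frame.
    for t in range(n):
        best, prev = 0.0, -1  # option of starting a new sequence at t
        for d in range(nearest, min(farthest, t) + 1):
            cand = cumscore[t - d] + txcost[d]
            if cand > best:
                best, prev = cand, t - d
        cumscore[t] = onset[t] + best
        backlink[t] = prev

    # Loop 2: backtrace from the best-scoring final beat.
    t = max(range(n), key=cumscore.__getitem__)
    beats = []
    while t >= 0:
        beats.append(t)
        t = backlink[t]
    return beats[::-1]
```

Each frame only inspects a window of fixed width, so the work grows linearly with the length of the envelope. On an envelope with unit impulses every 10 frames, calling `track_beats(onset, 10, 100.0)` returns the impulse positions.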
10. The Beat Tracking System
   • The front-end processing converts the input audio into the onset strength envelope
   • The global tempo estimation provides the target inter-beat interval
11. Onset Strength Envelope
   • Resample the input sound to 8 kHz
   • Calculate the STFT magnitude using 32 ms windows and a 4 ms advance between frames
   • Convert to an approximate auditory representation by mapping to 40 Mel bands via a weighted summing of the spectrogram
   • The Mel spectrogram is converted to dB, and the first-order difference along time is calculated in each band
12. Onset Strength Envelope
   • Negative values are set to zero (half-wave rectification)
   • The remaining positive differences are summed across all frequency bands
   • This signal is passed through a high-pass filter with a cutoff around 0.4 Hz to make it locally zero-mean, and smoothed by convolving with a Gaussian envelope about 20 ms wide
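The difference, rectification, and summation steps can be sketched in plain Python. The sketch assumes the Mel spectrogram in dB is already available as a list of frames (resampling, STFT, and Mel mapping are omitted), and it leaves out the final high-pass filtering and Gaussian smoothing. The name `onset_strength` is hypothetical.

```python
def onset_strength(mel_db):
    """Raw onset strength from a Mel spectrogram in dB.

    mel_db -- list of frames, each a list of per-band levels in dB
    Returns one value per frame transition (len(mel_db) - 1 values).
    """
    envelope = []
    for prev, cur in zip(mel_db, mel_db[1:]):
        # First-order difference along time in each band; negative
        # differences are clipped to zero (half-wave rectification),
        # then the remaining increases are summed across bands.
        envelope.append(sum(max(0.0, c - p) for p, c in zip(prev, cur)))
    return envelope
```

Only increases in band energy contribute, which is why decaying notes do not register as onsets.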
13. Onset Strength Envelope
14. Global Tempo Estimate
   • The dynamic programming formulation was based on prior knowledge of a target tempo
   • Now, how that tempo is estimated
   • It is difficult to choose a single best peak among many correlation peaks of comparable magnitude
   • Humans have a bias towards 120 BPM, so we use this bias to scale the peaks, then interpret the scaled peaks
15. Tempo Period Strength
   • W(τ) = a Gaussian weighting function on a log-time axis
   • τ0 = the center of the tempo period bias
   • στ = controls the width of the weighting curve
   • The primary tempo period estimate is the τ for which TPS(τ) is largest
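The tempo period strength can be read as the autocorrelation of the onset envelope multiplied by the weighting window. The formula is not in the slide text, so the sketch below assumes TPS(τ) = W(τ) · Σ_t O(t) O(t − τ) with W(τ) = exp(−½ (log2(τ/τ0) / στ)²). Lags are measured in frames, and the names `tempo_period_strength` and `primary_period` are hypothetical.

```python
import math


def tempo_period_strength(onset, tau0, sigma):
    """Weighted autocorrelation of the onset envelope.

    onset -- onset strength envelope, one value per frame
    tau0  -- center of the tempo period bias, in frames
    sigma -- width of the weighting curve, in octaves
    Returns a list tps where tps[lag] is the strength at that lag
    (tps[0] is an unused placeholder).
    """
    n = len(onset)
    tps = [0.0]
    for lag in range(1, n // 2 + 1):
        # Raw autocorrelation of the envelope at this lag.
        autocorr = sum(onset[t] * onset[t - lag] for t in range(lag, n))
        # Gaussian window on a log-time axis, centered on tau0.
        weight = math.exp(-0.5 * (math.log2(lag / tau0) / sigma) ** 2)
        tps.append(weight * autocorr)
    return tps


def primary_period(tps):
    """Lag (in frames) at which the tempo period strength is largest."""
    return max(range(1, len(tps)), key=tps.__getitem__)
```

The window is what breaks ties between the peaks at one, two, or three beat periods: a bias of 120 BPM corresponds to centering τ0 on the frame count of 0.5 s.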
16. Tempo Period Strength
   • To set τ0 and στ, we used the MIREX-06 Beat Tracking training data, which contains the actual tapping instants
   • The subject tapping data could be clustered into two groups, corresponding to slower and faster levels of the metrical hierarchy of the music, which were separated by a ratio of 2 or 3
17. Tempo Calculation
   • To set τ0 and στ, we used the MIREX-06 Beat Tracking training data, which contains the actual tapping instants
   • The subject tapping data could be clustered into two groups, corresponding to slower and faster levels of the metrical hierarchy of the music
18. Tempo Calculation
   • Whichever sequence contains the larger value determines whether the tempo is considered duple or triple, respectively
   • The location of the largest value is treated as the faster target tempo, with 1/2 or 1/3 of that tempo, respectively, as the adjacent metrical level
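The duple/triple decision can be sketched as follows. The slide text does not say how the two sequences are built, so the sketch assumes that each one adds to TPS(τ) weighted copies of TPS near 2τ (duple) or 3τ (triple). The weights 0.5, 0.25, and 0.33 are assumptions here, and `pick_tempo` is a hypothetical name.

```python
def pick_tempo(tps):
    """Choose duple or triple meter and the two adjacent beat periods.

    tps -- tps[lag] is the tempo period strength at each lag in frames
           (tps[0] unused; needs at least one real lag)
    Returns (meter, faster_period, slower_period), periods in frames.
    """
    def at(lag):
        # Strength at a lag, treating lags outside the table as zero.
        return tps[lag] if 0 < lag < len(tps) else 0.0

    lags = range(1, len(tps))
    # Assumed combination: strength at the lag plus strength around 2x or 3x it.
    tps2 = {lag: at(lag) + 0.5 * at(2 * lag)
                 + 0.25 * at(2 * lag - 1) + 0.25 * at(2 * lag + 1)
            for lag in lags}
    tps3 = {lag: at(lag) + 0.33 * (at(3 * lag) + at(3 * lag - 1) + at(3 * lag + 1))
            for lag in lags}

    best2 = max(lags, key=tps2.get)
    best3 = max(lags, key=tps3.get)
    # The sequence holding the larger peak decides the meter; the slower
    # level has 1/2 or 1/3 of the tempo, i.e. 2x or 3x the period.
    if tps2[best2] >= tps3[best3]:
        return "duple", best2, 2 * best2
    return "triple", best3, 3 * best3
```

A halved or thirded tempo corresponds to a doubled or tripled period, which is why the slower level is returned as `2 * best2` or `3 * best3`.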
19. Tempo Estimation Results
   • The tempo estimation system was evaluated within the MIREX-06 Tempo Extraction contest
   • The original tempo extraction algorithm, which takes the global maximum of TPS, scored 35.7% on “accuracy1” and 74.4% on “accuracy2”
   • The modified tempo extraction algorithm, which takes the maximum of TPS2 or TPS3, scored 45.8% and 80.6%
20. Beat Tracking Results
   • N = the number of ground-truth records used in scoring
   • L_{G,i} = the number of beats in ground-truth sequence i
   • L_{A,i} = the number of beats found by the algorithm relating to that sequence
   • τ_{G,i} = the overall tempo of the ground-truth sequence
   • t_{G,i,k} = the time of the k-th beat in the i-th ground-truth sequence
   • t_{A,i,j} = the time of the j-th beat in the algorithm’s corresponding beat-time sequence
21. Beat Tracking Results
   • Variation of beat tracker score against the 20 MIREX-06 Beat Tracking training examples as a function of α, the objective function balance factor
   • The best score over the entire test set is an accuracy of 58.8% for α = 680
22. Limitations
   • Non-constant tempos
      • Slowly varying tempos
      • Abrupt changes
   • Finding the end
      • “Filling in” applies equally well to any silent (or non-rhythmic) periods at the start or end of a recording
23. Conclusion
   • Despite its limitations, the simplicity and efficiency of the dynamic programming approach to beat tracking make it an attractive default choice for general music applications
24. No questions will be taken.