Melody Detection in Polyphonic Audio

  1. Rui Pedro Paiva, Melody Detection in Polyphonic Audio. 9º Encontro da APEA, October 20, 2007. Centro de Informática e Sistemas da Universidade de Coimbra.
  2. Outline
       • Introduction
       • Overview
       • Multi-Pitch Detection
       • From Pitches to Notes
       • Identification of Melodic Notes
       • Evaluation Procedures
       • Overall Results
       • Conclusions and Future Work
  3. Introduction
     Motivation and Objectives
       • Goal
           - Extract a symbolic representation of the melody from a polyphonic musical audio signal
       • Broad range of applications
           - MIR, music education, plagiarism detection, metadata
       • No general-purpose, robust, accurate solution developed so far
     Overall Approach
       • Focus on melody, no matter what other sources are present (figure-ground separation)
  4. Introduction: Melody Definition
       • Subjective concept
           - Different people may define and perceive melody in different ways
           - Proposals addressing cultural, perceptual, emotional or musicological facets of melody
       • Context of our work
           - “Melody is the individual dominant pitched line in a musical ensemble”
  5. Overview (block diagram)
     Input: Raw Musical Signal
       • Multi-Pitch Detection
           1) Cochleagram
           2) Correlogram
           3) Pitch Salience Curve
           4) Salient Pitches
         Output: Pitches in 46-msec Frames
       • Determination of Musical Notes
           1) Pitch Trajectory Construction
           2) Trajectory Segmentation (with onset detection directly on the raw signal)
         Output: Entire Set of Notes
       • Identification of Melodic Notes
           1) Elimination of Ghost Octave Notes
           2) Selection of the Most Salient Notes
           3) Melody Smoothing
           4) Elimination of False Positives
         Output: Melody Notes
  6. Multi-Pitch Detection
     [Figure: a) Sound waveform, x(t) vs. time (s); b) Cochleagram frame, frequency channel vs. time (s); c) Correlogram frame, frequency channel vs. time lag (ms); d) Summary correlogram, pitch salience vs. time lag (ms)]
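The correlogram and summary correlogram panels can be illustrated with a minimal sketch. It assumes the cochlear filterbank outputs are already available as a list of channels (here two synthetic sinusoids stand in for them), computes an autocorrelation per channel, and sums across channels to get a salience value per time lag. The function names, the 8 kHz sampling rate, and the lag range are illustrative assumptions, not details taken from the slides.

```python
import math

def autocorrelation(x, max_lag, window):
    """Unnormalized autocorrelation of one channel for lags 0..max_lag."""
    return [sum(x[n] * x[n + lag] for n in range(window))
            for lag in range(max_lag + 1)]

def summary_correlogram(channels, max_lag, window):
    """Sum the per-channel autocorrelations (one correlogram frame) over channels."""
    acfs = [autocorrelation(ch, max_lag, window) for ch in channels]
    return [sum(acf[lag] for acf in acfs) for lag in range(max_lag + 1)]

def most_salient_lag(summary, min_lag):
    """Lag (in samples) with the highest summary value, ignoring very short lags."""
    return max(range(min_lag, len(summary)), key=lambda lag: summary[lag])

# Hypothetical stand-in for filterbank outputs: the first two harmonics
# of a 200 Hz tone, each treated as one "frequency channel".
fs = 8000                      # assumed sampling rate in Hz
window, max_lag = 400, 60      # analysis window and largest lag, in samples
n_samples = window + max_lag
channels = [
    [math.sin(2 * math.pi * 200 * n / fs) for n in range(n_samples)],
    [math.sin(2 * math.pi * 400 * n / fs) for n in range(n_samples)],
]

summary = summary_correlogram(channels, max_lag, window)
best_lag = most_salient_lag(summary, min_lag=10)
pitch_hz = fs / best_lag       # convert the winning lag back to a frequency
```

Both channels reinforce each other at the lag of the common fundamental period, which is why the summary peaks there even though the second channel alone also peaks at half that lag.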
  7. From Pitches to Notes
     Goal
       • Quantize temporal sequences of detected pitches into MIDI notes
     Approach
       • Pitch Trajectory Construction [Serra]
       • Frequency-Based Track Segmentation
       • Salience-Based Track Segmentation
  8. From Pitches to Notes: Pitch Trajectory Construction
     [Figure: pitch trajectories, MIDI note number vs. frame number (frames 1, 2, ..., n-2, n-1, n)]
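Trajectory construction can be sketched as a greedy frame-to-frame linking of pitch candidates. This is a simplified illustration, not the procedure from the talk or from [Serra]: the `max_jump` and `max_gap` parameters, their default values, and the nearest-candidate rule are all assumptions made for the example.

```python
def build_trajectories(frames, max_jump=1.0, max_gap=1):
    """Link per-frame pitch candidates into tracks.

    frames: list of lists of pitches (MIDI note numbers, possibly fractional).
    max_jump: largest pitch change (semitones) allowed between linked frames.
    max_gap: number of frames a track may stay unmatched before it is closed.
    Returns a list of tracks, each a list of (frame_index, pitch) pairs.
    """
    active, finished = [], []
    for i, pitches in enumerate(frames):
        unused = list(pitches)
        still_active = []
        for track in active:
            last_frame, last_pitch = track[-1]
            # Nearest candidate that no other track has claimed in this frame.
            best = min(unused, key=lambda p: abs(p - last_pitch), default=None)
            if best is not None and abs(best - last_pitch) <= max_jump:
                track.append((i, best))
                unused.remove(best)
                still_active.append(track)
            elif i - last_frame <= max_gap:
                still_active.append(track)      # tolerate a short gap
            else:
                finished.append(track)          # track has ended
        # Any candidate left over starts a new track.
        still_active.extend([[(i, p)] for p in unused])
        active = still_active
    return finished + active

# Hypothetical input: two sustained pitches, one of them missing in frame 2,
# followed by a new pitch in frame 4.
frames = [[60.0, 72.1], [60.2, 72.0], [60.1], [60.0, 72.2], [67.0]]
tracks = build_trajectories(frames)
```

The gap tolerance is what lets the upper track survive the frame where its pitch was not detected. A greedy scheme like this depends on track order when candidates are contested, so it is only a starting point.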
  9. From Pitches to Notes: Frequency-Based Track Segmentation
       • Approximation of frequency curves by piecewise-constant functions (PCFs)
           - 1) MIDI quantization
           - 2) Merging of PCFs: oscillation, delimited sequences, glissando
           - 3) Time correction
           - 4) Assignment of final MIDI note numbers: tuning compensation
     [Figure: a) Original PCFs; b) Filtered PCFs]
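Step 1 (MIDI quantization) can be sketched as follows. The example converts a frequency curve to cents, rounds each value to the nearest MIDI note, and groups equal consecutive notes into piecewise-constant segments. It assumes cents are measured from the frequency of MIDI note 0 (about 8.18 Hz), so that 100 cents correspond to one MIDI note number; the slides do not state the reference, and the merging, time-correction, and tuning-compensation steps are not covered here.

```python
import math
from itertools import groupby

# Assumed reference: frequency of MIDI note 0, derived from A4 = 440 Hz = note 69.
REF_HZ = 440.0 * 2 ** (-69 / 12)

def hz_to_cents(f_hz):
    """Frequency in cents above the assumed reference."""
    return 1200.0 * math.log2(f_hz / REF_HZ)

def quantize_to_midi(cents):
    """Nearest MIDI note number for a value in cents."""
    return int(round(cents / 100.0))

def piecewise_constant(freqs_hz):
    """Quantize a frequency curve and return (midi_note, n_frames) segments."""
    notes = [quantize_to_midi(hz_to_cents(f)) for f in freqs_hz]
    return [(note, len(list(run))) for note, run in groupby(notes)]

# Hypothetical frequency curve: a slightly unstable A4 followed by A#4.
a4_cents = hz_to_cents(440.0)
segments = piecewise_constant([438.0, 441.0, 445.0, 466.2, 465.0])
```

Small deviations around a note centre collapse into a single segment, which is the behaviour the later merging steps refine for vibrato and glissando.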
  10. From Pitches to Notes: Frequency-Based Track Segmentation
      [Figure: f[k] (cents) vs. time (msec): a) Eliades Ochoa's excerpt; b) Female opera excerpt]
  11. From Pitches to Notes: Salience-Based Track Segmentation
           - 1) Candidate Segmentation Points
           - 2) Onset Detection on the audio signal [Scheirer; Klapuri]
           - 3) Validation of Segmentation Points by Onset Matching
      [Figure: sF[k] vs. time (sec)]
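Step 3 (validation by onset matching) can be sketched as a simple tolerance test: a candidate segmentation point is kept only if a detected onset lies close enough to it. The function name and the 50 ms tolerance are assumptions for illustration; the slides do not give the matching criterion.

```python
def validate_segmentation_points(candidates, onsets, tolerance=0.05):
    """Keep candidate times (seconds) that have an onset within `tolerance` seconds."""
    return [c for c in candidates
            if any(abs(c - o) <= tolerance for o in onsets)]

# Hypothetical candidate points (from the salience curve) and detected onsets.
candidates = [0.30, 0.62, 0.93]
onsets = [0.31, 0.90, 1.20]
kept = validate_segmentation_points(candidates, onsets)
```

Here the candidate at 0.62 s is discarded because no onset falls near it, so the note it would have split stays in one piece.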
  12. Identification of Melodic Notes
      Goal
        • Separate the melodic notes from the accompaniment
      Approach
        • Elimination of Ghost Harmonically-Related Notes
        • Selection of the Most Salient Notes
        • Melody Smoothing
        • Elimination of Spurious Accompaniment Notes
        • Note Clustering
  13. Identification of Melodic Notes: Elimination of Ghost Harmonically-Related Notes
        • Delete harmonically-related notes that satisfy the “common fate” principle
      [Figure: normalized frequency vs. time (msec)]
  14. Identification of Melodic Notes
        • Selection of the Most Salient Notes
        • Melody Smoothing
      [Figures: two plots of MIDI note number vs. time (sec); one plot of MIDI note number vs. time with labels R1 to R4 and a1 to a3]
  15. Identification of Melodic Notes: Elimination of Spurious Accompaniment Notes
           - 1) Delete deep valleys in the salience contour
           - 2) Delete isolated abrupt duration transitions
      [Figure: average pitch salience vs. note sequence]
  16. Identification of Melodic Notes: Elimination of Spurious Accompaniment Notes
      [Figure: pitch salience vs. time (s)]
  17. Identification of Melodic Notes: Note Clustering
        • Feature Extraction
            - Spectral shape, harmonicity, attack transient, intensity, pitch-related features
        • Feature Selection and Dimensionality Reduction
            - Forward feature selection
            - Principal Component Analysis
        • Clustering
            - Gaussian Mixture Models: 2 clusters, melodic and “garbage”
  18. Evaluation Procedures: Ground Truth Data
      ID | Song Title                               | Category      | Solo Type
       1 | Pachelbel’s “Kanon”                      | Classical     | Instrumental
       2 | Handel’s “Hallelujah”                    | Choral        | Vocal
       3 | Enya – “Only Time”                       | New Age       | Vocal
       4 | Dido – “Thank You”                       | Pop           | Vocal
       5 | Ricky Martin – “Private Emotion”         | Pop           | Vocal
       6 | Avril Lavigne – “Complicated”            | Pop/Rock      | Vocal
       7 | Claudio Roditi – “Rua Dona Margarida”    | Jazz/Easy     | Instrumental
       8 | Mambo Kings – “Bella Maria de Mi Alma”   | Bolero        | Instrumental
       9 | Eliades Ochoa – “Chan Chan”              | Son           | Vocal
      10 | Juan Luis Guerra – “Palomita Blanca”     | Bachata       | Vocal
      11 | Battlefield Band – “Snow on the Hills”   | Scottish Folk | Instrumental
      12 | daisy2                                   | Pop           | Vocal
      13 | daisy3                                   | Pop           | Vocal
      14 | jazz2                                    | Jazz          | Instrumental
      15 | jazz3                                    | Jazz          | Instrumental
      16 | midi1                                    | Pop           | Instrumental
      17 | midi2                                    | Folk          | Instrumental
      18 | opera female 2                           | Opera         | Vocal
      19 | opera male 3                             | Opera         | Vocal
      20 | pop1                                     | Pop           | Vocal
      21 | pop4                                     | Pop           | Vocal
  19. Evaluation Procedures: Evaluation Metrics
        • Pitch Contour Accuracy
            - Melodic Raw Pitch Accuracy (MRPA)
            - Overall Raw Pitch Accuracy (ORPA)
            - Melodic Raw Note Accuracy (MRNA)
            - Overall Raw Note Accuracy (ORNA)
        • Note Extraction Accuracy
            - Percentage of correct frames
        • Melody/Accompaniment Discrimination
            - Recall
            - Precision
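The frame-level metrics can be illustrated with a short sketch. It takes one plausible reading of the names above: the "melodic" variant scores only frames where the reference contains melody, while the "overall" variant scores every frame and counts a frame with no melody as correct when the estimate is also empty. That reading, the half-semitone tolerance, and the use of `None` for non-melodic frames are assumptions, not definitions from the slides. Recall and precision are computed over generic sets of note identifiers; which class counts as positive is not stated in the slides.

```python
def melodic_raw_pitch_accuracy(reference, estimate, tolerance=0.5):
    """Fraction of reference melodic frames whose estimate is within `tolerance` semitones."""
    melodic = [(r, e) for r, e in zip(reference, estimate) if r is not None]
    hits = sum(1 for r, e in melodic
               if e is not None and abs(e - r) <= tolerance)
    return hits / len(melodic)

def overall_raw_pitch_accuracy(reference, estimate, tolerance=0.5):
    """Fraction of all frames that are correct, including agreed non-melodic frames."""
    correct = 0
    for r, e in zip(reference, estimate):
        if r is None or e is None:
            correct += (r is None and e is None)
        else:
            correct += abs(e - r) <= tolerance
    return correct / len(reference)

def recall_precision(true_ids, predicted_ids):
    """Recall and precision of a predicted set of note IDs against the true set."""
    true_set, pred_set = set(true_ids), set(predicted_ids)
    tp = len(true_set & pred_set)
    return tp / len(true_set), tp / len(pred_set)

# Hypothetical per-frame pitches in MIDI note numbers; None marks "no melody".
reference = [60, 60, None, 62, 62]
estimate = [60.2, 61, None, 62, None]
mrpa = melodic_raw_pitch_accuracy(reference, estimate)
orpa = overall_raw_pitch_accuracy(reference, estimate)
recall, precision = recall_precision({1, 2, 3, 4}, {2, 3, 5})
```

Note-level variants (MRNA, ORNA) would follow the same pattern after replacing each frame's pitch with the MIDI number of the note covering it.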
  20. Overall Results
      Multi-Pitch Detection
        • MRPA = 81.0%
      Note Determination
        • Frequency-Based Track Segmentation: Recall = 72%; Precision = 94.7%
        • Salience-Based Track Segmentation: Recall = 75%; Precision = 41.2%
      Identification of Melodic Notes
        • Elimination of Ghost Notes (EGN)
            - Eliminated 37.8% of notes (only 0.3% melodic)
        • Overall Melody Extraction
            - MRNA = 84.4%; ORNA = 77.0%
        • Melody/Accompaniment Discrimination
            - Recall = 31.0%; Precision = 52.8%
        • MIREX’2004
            - Train (note accuracy) = 80.1 / 77.1%; Test = 77.4 / 75.1%
            - Train (pitch contour accuracy) = 75.1; 71.1%
        • MIREX’2005
            - 61.1% overall raw pitch accuracy
  21. Conclusions and Future Work
      Main Contributions
        • Note determination
        • Identification of melodic notes in a mixture
      Conclusions
        • The multi-pitch detection (MPD) approach is adequate in a melody context
        • Note determination with satisfactory accuracy
        • Identification of melodic notes successful in medium SNR conditions
      Future Work
        • Larger musical corpus
        • Reduce computational time for pitch detection
        • Melody/Accompaniment discrimination
