Using the Shape of Music to
Compute the Similarity between
Symbolic Musical Pieces
Julián Urbano, Juan Lloréns,
Jorge Morato and Sonia Sánchez-Cuadrado
http://julian-urbano.info
Twitter: @julian_urbano



                            CMMR 2010 · Málaga, Spain · June 24th



Outline
•   Introduction
•   Melodic Similarity Requirements
•   General Solutions to the Requirements
•   A Model Based on Interpolation
•   Implementation and Experimental Results
•   Conclusions and Future Work



Symbolic Melodic Similarity
• Given a musical piece (i.e. query), retrieve others
  deemed melodically similar to it (i.e. results)
• Traditional approaches? [Typke et al., 2005a]
  ▫ Geometry [Ukkonen et al., 2003][Typke et al., 2004]
  ▫ n-grams [Uitdenbogerd et al., 1999][Doraisamy et al., 2003]
  ▫ Alignment [Hanna et al., 2007]
• What do we do?
  ▫ Use a local alignment algorithm
  ▫ whose symbols are n-grams
  ▫ according to a geometric substitution function



General Requirements
• Any Music Information Retrieval system should
  meet several requirements
 [Selfridge-Field, 1998][Byrd et al., 2002][Mongeau et al., 1990]
• Particularly focused on non-experts

• We just put together and thoroughly describe
  traditional and well-known requirements, mostly
  related to transposition invariance
 ▫ Vertical requirements (i.e. pitch)
 ▫ Horizontal requirements (i.e. time)



Vertical Requirements
• Query [simplified riff from Layla by Derek and the Dominos]




• Octave Equivalence



Vertical Requirements (II)
• Query




• Degree Equality



Vertical Requirements (III)
• Query




• Note Equality



Vertical Requirements (IV)
• Query




• Pitch Variation



Vertical Requirements (V)
• Query




• Harmonic Similarity



Vertical Requirements (and VI)
• Voice Separation



Horizontal Requirements
• Query [simplified beginning from op.81 no.10 by S. Heller]




• Time Signature Equivalence



Horizontal Requirements (II)
• Query




• Tempo Equivalence



Horizontal Requirements (III)
• Query




• Duration Equality



Horizontal Requirements (and IV)
• Query




• Duration Variation



General Vertical Solutions
• Octave Equivalence
 ▫ Disregard octave number but consider relative
   changes (G5 to C6 is not the same as G5 to C5).
• Degree Equality
 ▫ Use the degrees within the tonality
• Note Equality
 ▫ Use actual pitch values
• Some approaches use both, but the key signature is
  not always available in Standard MIDI Files (SMF) [Hanna et al., 2007]
• The accepted solution is to consider relative
  pitch differences between successive notes
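
A minimal sketch of the relative-pitch idea, assuming pitches are given as MIDI note numbers with middle C (C4) at 60. The function name `intervals` and the sample melody are illustrative only. Transposing a melody leaves its intervals unchanged, while the direction of a leap is preserved (G5 to C6 differs from G5 to C5).

```python
def intervals(pitches):
    """Relative pitch differences (in semitones) between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


melody = [74, 81, 72, 76]              # MIDI note numbers (illustrative)
transposed = [p + 5 for p in melody]   # same melody, five semitones higher

print(intervals(melody))       # [7, -9, 4]
print(intervals(transposed))   # [7, -9, 4]

# Direction matters: G5 (79) up to C6 (84) versus G5 down to C5 (72).
print(intervals([79, 84]))     # [5]
print(intervals([79, 72]))     # [-7]
```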



General Horizontal Solutions
• Time Signature Equivalence
 ▫ Just ignore it
• Tempo Equivalence
 ▫ Use actual note durations
• Duration Equality
 ▫ Use score durations
• Again, this information is not mandatory in
  SMF, and users with different expertise would
  prefer different approaches
• It is usual to just ignore time altogether, or use
  the duration ratio between successive notes
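
The duration-ratio option can be sketched as follows, assuming durations are given in MIDI ticks. The name `duration_ratios` is illustrative. Playing the same rhythm at a different tempo scales every duration by the same factor, so the ratios stay the same.

```python
from fractions import Fraction


def duration_ratios(durations):
    """Ratio of each note's duration to the duration of the previous note."""
    return [Fraction(b, a) for a, b in zip(durations, durations[1:])]


ticks = [240, 480, 240, 720]        # durations in ticks (illustrative)
faster = [d // 2 for d in ticks]    # same rhythm at double tempo

print(duration_ratios(ticks))       # [Fraction(2, 1), Fraction(1, 2), Fraction(3, 1)]
print(duration_ratios(ticks) == duration_ratios(faster))   # True
```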



A Model based on Interpolation
• Consider the time-pitch plane
• Arrange the notes as points in the plane,
  according to their pitch and duration
• With different voices, get new pitch-dimensions
  sharing the same time dimension

• Define the curve $C_i(t)$ as the one interpolating
  the notes of the i-th voice (pitch-dimension)
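
One way to arrange a single voice in the time-pitch plane is sketched below. It assumes a monophonic voice without rests, so that each note starts where the previous one ends; the helper name `to_points` and that onset rule are assumptions, not taken from the slides.

```python
from itertools import accumulate


def to_points(pitches, durations):
    """Place each note at (onset time, pitch); onsets are cumulative durations."""
    onsets = [0] + list(accumulate(durations))[:-1]
    return list(zip(onsets, pitches))


points = to_points([74, 81, 72, 76], [240, 480, 240, 720])
print(points)   # [(0, 74), (240, 81), (720, 72), (960, 76)]
```

A curve interpolating these points would play the role of $C_i(t)$ for that voice.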



A Model based on Interpolation (II)



A Model based on Interpolation (and III)
• The similarity of two pieces is thought of as
  their similarity in shape

• Most requirements are directly met
  ▫ Neither pitch nor time invariants change the
    shape of the curve
  ▫ Pitch and Duration Variations can be measured
    analytically



Measure of Similarity
• Consider the curves as polynomials
 ▫ $C(t) = a_n t^n + a_{n-1} t^{n-1} + \dots + a_1 t + a_0$

• The first derivative measures how much the
  shape is changing at any time
• The shape dissimilarity between two curves
  (songs) can be measured as the area between
  their first derivatives
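
A minimal numerical sketch of this measure, under stated assumptions: polynomials are coefficient lists `[a0, a1, ..., an]`, the "area between" is read as the integral of the absolute difference of the derivatives, the interval is [0, 1], and the integral is approximated with the midpoint rule. The helper names `derivative` and `evaluate` are illustrative; `diff` follows the notation used on the metric slide.

```python
def derivative(coeffs):
    """Derivative of a polynomial given as [a0, a1, ..., an]."""
    return [k * a for k, a in enumerate(coeffs)][1:]


def evaluate(coeffs, t):
    """Evaluate a polynomial at t using Horner's scheme."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * t + a
    return result


def diff(c, d, t0=0.0, t1=1.0, steps=1000):
    """Approximate area between C'(t) and D'(t) on [t0, t1] (midpoint rule)."""
    dc, dd = derivative(c), derivative(d)
    h = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * h
        total += abs(evaluate(dc, t) - evaluate(dd, t))
    return h * total


c = [0, 0, 1]    # C(t) = t^2
d = [5, 0, 1]    # D(t) = t^2 + 5, the same shape shifted up
e = [0, 1]       # E(t) = t

print(diff(c, d))             # 0.0
print(round(diff(c, e), 6))   # 0.5
```

The zero distance between `c` and `d` illustrates why a vertical shift (a transposition) does not affect the measure: the constant term vanishes in the derivative.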



It is Metric
• Non-negativity
  ▫ diff(C, D) ≥ 0
• Identity of indiscernibles
  ▫ diff(C, D) = 0 ⇔ C = D
• Symmetry
  ▫ diff(C, D) = diff(D, C)
• Triangle inequality
  ▫ diff(C, E) ≤ diff(C, D) + diff(D, E)

• So we could use vantage objects [Bozkaya et al., 1999]
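
The practical payoff of the triangle inequality can be sketched with a single vantage object: since |d(q, v) - d(x, v)| ≤ d(q, x), distances to the vantage object precomputed offline give a lower bound that lets a range search skip most items. This is only an illustration of the principle, not the indexing structure of [Bozkaya et al., 1999]; `range_search` is a hypothetical helper and the absolute difference between numbers stands in for the curve distance.

```python
def range_search(query, items, vantage, dist, radius):
    """Items within `radius` of `query`, plus the number of distances computed."""
    to_vantage = {x: dist(x, vantage) for x in items}   # precomputed offline in practice
    dq = dist(query, vantage)
    hits, computed = [], 0
    for x in items:
        # Lower bound on dist(query, x) from the triangle inequality.
        if abs(dq - to_vantage[x]) > radius:
            continue
        computed += 1
        if dist(query, x) <= radius:
            hits.append(x)
    return hits, computed


def toy_dist(a, b):
    """Stand-in metric: absolute difference between two numbers."""
    return abs(a - b)


hits, computed = range_search(10, [1, 4, 9, 15, 22], vantage=0, dist=toy_dist, radius=2)
print(hits, computed)   # [9] 1
```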



Interpolation with Splines
• Easier to handle than Lagrange polynomials
• They avoid Runge’s phenomenon [de Boor, 2001]



Interpolation with Splines (II)
• Defined as piece-wise functions




• Very handy to measure the Pitch and Duration
  Variations
 ▫ Span durations can be normalized from 0 to 1
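
A sketch of one span of such a piece-wise curve, assuming a uniform cubic B-spline (the family the deck names later for its implementation) whose four control values are four successive notes. Each span is evaluated on a local parameter t in [0, 1], which is what makes spans of different durations comparable. The function names are illustrative.

```python
def bspline_span(p0, p1, p2, p3, t):
    """Value of a uniform cubic B-spline span at t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6
    b1 = (3 * t**3 - 6 * t**2 + 4) / 6
    b2 = (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6
    b3 = t**3 / 6
    return b0 * p0 + b1 * p1 + b2 * p2 + b3 * p3


def bspline_span_derivative(p0, p1, p2, p3, t):
    """First derivative of the same span with respect to t."""
    d0 = -((1 - t) ** 2) / 2
    d1 = (3 * t**2 - 4 * t) / 2
    d2 = (-3 * t**2 + 2 * t + 1) / 2
    d3 = t**2 / 2
    return d0 * p0 + d1 * p1 + d2 * p2 + d3 * p3


relative = (0, 7, -2, 2)   # pitches relative to the first note (illustrative)
print(bspline_span_derivative(*relative, 0.0))   # -1.0, i.e. (p2 - p0) / 2
print(bspline_span_derivative(*relative, 1.0))   # -2.5, i.e. (p3 - p1) / 2
```

Because the derivative weights sum to zero, adding a constant to all four control values leaves the derivative unchanged.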



Interpolation with Splines (and III)
• Defined as parametric functions
 ▫ One function per dimension
• Pitch and Time can be compared separately
• Voices can be isolated easily
 ▫ Using partial derivatives
• More weight can be given to pitch than to time
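
Weighting the two dimensions could look like the sketch below. It assumes the pitch and time dissimilarities are combined linearly, which the slides do not state; `area_between`, `weighted_diff`, `k_pitch` and `k_time` are illustrative names, and the derivative functions are passed in as plain callables over a normalized span [0, 1].

```python
def area_between(f, g, steps=1000):
    """Midpoint-rule approximation of the area between f and g on [0, 1]."""
    h = 1.0 / steps
    return h * sum(abs(f((i + 0.5) * h) - g((i + 0.5) * h)) for i in range(steps))


def weighted_diff(pitch_a, pitch_b, time_a, time_b, k_pitch=0.75, k_time=0.25):
    """Linear combination of pitch and time dissimilarities (an assumption)."""
    return k_pitch * area_between(pitch_a, pitch_b) + k_time * area_between(time_a, time_b)


# Two spans with different pitch derivatives but identical time derivatives.
pitch_a = lambda t: 2 * t
pitch_b = lambda t: 1.0
time_a = lambda t: 1.0
time_b = lambda t: 1.0

print(round(weighted_diff(pitch_a, pitch_b, time_a, time_b), 6))             # 0.375
print(round(weighted_diff(pitch_a, pitch_b, time_a, time_b, 1.0, 0.0), 6))   # 0.5
```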



First Implementation
• Dynamic programming has been widely used
  with textual representations of music
 ▫ Levenshtein distance
 ▫ Needleman-Wunsch global alignment
 ▫ Smith-Waterman local alignment [Smith et al., 1981],
   shown to be the most effective [Hanna et al., 2007, 2008]
• The symbols in the sequences are defined as
  n-grams of successive notes, according to the
  spans defined by the curve
• The substitution score between two n-grams
  is the area between their curves’ derivatives
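
A compact sketch of Smith-Waterman local alignment over n-gram symbols with a pluggable substitution function. Several parts are assumptions for illustration: the linear gap penalty, the toy dissimilarity (sum of absolute pitch differences standing in for the area between derivatives), and the mapping from dissimilarity to score in `toy_score`, which is needed because local alignment expects positive scores for good matches.

```python
def smith_waterman(a, b, score, gap=-1.0):
    """Best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            h[i][j] = max(
                0.0,                                          # start a new local alignment
                h[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # substitute or match
                h[i - 1][j] + gap,                            # gap in b
                h[i][j - 1] + gap,                            # gap in a
            )
            best = max(best, h[i][j])
    return best


def toy_score(x, y):
    """Hypothetical substitution score: 1 for identical n-grams, negative when far apart."""
    dissimilarity = sum(abs(p - q) for p, q in zip(x, y))
    return 1.0 - 0.5 * dissimilarity


# Symbols are 4-grams of pitches relative to the first note (illustrative data).
a = [(0, 7, -2, 2), (0, -9, -5, -2), (0, 4, 7, 5)]
b = [(0, 1, 1, 1), (0, 7, -2, 2), (0, -9, -5, -2)]

print(smith_waterman(a, b, toy_score))   # 2.0
```

The two shared n-grams are aligned even though they sit at different positions in the two sequences, which is the point of using a local rather than a global alignment.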



First Implementation (and II)
• We used degree 3 Uniform B-Splines [de Boor, 2003]
  ▫ Results in spans of 4 notes (n-gram length),
    noted to be effective [Doraisamy et al., 2003]
• Pitch relative to the first note’s
  ▫ 74, 81, 72, 76
  ▫ 7, -2, 2 (actually 0, 7, -2, 2)
• Duration relative to the first note’s
  ▫ 240, 480, 240, 720
  ▫ 2, 1, 3 (actually 1, 2, 1, 3 or 1/7, 2/7, 1/7, 3/7)
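
The normalization of a 4-note span in the example above can be reproduced with a few lines. The function name `normalize_ngram` is illustrative; it returns the pitches relative to the first note, the durations relative to the first note, and each duration as a share of the whole span.

```python
from fractions import Fraction


def normalize_ngram(pitches, durations):
    """Express an n-gram relative to its first note."""
    rel_pitch = [p - pitches[0] for p in pitches]
    rel_duration = [Fraction(d, durations[0]) for d in durations]
    total = sum(durations)
    share = [Fraction(d, total) for d in durations]
    return rel_pitch, rel_duration, share


rel_pitch, rel_duration, share = normalize_ngram([74, 81, 72, 76], [240, 480, 240, 720])
print(rel_pitch)                        # [0, 7, -2, 2]
print([str(r) for r in rel_duration])   # ['1', '2', '1', '3']
print([str(s) for s in share])          # ['1/7', '2/7', '1/7', '3/7']
```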



Results
• Tested with the MIREX 2005 test collections
 ▫ Training and evaluation collections
 ▫ 11 queries per collection
 ▫ About 550 songs per collection
 ▫ Partially ordered lists of relevant results [Typke et al., 2005b]
 ▫ Effectiveness measured with Average Dynamic Recall (ADR) [Typke et al., 2006]
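
A sketch of Average Dynamic Recall as it is commonly described; the authoritative definition is in [Typke et al., 2006]. The assumption here is that the ground truth is a list of groups of equally relevant items, ordered from most to least relevant. At each rank i (up to the number of relevant items), the items that count as correct are those in the groups needed to fill the first i ground-truth positions, and the recall at i is the number of such items among the top i results divided by i. The function and variable names are illustrative.

```python
def average_dynamic_recall(ranked, groups):
    """ADR of a ranked result list against partially ordered ground-truth groups."""
    n = sum(len(g) for g in groups)
    total = 0.0
    for i in range(1, n + 1):
        # Collect the groups needed to cover the first i ground-truth positions.
        allowed, covered = set(), 0
        for g in groups:
            if covered >= i:
                break
            allowed |= set(g)
            covered += len(g)
        hits = sum(1 for item in ranked[:i] if item in allowed)
        total += hits / i
    return total / n


groups = [{"a"}, {"b", "c"}]   # "a" is most relevant; "b" and "c" are tied

print(average_dynamic_recall(["a", "c", "b", "x"], groups))             # 1.0
print(round(average_dynamic_recall(["b", "a", "c", "x"], groups), 3))   # 0.667
```

Swapping the tied items "b" and "c" costs nothing, while ranking "b" above "a" is penalized at rank 1 only.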



Results (II)
• Two alternatives tested
  ▫ Kpitch=1 and Ktime=0
  ▫ Kpitch=0.75 and Ktime=0.25,
    chosen by others [Doraisamy et al., 2003][Hanna et al., 2007]

  Tuning                     Collection   Avg.    Min.    Max.
  Kpitch=1, Ktime=0          Training     0.639   0.271   0.864
  Kpitch=0.75, Ktime=0.25    Training     0.643   0.312   0.864
  Kpitch=1, Ktime=0          Evaluation   0.709   0.314   0.911
  Kpitch=0.75, Ktime=0.25    Evaluation   0.710   0.314   0.911

• We found the improvement from considering time
  to be completely incidental



Results (and III)
• Compared with the official MIREX 2005 results
  ▫ We would have ranked first
  ▫ Best ADR scores for 5 of the 11 queries
  Query               Splines  GAM    O      US     TWV     L(P3)     L(DP)    FM
  190.011.224-1.1.1   0.803    0.820  0.717  0.824  0.538   0.455     0.547    0.443
  400.065.784-1.1.1   0.879    0.846  0.619  0.624  0.861   0.614     0.839    0.679
  450.024.802-1.1.1   0.722    0.450  0.554  0.340  0.554   0.340     0.340    0.340
  600.053.475-1.1.1   0.911    0.883  0.911  0.911  0.725   0.661     0.650    0.567
  600.053.481-1.1.1   0.630    0.293  0.629  0.486  0.293   0.357     0.293    0.519
  600.054.278-1.1.1   0.810    0.674  0.785  0.864  0.731   0.660     0.527    0.418
  600.192.742-1.1.1   0.703    0.808  0.808  0.703  0.808   0.642     0.642    0.808
  700.010.059-1.1.2   0.521    0.521  0.521  0.521  0.521   0.667     0.521    0.521
  700.010.591-1.4.2   0.314    0.665  0.314  0.314  0.314   0.474     0.314    0.375
  702.001.406-1.1.1   0.689    0.566  0.874  0.675  0.387   0.722     0.606    0.469
  703.001.021-1.1.1   0.826    0.730  0.412  0.799  0.548   0.549     0.692    0.561
  Average             0.710    0.660  0.650  0.642  0.571*  0.558***  0.543**  0.518***

  bold for best per query, italics for best per system
  * for significant difference at the 0.10 level, ** at the 0.05 level and *** at the 0.01 level



Conclusions
• We presented a new geometric model to
  compute the similarity of symbolic pieces
 ▫ Opens a very promising line for further research
• It has a very intuitive interpretation, but a not
  so intuitive implementation
• A very early prototype has been shown to perform
  quite well with the MIREX 2005 test collections
 ▫ Would have ranked first
 ▫ Though not significantly better than the top 3
• The modeling of time is once again shown not to
  improve the overall effectiveness



So Now What?
• We presented a very early work
• We are currently improving it
  ▫ As of today we reach avg. ADR scores of over 0.82
• Other considerations
  ▫   Local alignment? Domain-dependent tuning?
  ▫   Uniform B-Splines? Cardinal? Hermite?
  ▫   n-grams of length 4? Split at inflection points?
  ▫   Area between derivatives? Between the curves?
  ▫   Shape as a nominal variable (concave, convex)?
  ▫   Harmony: all possible paths? Polyphony?
• We will see...
  ▫ Submitting 3 or 4 versions to MIREX 2010



And That’s It!




                 Picture by 姒儿喵喵

More Related Content

Viewers also liked

Improving the Generation of Ground Truths based on Partially Ordered Lists
Improving the Generation of Ground Truths based on Partially Ordered ListsImproving the Generation of Ground Truths based on Partially Ordered Lists
Improving the Generation of Ground Truths based on Partially Ordered ListsJulián Urbano
 
A Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR EvaluationA Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR EvaluationJulián Urbano
 
Evaluation in Audio Music Similarity
Evaluation in Audio Music SimilarityEvaluation in Audio Music Similarity
Evaluation in Audio Music SimilarityJulián Urbano
 
Audio Music Similarity and Retrieval: Evaluation Power and Stability
Audio Music Similarity and Retrieval: Evaluation Power and StabilityAudio Music Similarity and Retrieval: Evaluation Power and Stability
Audio Music Similarity and Retrieval: Evaluation Power and StabilityJulián Urbano
 
Evaluation in (Music) Information Retrieval through the Audio Music Similarit...
Evaluation in (Music) Information Retrieval through the Audio Music Similarit...Evaluation in (Music) Information Retrieval through the Audio Music Similarit...
Evaluation in (Music) Information Retrieval through the Audio Music Similarit...Julián Urbano
 
Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...
Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...
Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...Julián Urbano
 
Validity and Reliability of Cranfield-like Evaluation in Information Retrieval
Validity and Reliability of Cranfield-like Evaluation in Information RetrievalValidity and Reliability of Cranfield-like Evaluation in Information Retrieval
Validity and Reliability of Cranfield-like Evaluation in Information RetrievalJulián Urbano
 
How Significant is Statistically Significant? The case of Audio Music Similar...
How Significant is Statistically Significant? The case of Audio Music Similar...How Significant is Statistically Significant? The case of Audio Music Similar...
How Significant is Statistically Significant? The case of Audio Music Similar...Julián Urbano
 
Threshold Concepts in Quantitative Finance - DEE 2011 Presentation
Threshold Concepts in Quantitative Finance - DEE 2011 PresentationThreshold Concepts in Quantitative Finance - DEE 2011 Presentation
Threshold Concepts in Quantitative Finance - DEE 2011 PresentationRichard Diamond
 
CAPM: Introduction & Teaching Issues - Richard Diamond
CAPM: Introduction & Teaching Issues - Richard DiamondCAPM: Introduction & Teaching Issues - Richard Diamond
CAPM: Introduction & Teaching Issues - Richard DiamondRichard Diamond
 
Median and Its Significance - Dr Richard Diamond
Median and Its Significance - Dr Richard DiamondMedian and Its Significance - Dr Richard Diamond
Median and Its Significance - Dr Richard DiamondRichard Diamond
 

Viewers also liked (11)

Improving the Generation of Ground Truths based on Partially Ordered Lists
Improving the Generation of Ground Truths based on Partially Ordered ListsImproving the Generation of Ground Truths based on Partially Ordered Lists
Improving the Generation of Ground Truths based on Partially Ordered Lists
 
A Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR EvaluationA Plan for Sustainable MIR Evaluation
A Plan for Sustainable MIR Evaluation
 
Evaluation in Audio Music Similarity
Evaluation in Audio Music SimilarityEvaluation in Audio Music Similarity
Evaluation in Audio Music Similarity
 
Audio Music Similarity and Retrieval: Evaluation Power and Stability
Audio Music Similarity and Retrieval: Evaluation Power and StabilityAudio Music Similarity and Retrieval: Evaluation Power and Stability
Audio Music Similarity and Retrieval: Evaluation Power and Stability
 
Evaluation in (Music) Information Retrieval through the Audio Music Similarit...
Evaluation in (Music) Information Retrieval through the Audio Music Similarit...Evaluation in (Music) Information Retrieval through the Audio Music Similarit...
Evaluation in (Music) Information Retrieval through the Audio Music Similarit...
 
Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...
Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...
Information Retrieval Meta-Evaluation: Challenges and Opportunities in the Mu...
 
Validity and Reliability of Cranfield-like Evaluation in Information Retrieval
Validity and Reliability of Cranfield-like Evaluation in Information RetrievalValidity and Reliability of Cranfield-like Evaluation in Information Retrieval
Validity and Reliability of Cranfield-like Evaluation in Information Retrieval
 
How Significant is Statistically Significant? The case of Audio Music Similar...
How Significant is Statistically Significant? The case of Audio Music Similar...How Significant is Statistically Significant? The case of Audio Music Similar...
How Significant is Statistically Significant? The case of Audio Music Similar...
 
Threshold Concepts in Quantitative Finance - DEE 2011 Presentation
Threshold Concepts in Quantitative Finance - DEE 2011 PresentationThreshold Concepts in Quantitative Finance - DEE 2011 Presentation
Threshold Concepts in Quantitative Finance - DEE 2011 Presentation
 
CAPM: Introduction & Teaching Issues - Richard Diamond
CAPM: Introduction & Teaching Issues - Richard DiamondCAPM: Introduction & Teaching Issues - Richard Diamond
CAPM: Introduction & Teaching Issues - Richard Diamond
 
Median and Its Significance - Dr Richard Diamond
Median and Its Significance - Dr Richard DiamondMedian and Its Significance - Dr Richard Diamond
Median and Its Significance - Dr Richard Diamond
 

Similar to Using the Shape of Music to Compute the similarity between Symbolic Musical Pieces

Algorithmic Techniques for Parametric Model Recovery
Algorithmic Techniques for Parametric Model RecoveryAlgorithmic Techniques for Parametric Model Recovery
Algorithmic Techniques for Parametric Model RecoveryCurvSurf
 
Cp formal lab report
Cp formal lab reportCp formal lab report
Cp formal lab reportstephm32
 
Effectiveness and code optimization in Java
Effectiveness and code optimization in JavaEffectiveness and code optimization in Java
Effectiveness and code optimization in JavaStrannik_2013
 
Asymptotic Notations
Asymptotic NotationsAsymptotic Notations
Asymptotic NotationsRishabh Soni
 
Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014
Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014
Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014Tony Tung
 
k-space Diagonal Preconditioner: Speeding Up Iterative Reconstruction For Va...
k-space Diagonal Preconditioner:  Speeding Up Iterative Reconstruction For Va...k-space Diagonal Preconditioner:  Speeding Up Iterative Reconstruction For Va...
k-space Diagonal Preconditioner: Speeding Up Iterative Reconstruction For Va...Frank Ong
 
General Phase Regularized MRI Reconstruction Using Phase Cycling
General Phase Regularized MRI Reconstruction  Using Phase CyclingGeneral Phase Regularized MRI Reconstruction  Using Phase Cycling
General Phase Regularized MRI Reconstruction Using Phase CyclingFrank Ong
 
Using nasal curves matching for expression robust 3D nose recognition
Using nasal curves matching for expression robust 3D nose recognitionUsing nasal curves matching for expression robust 3D nose recognition
Using nasal curves matching for expression robust 3D nose recognitionMehryar (Mike) E., Ph.D.
 
Asrec pathania
Asrec  pathaniaAsrec  pathania
Asrec pathaniaUNU-WIDER
 
Bergman lundberg lundberg stake ippc2014
Bergman lundberg lundberg stake ippc2014Bergman lundberg lundberg stake ippc2014
Bergman lundberg lundberg stake ippc2014Dr. Paul Davis
 
Lec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable RegistrationLec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable RegistrationUlaş Bağcı
 
Analysis and Enhancement of Algorithms in Computational Geometry
Analysis and Enhancement of Algorithms in Computational GeometryAnalysis and Enhancement of Algorithms in Computational Geometry
Analysis and Enhancement of Algorithms in Computational GeometryKasun Ranga Wijeweera
 
Seismic refractionsurveying r4a
Seismic refractionsurveying r4aSeismic refractionsurveying r4a
Seismic refractionsurveying r4aAlina Arshad
 
Seismic Refraction Surveying
Seismic Refraction SurveyingSeismic Refraction Surveying
Seismic Refraction SurveyingAli Osman Öncel
 
Medición ángulos ii
Medición ángulos iiMedición ángulos ii
Medición ángulos iiVaro Racing
 
Forecasting time series powerful and simple
Forecasting time series powerful and simpleForecasting time series powerful and simple
Forecasting time series powerful and simpleIvo Andreev
 
SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES
SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES
SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES Toru Tamaki
 

Similar to Using the Shape of Music to Compute the similarity between Symbolic Musical Pieces (20)

Algorithmic Techniques for Parametric Model Recovery
Algorithmic Techniques for Parametric Model RecoveryAlgorithmic Techniques for Parametric Model Recovery
Algorithmic Techniques for Parametric Model Recovery
 
Searching Algorithms
Searching AlgorithmsSearching Algorithms
Searching Algorithms
 
Cp formal lab report
Cp formal lab reportCp formal lab report
Cp formal lab report
 
Effectiveness and code optimization in Java
Effectiveness and code optimization in JavaEffectiveness and code optimization in Java
Effectiveness and code optimization in Java
 
Asymptotic Notations
Asymptotic NotationsAsymptotic Notations
Asymptotic Notations
 
Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014
Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014
Tony TUNG @ Matsuyama Lab., Kyoto University 2007-2014
 
k-space Diagonal Preconditioner: Speeding Up Iterative Reconstruction For Va...
k-space Diagonal Preconditioner:  Speeding Up Iterative Reconstruction For Va...k-space Diagonal Preconditioner:  Speeding Up Iterative Reconstruction For Va...
k-space Diagonal Preconditioner: Speeding Up Iterative Reconstruction For Va...
 
General Phase Regularized MRI Reconstruction Using Phase Cycling
General Phase Regularized MRI Reconstruction  Using Phase CyclingGeneral Phase Regularized MRI Reconstruction  Using Phase Cycling
General Phase Regularized MRI Reconstruction Using Phase Cycling
 
Using nasal curves matching for expression robust 3D nose recognition
Using nasal curves matching for expression robust 3D nose recognitionUsing nasal curves matching for expression robust 3D nose recognition
Using nasal curves matching for expression robust 3D nose recognition
 
Asrec pathania
Asrec  pathaniaAsrec  pathania
Asrec pathania
 
bending processes and springback
bending processes and springbackbending processes and springback
bending processes and springback
 
Bergman lundberg lundberg stake ippc2014
Bergman lundberg lundberg stake ippc2014Bergman lundberg lundberg stake ippc2014
Bergman lundberg lundberg stake ippc2014
 
Lecture 6,7,8
Lecture 6,7,8Lecture 6,7,8
Lecture 6,7,8
 
Lec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable RegistrationLec16: Medical Image Registration (Advanced): Deformable Registration
Lec16: Medical Image Registration (Advanced): Deformable Registration
 
Analysis and Enhancement of Algorithms in Computational Geometry
Analysis and Enhancement of Algorithms in Computational GeometryAnalysis and Enhancement of Algorithms in Computational Geometry
Analysis and Enhancement of Algorithms in Computational Geometry
 
Seismic refractionsurveying r4a
Seismic refractionsurveying r4aSeismic refractionsurveying r4a
Seismic refractionsurveying r4a
 
Seismic Refraction Surveying
Seismic Refraction SurveyingSeismic Refraction Surveying
Seismic Refraction Surveying
 
Medición ángulos ii
Medición ángulos iiMedición ángulos ii
Medición ángulos ii
 
Forecasting time series powerful and simple
Forecasting time series powerful and simpleForecasting time series powerful and simple
Forecasting time series powerful and simple
 
SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES
SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES
SCALE RATIO ICP FOR 3D POINT CLOUDS WITH DIFFERENT SCALES
 

More from Julián Urbano

Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Julián Urbano
 
Statistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and HowStatistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and HowJulián Urbano
 
The Treatment of Ties in AP Correlation
The Treatment of Ties in AP CorrelationThe Treatment of Ties in AP Correlation
The Treatment of Ties in AP CorrelationJulián Urbano
 
Crawling the Web for Structured Documents
Crawling the Web for Structured DocumentsCrawling the Web for Structured Documents
Crawling the Web for Structured DocumentsJulián Urbano
 
How Do Gain and Discount Functions Affect the Correlation between DCG and Use...
How Do Gain and Discount Functions Affect the Correlation between DCG and Use...How Do Gain and Discount Functions Affect the Correlation between DCG and Use...
How Do Gain and Discount Functions Affect the Correlation between DCG and Use...Julián Urbano
 
A Comparison of the Optimality of Statistical Significance Tests for Informat...
A Comparison of the Optimality of Statistical Significance Tests for Informat...A Comparison of the Optimality of Statistical Significance Tests for Informat...
A Comparison of the Optimality of Statistical Significance Tests for Informat...Julián Urbano
 
MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...
MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...
MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...Julián Urbano
 
The University Carlos III of Madrid at TREC 2011 Crowdsourcing Track
The University Carlos III of Madrid at TREC 2011 Crowdsourcing TrackThe University Carlos III of Madrid at TREC 2011 Crowdsourcing Track
The University Carlos III of Madrid at TREC 2011 Crowdsourcing TrackJulián Urbano
 
What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...
What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...
What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...Julián Urbano
 

More from Julián Urbano (10)

Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...Statistical Significance Testing in Information Retrieval: An Empirical Analy...
Statistical Significance Testing in Information Retrieval: An Empirical Analy...
 
Your PhD and You
Your PhD and YouYour PhD and You
Your PhD and You
 
Statistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and HowStatistical Analysis of Results in Music Information Retrieval: Why and How
Statistical Analysis of Results in Music Information Retrieval: Why and How
 
The Treatment of Ties in AP Correlation
The Treatment of Ties in AP CorrelationThe Treatment of Ties in AP Correlation
The Treatment of Ties in AP Correlation
 
Crawling the Web for Structured Documents
Crawling the Web for Structured DocumentsCrawling the Web for Structured Documents
Crawling the Web for Structured Documents
 
How Do Gain and Discount Functions Affect the Correlation between DCG and Use...
How Do Gain and Discount Functions Affect the Correlation between DCG and Use...How Do Gain and Discount Functions Affect the Correlation between DCG and Use...
How Do Gain and Discount Functions Affect the Correlation between DCG and Use...
 
A Comparison of the Optimality of Statistical Significance Tests for Informat...
A Comparison of the Optimality of Statistical Significance Tests for Informat...A Comparison of the Optimality of Statistical Significance Tests for Informat...
A Comparison of the Optimality of Statistical Significance Tests for Informat...
 
MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...
MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...
MIREX 2010 Symbolic Melodic Similarity: Local Alignment with Geometric Repres...
 
The University Carlos III of Madrid at TREC 2011 Crowdsourcing Track
The University Carlos III of Madrid at TREC 2011 Crowdsourcing TrackThe University Carlos III of Madrid at TREC 2011 Crowdsourcing Track
The University Carlos III of Madrid at TREC 2011 Crowdsourcing Track
 
What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...
What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...
What is the Effect of Audio Quality on the Robustness of MFCCs and Chroma Fea...
 

Using the Shape of Music to Compute the similarity between Symbolic Musical Pieces

  • 1. Using the Shape of Music to Compute the Similarity between Symbolic Musical Pieces Julián Urbano, Juan Lloréns, Jorge Morato and Sonia Sánchez-Cuadrado http://julian-urbano.info Twitter: @julian_urbano CMMR 2010 · Málaga, Spain · June 24th
  • 2. 2 Outline • Introduction • Melodic Similarity Requirements • General Solutions to the Requirements • A Model Based on Interpolation • Implementation and Experimental Results • Conclusions and Future Work
  • 3. 3 Symbolic Melodic Similarity • Given a musical piece (i.e. query), retrieve others deemed melodically similar to it (i.e. results) • Traditional approaches? [Typke et al., 2005a] ▫ Geometry [Ukkonen et al., 2003][Typke et al., 2004] ▫ n-grams [Uitdenbogerd et al., 1999][Doraisamy et al., 2003] ▫ Alignment [Hanna et al., 2007] • What do we do? ▫ Use local an alignment algorithm ▫ whose symbols are n-grams ▫ according to a geometric substitution function
  • 4. 4 General Requirements • Any Music Information Retrieval system should meet several requirements [Selfridge-Field, 1998][Byrd et al., 2002][Mongeau et al., 1990] • Particularly focused on non-experts • We just put together and thoroughly describe traditional and well-known requirements mostly related with transposition invariance ▫ Vertical requirements (i.e. pitch) ▫ Horizontal requirements (i.e. time)
  • 5. 5 Vertical Requirements • Query [simplified riff from Layla by Dereck and the Dominos] • Octave Equivalence
  • 6. 6 Vertical Requirements (II) • Query • Degree Equality
  • 7. 7 Vertical Requirements (III) • Query • Note Equality
  • 8. 8 Vertical Requirements (IV) • Query • Pitch Variation
  • 9. 9 Vertical Requirements (V) • Query • Harmonic Similarity
  • 10. 10 Vertical Requirements (and VI) • Voice Separation
  • 11. 11 Horizontal Requirements • Query [simplified beginning from op.81 no.10 by S. Heller] • Time Signature Equivalence
  • 12. 12 Horizontal Requirements (II) • Query • Tempo Equivalence
  • 13. 13 Horizontal Requirements (III) • Query • Duration Equality
  • 14. 14 Horizontal Requirements (and IV) • Query • Duration Variation
  • 15. 15 General Vertical Solutions • Octave Equivalence ▫ Disregard octave number but consider relative changes (G5 to C6 is not the same as G5 to C5). • Degree Equality ▫ Use the degrees within the tonality • Note Equality ▫ Use actual pitch values • Some approaches use both, but key signature is not always available in SMF [Hanna et al., 2007] • The accepted solution is to consider relative pitch differences between successive notes
  • 16. 16 General Horizontal Solutions • Time Signature Equivalence ▫ Just ignore it • Tempo Equivalence ▫ Use actual note durations • Duration Equality ▫ Use score durations • Again, this information is not mandatory in SMF, and users with different expertise would prefer different approaches • It is usual to just ignore time altogether, or use the duration ratio between successive notes
  • 17. 17 A Model based on Interpolation • Consider the time-pitch plane • Arrange the notes as points in the plane, according to their pitch and duration • With different voices, get new pitch-dimensions sharing the same time dimension • Define the curve Ci(t) as the one interpolating the notes of the i-th voice (pitch-dimension)
  • 18. 18 A Model based on Interpolation (II)
  • 19. 19 A Model based on Interpolation (and III) • The similarity of two pieces is thought of as their similarity in shape • Most requirements are directly met ▫ Neither pitch nor time invariants change the shape of the curve ▫ Pitch and Duration Variations can be measured analytically
  • 20. 20 Measure of Similarity • Consider the curves as polynomials ▫ C(t)=antn+an-1tn-1+…+a1t+a0 • The first derivative measures how much the shape is changing at any time • The shape dissimilarity between two curves (songs) can be measured as the area between their first derivatives
  • 21. 21 It is Metric • Non-negativity ▫ diff(C, D) ≥ 0 • Identity of indiscernibles ▫ diff(C, D) = 0  C = D • Symmetry ▫ diff(C, D) = diff(D, C) • Triangle inequality ▫ diff(C, E) ≤ diff(C, D) + diff(D, E) • So we could use vantage objects [Bozkaya et al., 1999]
  • 22. 22 Interpolation with Splines • Easier to handle than Lagrange’s polynomials • They avoid the Runge’s phenomenon [de Boor, 2001]
  • 23. 23 Interpolation with Splines (II) • Defined as piece-wise functions • Very handy to measure the Pitch and Duration Variations ▫ Span durations can be normalized from 0 to 1
  • 24. 24 Interpolation with Splines (and III) • Defined as parametric functions ▫ One function per dimension • Pitch and Time can be compared separately • Voices can be isolated easily ▫ Using partial derivatives • More weight can be given to pitch than to time
  • 25. 25 First Implementation • Dynamic programming has been widely used with textual representations of music ▫ Levenshtein distance ▫ Needleman-Wunsch global alignment ▫ Smith-Waterman local alignment [Smith et al., 1981]  Shown to be the most effective [Hanna et al., 2007, 2008] • The symbols in the sequences are defined as n-grams of successive notes, according to the spans defined by the curve • The substitution score between two n-grams is the area between their curves’ derivatives
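Smith-Waterman local alignment with a pluggable substitution function can be sketched as follows. This is a generic textbook version with a linear gap penalty, not the authors' implementation; the toy scoring function and the single-letter symbols are placeholders. In the setting described above each symbol would be an n-gram, and the score would be derived from the area between the curves' derivatives in a way the slides do not specify.

```python
def smith_waterman(seq_a, seq_b, score, gap=1.0):
    """Return the best local alignment score between two symbol sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    h = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            h[i][j] = max(
                0.0,                                                # start a new alignment
                h[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),  # substitution
                h[i - 1][j] - gap,                                  # gap in seq_b
                h[i][j - 1] - gap,                                  # gap in seq_a
            )
            best = max(best, h[i][j])
    return best


def toy_score(a, b):
    """Placeholder substitution score: reward equal symbols, penalize others."""
    return 2.0 if a == b else -1.0


query = ["A", "B", "C"]
print(smith_waterman(query, ["X", "A", "B", "C", "Y"], toy_score))  # 6.0
print(smith_waterman(query, ["A", "B", "X", "C"], toy_score))       # 5.0
```

The floor at zero is what makes the alignment local: a query can match a fragment anywhere inside a longer piece without being penalized for the surrounding material.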
  • 26. 26 First Implementation (and II) • We used degree 3 Uniform B-Splines [de Boor, 2003] ▫ Results in spans of 4 notes (n-gram length)  Noted to be effective [Doraisamy et al., 2003] • Pitch relative to the first note's ▫ 74, 81, 72, 76 ▫ 7, -2, 2 (actually 0, 7, -2, 2) • Duration relative to the first note's ▫ 240, 480, 240, 720 ▫ 2, 1, 3 (actually 1, 2, 1, 3 or 1/7, 2/7, 1/7, 3/7)
  • 27. 27 Results • Tested with MIREX 2005 test collections ▫ Training and evaluation collections ▫ 11 queries per collection ▫ About 550 songs per collection ▫ Partially ordered lists of relevant results [Typke et al., 2005b] ▫ Effectiveness measured with Average Dynamic Recall (ADR) [Typke et al., 2006]
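The normalization shown on the slide above can be reproduced with a short sketch. The function names are hypothetical, and the `ngrams` helper assumes overlapping windows that slide one note at a time, which is how consecutive spans of a uniform cubic B-spline share control points.

```python
from fractions import Fraction


def relative_pitches(pitches):
    """Express each pitch as a semitone offset from the first note."""
    first = pitches[0]
    return [p - first for p in pitches]


def relative_durations(durations):
    """Express each duration as a multiple of the first note's duration."""
    first = durations[0]
    return [Fraction(d, first) for d in durations]


def normalized_durations(durations):
    """Express each duration as a fraction of the whole n-gram's duration."""
    total = sum(durations)
    return [Fraction(d, total) for d in durations]


def ngrams(notes, n=4):
    """Split a note sequence into overlapping windows of n notes (assumed)."""
    return [notes[i:i + n] for i in range(len(notes) - n + 1)]


print(relative_pitches([74, 81, 72, 76]))  # [0, 7, -2, 2]
print([str(r) for r in relative_durations([240, 480, 240, 720])])
# ['1', '2', '1', '3']
print([str(r) for r in normalized_durations([240, 480, 240, 720])])
# ['1/7', '2/7', '1/7', '3/7']
```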
  • 28. 28 Results (II) • Two alternatives tested ▫ Kpitch=1 and Ktime=0 ▫ Kpitch=0.75 and Ktime=0.25  Chosen by others [Doraisamy et al., 2003][Hanna et al., 2007]

      Tuning                      Collection   Avg.    Min.    Max.
      Kpitch=1, Ktime=0           Training     0.639   0.271   0.864
      Kpitch=0.75, Ktime=0.25     Training     0.643   0.312   0.864
      Kpitch=1, Ktime=0           Evaluation   0.709   0.314   0.911
      Kpitch=0.75, Ktime=0.25     Evaluation   0.710   0.314   0.911

      • We found the improvement of considering time completely incidental
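The slides do not say how Kpitch and Ktime enter the computation. One plausible reading, given that pitch and time are compared separately, is a weighted sum of the two dissimilarities. The sketch below assumes exactly that; the function name and the linear combination are hypothetical.

```python
def combined_difference(pitch_diff, time_diff, k_pitch=1.0, k_time=0.0):
    """Weighted sum of pitch and time dissimilarities (assumed combination)."""
    return k_pitch * pitch_diff + k_time * time_diff


# Hypothetical per-dimension dissimilarities for one pair of n-grams.
pitch_diff, time_diff = 2.0, 4.0

print(combined_difference(pitch_diff, time_diff, 1.0, 0.0))    # 2.0
print(combined_difference(pitch_diff, time_diff, 0.75, 0.25))  # 2.5
```

With Ktime=0 the time dimension is ignored altogether, which corresponds to the first of the two configurations tested.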
  • 29. 29 Results (and III) • Compared with the official MIREX 2005 results ▫ We would have ranked first ▫ Best ADR scores for 5 of the 11 queries

      Query               Splines   GAM       O         US        TWV       L(P3)     L(DP)     FM
      190.011.224-1.1.1   0.803     0.820     0.717     0.824     0.538     0.455     0.547     0.443
      400.065.784-1.1.1   0.879     0.846     0.619     0.624     0.861     0.614     0.839     0.679
      450.024.802-1.1.1   0.722     0.450     0.554     0.340     0.554     0.340     0.340     0.340
      600.053.475-1.1.1   0.911     0.883     0.911     0.911     0.725     0.661     0.650     0.567
      600.053.481-1.1.1   0.630     0.293     0.629     0.486     0.293     0.357     0.293     0.519
      600.054.278-1.1.1   0.810     0.674     0.785     0.864     0.731     0.660     0.527     0.418
      600.192.742-1.1.1   0.703     0.808     0.808     0.703     0.808     0.642     0.642     0.808
      700.010.059-1.1.2   0.521     0.521     0.521     0.521     0.521     0.667     0.521     0.521
      700.010.591-1.4.2   0.314     0.665     0.314     0.314     0.314     0.474     0.314     0.375
      702.001.406-1.1.1   0.689     0.566     0.874     0.675     0.387     0.722     0.606     0.469
      703.001.021-1.1.1   0.826     0.730     0.412     0.799     0.548     0.549     0.692     0.561
      Average             0.710     0.660     0.650     0.642     0.571*    0.558***  0.543**   0.518***

      bold for best per query, italics for best per system
      * for significant difference at the 0.10 level, ** at the 0.05 level and *** at the 0.01 level
  • 30. 30 Conclusions • We presented a new geometric model to compute the similarity of symbolic pieces ▫ Opens a very promising line for further research • It has a very intuitive interpretation, but a not-so-intuitive implementation • A very early prototype has been shown to perform quite well with the MIREX 2005 test collections ▫ Would have ranked first ▫ Though not significantly better than the top 3 • The modeling of time is once again shown not to improve the overall effectiveness
  • 31. 31 So Now What? • We presented a very early work • We are currently improving it ▫ As of today we reach avg. ADR scores of over 0.82 • Other considerations ▫ Local alignment? Domain-dependent tuning? ▫ Uniform B-Splines? Cardinal? Hermite? ▫ n-grams of length 4? Split at inflection points? ▫ Area between derivatives? Between the curves? ▫ Shape as a nominal variable (concave, convex)? ▫ Harmony: all possible paths? Polyphony? • We will see... ▫ Submitting 3 or 4 versions to MIREX 2010
  • 32. 32 And That’s It! Picture by 姒儿喵喵