Using the Shape of Music to Compute the Similarity between Symbolic Musical Pieces

Julián Urbano, Juan Lloréns,
Jorge Morato and Sonia Sánchez-Cuadrado
http://julian-urbano.info
Twitter: @julian_urbano

CMMR 2010 · Málaga, Spain · June 24th

Outline
• Introduction
• Melodic Similarity Requirements
• General Solutions to the Requirements
• A Model Based on Interpolation
• Implementation and Experimental Results
• Conclusions and Future Work

Symbolic Melodic Similarity
• Given a musical piece (i.e. query), retrieve others
deemed melodically similar to it (i.e. results)
• Traditional approaches? [Typke et al., 2005a]
▫ Geometry [Ukkonen et al., 2003][Typke et al., 2004]
▫ n-grams [Uitdenbogerd et al., 1999][Doraisamy et al., 2003]
▫ Alignment [Hanna et al., 2007]
• What do we do?
▫ Use a local alignment algorithm
▫ whose symbols are n-grams
▫ according to a geometric substitution function

General Requirements
• Any Music Information Retrieval system should
meet several requirements
[Selfridge-Field, 1998][Byrd et al., 2002][Mongeau et al., 1990]
• Particularly focused on non-experts

• We simply compile and thoroughly describe
traditional, well-known requirements, mostly
related to transposition invariance
▫ Vertical requirements (i.e. pitch)
▫ Horizontal requirements (i.e. time)

Vertical Requirements
• Query [simplified riff from Layla by Derek and the Dominos]

• Octave Equivalence

Vertical Requirements (II)
• Query

• Degree Equality

Vertical Requirements (III)
• Query

• Note Equality

Vertical Requirements (IV)
• Query

• Pitch Variation

Vertical Requirements (V)
• Query

• Harmonic Similarity

Vertical Requirements (and VI)
• Voice Separation

Horizontal Requirements
• Query [simplified beginning from op.81 no.10 by S. Heller]

• Time Signature Equivalence

Horizontal Requirements (II)
• Query

• Tempo Equivalence

Horizontal Requirements (III)
• Query

• Duration Equality

Horizontal Requirements (and IV)
• Query

• Duration Variation

General Vertical Solutions
• Octave Equivalence
▫ Disregard octave number but consider relative
changes (G5 to C6 is not the same as G5 to C5).
• Degree Equality
▫ Use the degrees within the tonality
• Note Equality
▫ Use actual pitch values
• Some approaches use both, but key signature is
not always available in SMF [Hanna et al., 2007]
• The accepted solution is to consider relative
pitch differences between successive notes
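
The relative-pitch idea above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: it assumes pitches are given as MIDI note numbers (with middle C = 60, so G5 = 79, C6 = 84 and C5 = 72), and `pitch_intervals` is a hypothetical helper name.

```python
def pitch_intervals(midi_pitches):
    """Signed semitone difference between each pair of successive notes."""
    return [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]

# Hypothetical three-note melody and two transposed copies of it.
melody = [67, 72, 76]
up_a_fourth = [p + 5 for p in melody]
up_an_octave = [p + 12 for p in melody]

# Transposition leaves the interval sequence untouched.
same_shape = (pitch_intervals(melody)
              == pitch_intervals(up_a_fourth)
              == pitch_intervals(up_an_octave))

# Direction still matters: G5 to C6 rises, G5 to C5 falls.
g5_to_c6 = pitch_intervals([79, 84])
g5_to_c5 = pitch_intervals([79, 72])
```

Because only differences are kept, no key signature is needed, which matches the remark that it is not always available in SMF.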

General Horizontal Solutions
• Time Signature Equivalence
▫ Just ignore it
• Tempo Equivalence
▫ Use actual note durations
• Duration Equality
▫ Use score durations
• Again, this information is not mandatory in
SMF, and users with different expertise would
prefer different approaches
• It is usual to just ignore time altogether, or use
the duration ratio between successive notes
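
The duration-ratio option can be illustrated as follows. This is a sketch under assumptions: durations are taken to be positive integer tick counts, and `duration_ratios` is a hypothetical helper name.

```python
from fractions import Fraction

def duration_ratios(durations):
    """Ratio of each note's duration to the duration of the previous note."""
    return [Fraction(b, a) for a, b in zip(durations, durations[1:])]

# Hypothetical durations in ticks, and the same rhythm played twice as slowly.
rhythm = [240, 480, 240, 720]
slower = [2 * d for d in rhythm]

ratios = duration_ratios(rhythm)
tempo_invariant = ratios == duration_ratios(slower)
```

Exact fractions are used instead of floats so that equal rhythms compare equal without rounding concerns.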

A Model based on Interpolation
• Consider the time-pitch plane
• Arrange the notes as points in the plane,
according to their pitch and duration
• With different voices, get new pitch-dimensions
sharing the same time dimension

• Define the curve Ci(t) as the one interpolating
the notes of the i-th voice (pitch-dimension)
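
As a rough illustration of placing notes in the time-pitch plane, the sketch below turns one voice into (time, pitch) points. It assumes, beyond what the slides state, that each note starts when the previous one ends (no rests) and that a note is placed at its onset; `notes_to_points` is a hypothetical name.

```python
from itertools import accumulate

def notes_to_points(pitches, durations):
    """Return (onset, pitch) points for a single voice.

    Assumption: notes are consecutive, so each onset is the sum of
    the durations of all preceding notes.
    """
    if len(pitches) != len(durations):
        raise ValueError("pitches and durations must have the same length")
    ends = list(accumulate(durations))
    onsets = [0] + ends[:-1]
    return list(zip(onsets, pitches))

# Hypothetical voice: MIDI pitches and durations in ticks.
points = notes_to_points([74, 81, 72, 76], [240, 480, 240, 720])
```

A curve fitted through such points, one per voice, plays the role of Ci(t).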

A Model based on Interpolation (II)

A Model based on Interpolation (and III)
• The similarity of two pieces is thought of as
their similarity in shape

• Most requirements are directly met
▫ Neither pitch nor time invariants change the
shape of the curve
▫ Pitch and Duration Variations can be measured
analytically

Measure of Similarity
• Consider the curves as polynomials
▫ $C(t) = a_n t^n + a_{n-1} t^{n-1} + \dots + a_1 t + a_0$

• The first derivative measures how much the
shape is changing at any time
• The shape dissimilarity between two curves
(songs) can be measured as the area between
their first derivatives
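
One way to read "area between the first derivatives" is the integral of |C'(t) - D'(t)| over a common interval. The slides do not give the exact formula, so the sketch below is an assumption: it uses the absolute difference, the interval [0, 1], and a simple midpoint rule; all function names are hypothetical.

```python
def derivative(coeffs):
    """Derivative of a polynomial; coeffs[i] is the coefficient of t**i."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, t):
    """Evaluate a polynomial at t using Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result

def shape_difference(c, d, start=0.0, end=1.0, steps=1000):
    """Approximate the area between the first derivatives of c and d."""
    dc, dd = derivative(c), derivative(d)
    width = (end - start) / steps
    total = 0.0
    for k in range(steps):
        t = start + (k + 0.5) * width  # midpoint of the k-th slice
        total += abs(evaluate(dc, t) - evaluate(dd, t)) * width
    return total

curve = [60, 0, 3]        # 60 + 3t^2
transposed = [72, 0, 3]   # same shape, one octave higher
steeper = [60, 2, 3]      # extra linear term changes the slope by 2
```

Under this reading, a transposed copy differs only in the constant term, which the derivative discards, so its difference from the original is zero.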

It Is a Metric
• Non-negativity
▫ diff(C, D) ≥ 0
• Identity of indiscernibles
▫ diff(C, D) = 0 ⇔ C = D
• Symmetry
▫ diff(C, D) = diff(D, C)
• Triangle inequality
▫ diff(C, E) ≤ diff(C, D) + diff(D, E)

• So we could use vantage objects [Bozkaya et al., 1999]

Interpolation with Splines
• Easier to handle than Lagrange’s polynomials
• They avoid Runge’s phenomenon [de Boor, 2001]

Interpolation with Splines (II)
• Defined as piece-wise functions

• Very handy to measure the Pitch and Duration
Variations
▫ Span durations can be normalized from 0 to 1
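
To make the piece-wise view concrete, here is a sketch of a single span with its parameter normalized to [0, 1]. It assumes the uniform cubic B-spline that the deck adopts later, written with its standard basis functions; `bspline_span` and `bspline_span_derivative` are hypothetical names, and the four values are treated as control points of one dimension (for example, pitch).

```python
def bspline_span(p, t):
    """Value of one uniform cubic B-spline span at t in [0, 1].

    p holds the four control values that define the span.
    """
    p0, p1, p2, p3 = p
    return ((1 - t) ** 3 * p0
            + (3 * t ** 3 - 6 * t ** 2 + 4) * p1
            + (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) * p2
            + t ** 3 * p3) / 6.0

def bspline_span_derivative(p, t):
    """First derivative of the span with respect to t."""
    p0, p1, p2, p3 = p
    return (-(1 - t) ** 2 * p0
            + (3 * t ** 2 - 4 * t) * p1
            + (-3 * t ** 2 + 2 * t + 1) * p2
            + t ** 2 * p3) / 2.0

span = [0, 7, -2, 2]  # hypothetical relative pitches of four notes
start_value = bspline_span(span, 0.0)
start_slope = bspline_span_derivative(span, 0.0)
end_slope = bspline_span_derivative(span, 1.0)
```

Since every span lives on the same [0, 1] parameter range, two spans can be compared directly regardless of their absolute durations.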

Interpolation with Splines (and III)
• Defined as parametric functions
▫ One function per dimension
• Pitch and Time can be compared separately
• Voices can be isolated easily
▫ Using partial derivatives
• More weight can be given to pitch than to time
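
Comparing the dimensions separately suggests combining two partial differences with weights. The slides do not state how the weights are applied, so the linear combination below is an assumption; `k_pitch` and `k_time` mirror the Kpitch and Ktime parameters reported in the results, and `combined_difference` is a hypothetical name.

```python
def combined_difference(pitch_diff, time_diff, k_pitch=0.75, k_time=0.25):
    """Weighted sum of the pitch and time differences (assumed form)."""
    if k_pitch < 0 or k_time < 0:
        raise ValueError("weights must be non-negative")
    return k_pitch * pitch_diff + k_time * time_diff

# Hypothetical partial differences for one pair of spans.
pitch_only = combined_difference(2.0, 4.0, k_pitch=1.0, k_time=0.0)
mixed = combined_difference(2.0, 4.0)
```

Setting `k_time` to zero reduces the measure to pitch alone.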

First Implementation
• Dynamic programming has been widely used
with textual representations of music
▫ Levenshtein distance
▫ Needleman-Wunsch global alignment
▫ Smith-Waterman local alignment [Smith et al., 1981]
▪ Shown to be the most effective [Hanna et al., 2007, 2008]
• The symbols in the sequences are defined as
n-grams of successive notes, according to the
spans defined by the curve
• The substitution score between two n-grams
is the area between their curves’ derivatives
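
The alignment step can be sketched with a textbook Smith-Waterman scoring loop over sequences of n-grams. This is not the authors' implementation: the gap penalty is arbitrary, and the toy `exact_match` score stands in for the geometric substitution function, since the slides do not say how the area between derivatives (a dissimilarity) is turned into an alignment score.

```python
def smith_waterman(seq_a, seq_b, score, gap=-1.0):
    """Best local alignment score between two symbol sequences."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0.0] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                0.0,  # start a new local alignment here
                table[i - 1][j - 1] + score(seq_a[i - 1], seq_b[j - 1]),
                table[i - 1][j] + gap,  # gap in seq_b
                table[i][j - 1] + gap,  # gap in seq_a
            )
            best = max(best, table[i][j])
    return best

def exact_match(a, b):
    """Toy substitution score: reward identical n-grams, penalize others."""
    return 2.0 if a == b else -1.0

# Hypothetical pieces as sequences of 4-note n-grams (relative pitches).
piece_a = [(0, 2, 4, 5), (0, 7, -2, 2), (0, -1, -3, 0), (0, 5, 5, 7)]
piece_b = [(0, 1, 1, 1), (0, 7, -2, 2), (0, -1, -3, 0)]
best_score = smith_waterman(piece_a, piece_b, exact_match)
```

Because the score function is a parameter, a geometric substitution score can replace `exact_match` without touching the alignment loop.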

First Implementation (and II)
• We used degree 3 Uniform B-Splines [de Boor, 2003]
▫ Results in spans of 4 notes (n-gram length)
▪ Noted to be effective [Doraisamy et al., 2003]
• Pitch relative to the first note’s
▫ 74, 81, 72, 76
▫ 7, -2, 2 (actually 0, 7, -2, 2)
• Duration relative to the first note’s
▫ 240, 480, 240, 720
▫ 2, 1, 3 (actually 1, 2, 1, 3 or 1/7, 2/7, 1/7, 3/7)

Results
• Tested with the MIREX 2005 test collections
▫ Training and evaluation collections
▫ 11 queries per collection
▫ About 550 songs per collection
▫ Partially ordered lists of relevant results [Typke et al., 2005b]
▫ Effectiveness measured with ADR [Typke et al., 2006]

Results (II)
• Two alternatives tested
▫ Kpitch=1 and Ktime=0
▫ Kpitch=0.75 and Ktime=0.25
▪ Chosen by others [Doraisamy et al., 2003][Hanna et al., 2007]

Tuning                     Collection   Avg.    Min.    Max.
Kpitch=1, Ktime=0          Training     0.639   0.271   0.864
Kpitch=0.75, Ktime=0.25    Training     0.643   0.312   0.864
Kpitch=1, Ktime=0          Evaluation   0.709   0.314   0.911
Kpitch=0.75, Ktime=0.25    Evaluation   0.710   0.314   0.911

• We found the improvement from considering time
to be completely incidental

Results (and III)
• Compared with the official MIREX 2005 results
▫ We would have ranked first
▫ Best ADR scores for 5 of the 11 queries

Query               Splines   GAM       O         US        TWV       L(P3)     L(DP)     FM
190.011.224-1.1.1   0.803     0.820     0.717     0.824     0.538     0.455     0.547     0.443
400.065.784-1.1.1   0.879     0.846     0.619     0.624     0.861     0.614     0.839     0.679
450.024.802-1.1.1   0.722     0.450     0.554     0.340     0.554     0.340     0.340     0.340
600.053.475-1.1.1   0.911     0.883     0.911     0.911     0.725     0.661     0.650     0.567
600.053.481-1.1.1   0.630     0.293     0.629     0.486     0.293     0.357     0.293     0.519
600.054.278-1.1.1   0.810     0.674     0.785     0.864     0.731     0.660     0.527     0.418
600.192.742-1.1.1   0.703     0.808     0.808     0.703     0.808     0.642     0.642     0.808
700.010.059-1.1.2   0.521     0.521     0.521     0.521     0.521     0.667     0.521     0.521
700.010.591-1.4.2   0.314     0.665     0.314     0.314     0.314     0.474     0.314     0.375
702.001.406-1.1.1   0.689     0.566     0.874     0.675     0.387     0.722     0.606     0.469
703.001.021-1.1.1   0.826     0.730     0.412     0.799     0.548     0.549     0.692     0.561
Average             0.710     0.660     0.650     0.642     0.571*    0.558***  0.543**   0.518***

bold for best per query, italics for best per system
* for significant difference at the 0.10 level, ** at the 0.05 level and *** at the 0.01 level

Conclusions
• We presented a new geometric model to
compute the similarity of symbolic pieces
▫ Opens a very promising line for further research
• It has a very intuitive interpretation, but a
not-so-intuitive implementation
• A very early prototype has been shown to perform
quite well with the MIREX 2005 test collections
▫ Would have ranked first
▫ Though not significantly better than the top 3
• The modeling of time is once again shown not to
improve the overall effectiveness

So Now What?
• We presented a very early work
• We are currently improving it
▫ As of today we reach avg. ADR scores of over 0.82
• Other considerations
▫ Local alignment? Domain-dependent tuning?
▫ Uniform B-Splines? Cardinal? Hermite?
▫ n-grams of length 4? Split at inflection points?
▫ Area between derivatives? Between the curves?
▫ Shape as a nominal variable (concave, convex)?
▫ Harmony: all possible paths? Polyphony?
• We will see...
▫ Submitting 3 or 4 versions to MIREX 2010

And That’s It!

Picture by 姒儿喵喵
