Symbolic Melodic Similarity
• Symbolic
– No audio signals, just notes
• Melodic
– No polyphony
– Usually no harmony
– Usually a single voice
• Similarity
– Sound alike
– In terms of…good question!
– Very ill-defined: “whatever criteria you think makes two
melodies sound alike”
Typical Use Case
• A user has a (large) collection of melodies and a
query melody
• An SMS system ranks all melodies in the
collection according to their estimated melodic
similarity to the query melody
– Implicitly, similarity is considered non-binary
• Two melodies are similar to each other to some degree
• Similarity is not a binary yes-no relation
– Usually only the top-k most similar melodies are returned
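The use case above can be sketched as a ranking function. This is only an illustration: `rank`, `jaccard` and the toy collection are hypothetical names, and the Jaccard overlap of symbols is a placeholder for whatever similarity model is actually plugged in.

```python
def jaccard(a, b):
    """Placeholder similarity: overlap between the sets of symbols in two melodies."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rank(query, collection, similarity, k=10):
    """Return the k (score, name) pairs with the highest similarity to the query."""
    scored = [(similarity(query, melody), name) for name, melody in collection.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy collection (made up for illustration)
collection = {
    "tune_a": ["C4", "D4", "E4", "D4"],
    "tune_b": ["C4", "G4", "E4"],
    "tune_c": ["A3", "B3"],
}
top = rank(["C4", "D4", "E4"], collection, jaccard, k=2)
print(top)
```

Any of the similarity models discussed later could be passed in place of `jaccard`, as long as it returns a graded score rather than a yes/no answer.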
Desired: Transposition Invariance
• Ignore “irrelevant” differences in pitch
– Same note, different octave
– Same degree, different key
– Same pitch, different key
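One common way to obtain this kind of invariance, sketched here under the assumption that notes are given as MIDI pitch numbers (the function name is illustrative), is to compare the intervals between successive notes instead of the absolute pitches:

```python
def pitch_intervals(pitches):
    """Return the semitone differences between successive MIDI pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

# C4 E4 G4, and the same shape transposed up a whole tone (D4 F#4 A4)
original = [60, 64, 67]
transposed = [62, 66, 69]
print(pitch_intervals(original))    # [4, 3]
print(pitch_intervals(transposed))  # [4, 3]
```

Shifting a whole melody by an octave or into another key adds the same constant to every pitch, so the interval sequence does not change.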
Desired: Time-scale Invariance
• Ignore “irrelevant” differences in rhythm
– Same duration, different time signature
– Same duration, different tempo
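A similar trick works for rhythm. As a minimal sketch, assuming durations are given in ticks or any other absolute unit (the function name is illustrative), each duration can be expressed relative to the previous one:

```python
from fractions import Fraction

def duration_ratios(durations):
    """Return each duration divided by the previous one, as exact fractions."""
    return [Fraction(b) / Fraction(a) for a, b in zip(durations, durations[1:])]

# Quarter, quarter, half note in ticks, then the same rhythm played twice as fast
slow = [480, 480, 960]
fast = [240, 240, 480]
print(duration_ratios(slow) == duration_ratios(fast))  # True
```

A uniform change of tempo multiplies every duration by the same factor, which cancels out in the ratios.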
Desired: No Metadata
• Try not to rely on metadata, because it is not
always available
– Obvious: artist, genre, etc.
– Less-obvious: key, time signature, tempo, etc.
Models of Similarity
• Given two melodies A and B…
– Represent them so that the criteria above are met
(transposition invariance, time-scale invariance, no metadata)
– But how do we compute their similarity?
• Several types of models
– Melody as text, classic retrieval
– Melody as text, edit distance
– Melody as graphs or trees, transition similarity
– Melody as n-grams, classic retrieval
– Melody as set of points, geometric similarities
– Melody as orthogonal chains, area between chains
– Melody as curves, alignment with geometric similarities
Melody as (sequence of) Text
• Melodies are just a string of text, much like
sentences in a book
• Classic retrieval: compare frequencies of symbols
– C4 appears 5% of times, D6 does 0.38% of times, etc.
– Compare on a pitch-by-pitch basis
• Bag-of-words: loses sense of order in music
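A minimal sketch of the frequency-based comparison, assuming melodies are lists of pitch names and using cosine similarity between symbol counts (the choice of cosine similarity is an assumption, not something the slides specify):

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a, b):
    """Cosine similarity between the symbol-count vectors of two melodies."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[s] * cb[s] for s in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

melody = ["C7", "E7", "A6", "C7", "D7"]
shuffled = ["D7", "C7", "C7", "A6", "E7"]  # same notes, different order
print(cosine_similarity(melody, shuffled))
```

The shuffled melody has exactly the same symbol counts as the original, so the two are indistinguishable to this model, which is the loss of order mentioned above.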
• Edit distance: how many transformations are
needed to go from melody A to melody B?
– C7 E7 __ A6 C7
– C7 A6 D7 A6 C7
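The example above can be checked with the classic Levenshtein distance. This sketch assumes unit cost for every insertion, deletion and substitution; the function name is illustrative:

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]

a = ["C7", "E7", "A6", "C7"]
b = ["C7", "A6", "D7", "A6", "C7"]
print(edit_distance(a, b))  # 2
```

Here two operations suffice: substitute E7 with A6 and insert D7.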
Melody as Graphs
• Melodies are a graph representing the
probability of transitions from pitch to pitch
– Compare on a transition-by-transition basis
[Figure: transition graphs of two melodies over the pitches C7, D7, E7, G3 and E4; each edge is labelled with a transition probability (values such as 0.15, 0.1, 0.09, 0.02, 0.01)]
• Keeps only a partial sense of order
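A minimal sketch of the transition-based comparison, assuming each edge weight is the relative frequency of a pitch-to-pitch transition within the melody, and that graphs are compared by summing absolute differences (both choices are assumptions made for illustration):

```python
from collections import Counter

def transition_probabilities(melody):
    """Relative frequency of each (pitch, next pitch) pair in the melody."""
    counts = Counter(zip(melody, melody[1:]))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def transition_distance(a, b):
    """Sum of absolute differences between transition probabilities (0 = identical)."""
    pa, pb = transition_probabilities(a), transition_probabilities(b)
    return sum(abs(pa.get(t, 0.0) - pb.get(t, 0.0)) for t in set(pa) | set(pb))

a = ["C7", "E7", "C7"]
b = ["C7", "D7", "C7"]
print(transition_distance(a, a))
print(transition_distance(a, b))
```

Only adjacent pairs are recorded, so the model knows which pitch tends to follow which, but not where in the melody a transition happens.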
Melody as (sequence of) n-grams
• Overlapping sequences of n successive notes
– C7 E7 A6 C7 D7
• n=2 : (C7 E7) (E7 A6) (A6 C7) (C7 D7)
• n=3 : (C7 E7 A6) (E7 A6 C7) (A6 C7 D7)
• n=4 : (C7 E7 A6 C7) (E7 A6 C7 D7)
• Classic retrieval: compare frequencies of n-grams
• Captures motifs, though n is fixed
• Keeps only a partial sense of order
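Extracting overlapping n-grams takes only a few lines. A minimal sketch, assuming a melody is a list of pitch names (the function name is illustrative):

```python
def ngrams(melody, n):
    """Return all overlapping runs of n successive notes as tuples."""
    return [tuple(melody[i:i + n]) for i in range(len(melody) - n + 1)]

melody = ["C7", "E7", "A6", "C7", "D7"]
print(ngrams(melody, 2))  # 4 bigrams
print(ngrams(melody, 3))  # 3 trigrams
print(ngrams(melody, 4))  # 2 four-grams
```

The resulting tuples can then be counted and compared exactly like single symbols in the classic-retrieval approach.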
Melody as (set of) Points
• Notes are points in the pitch-time plane, spaced
according to the pitch and time representation
– Compare the arrangement of points (e.g. with the Earth Mover's Distance, EMD)
Melody as Orthogonal Chain
• Represent melodies in a stairway fashion
– Compare the area between chains, looking for the optimum
(smallest-area) placement
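A minimal sketch of the area computation, assuming each melody is a list of (pitch, duration) steps with MIDI pitches and that both chains start at time 0. It computes the area for one fixed placement only; the search for the optimum placement is left out, and all names are illustrative.

```python
def chain_area(a, b):
    """Area between two step functions given as lists of (pitch, duration)."""
    def steps(chain):
        # Turn (pitch, duration) pairs into (start, end, pitch) intervals
        t, out = 0, []
        for pitch, dur in chain:
            out.append((t, t + dur, pitch))
            t += dur
        return out

    sa, sb = steps(a), steps(b)
    area, i, j = 0, 0, 0
    while i < len(sa) and j < len(sb):
        start = max(sa[i][0], sb[j][0])
        end = min(sa[i][1], sb[j][1])
        if end > start:
            # Rectangle between the two chains over the overlapping time span
            area += (end - start) * abs(sa[i][2] - sb[j][2])
        # Advance whichever step finishes first
        if sa[i][1] <= sb[j][1]:
            i += 1
        else:
            j += 1
    return area

a = [(60, 2), (64, 2)]
b = [(60, 1), (62, 3)]
print(chain_area(a, b))  # 6
```

In this example the chains coincide during the first time unit, differ by 2 semitones for one unit, and by 2 semitones for two more units, giving an area of 6.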
Alignment of Symbols
• Independent of representation
• Alignment: similar to edit distance, but insertions,
deletions and mismatches are weighted differently
– C7 __ E7 A6 C7
– C7 A6 D7 A6 C7
(mismatch E7-D7 “costs” less than mismatch A6-E7)
• Problem: how to decide on weights
– Usually just compare numbers (pitch, octave, etc.)
– Some apply music theory
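The "just compare numbers" option can be sketched as a global alignment whose mismatch cost is the pitch difference in semitones. The gap cost of 4 and the MIDI numbering (C4 = 60, so C7 = 96) are assumptions chosen for illustration:

```python
def alignment_cost(a, b, gap=4):
    """Minimum-cost global alignment of two lists of MIDI pitches."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        curr = [i * gap]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + gap,              # gap in b (x left unmatched)
                curr[j - 1] + gap,          # gap in a (y left unmatched)
                prev[j - 1] + abs(x - y),   # mismatch weighted by pitch difference
            ))
        prev = curr
    return prev[-1]

a = [96, 100, 93, 96]      # C7 E7 A6 C7
b = [96, 93, 98, 93, 96]   # C7 A6 D7 A6 C7
print(alignment_cost(a, b))  # 6
```

With these weights the cheapest alignment pairs E7 with D7 (cost 2) and leaves the first A6 unmatched (cost 4). Pairing E7 with A6 instead would cost 7 plus a gap, which matches the remark above that the E7-D7 mismatch is the cheaper one.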
Melody as a Curve
• Distribute notes as points in the pitch-time plane,
spaced according to their relative representation
• Calculate the curve interpolating those points
Intuition Behind
• Two melodies are as similar as their curves are
• First derivative measures how much change
there is in pitch at any given point in time
– Nicely models the “change in music”
• Area between derivatives measures how much
the two melodies differ in their change of pitch
Intuition Behind
• Naturally meets all transposition and time-scale
invariance requirements
• Naturally keeps sense of order in music
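One way to write this intuition down, assuming the two melodies are interpolated by curves a(t) and b(t) over a common interval [0, T] (the notation and the use of an absolute difference are illustrative assumptions, not taken from the slides):

```latex
d(A, B) = \int_0^T \left| a'(t) - b'(t) \right| \, dt
```

If B is A transposed by a constant c, then b(t) = a(t) + c, so b'(t) = a'(t) and d(A, B) = 0: the constant offset vanishes under differentiation. Time-scale invariance would come from the representation rather than from the derivative, for instance when time is expressed relative to the previous note, so that a uniform tempo change leaves the curve unchanged.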
Interpolation
• Many (boring) details regarding type of
interpolation
– Lagrange polynomials
– Splines
• Type
• Degree
Interpolation
• We chose Uniform B-Splines of various degrees
– Changes are kept local
– Easy to compute
– Easy to differentiate (polynomials)
• Degrees 2 and 3 seemed to work better
– Computed on a span-by-span basis
• Sense of motifs, like n-grams
– Easily adaptable to differences in length
– Easily incorporate several dimensions of music beyond
pitch and time
– Easily compute similarity per dimension (partial derivative)
– Many other nice properties in terms of extreme behavior
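A minimal sketch of one span of a uniform quadratic (degree 2) B-spline, using the standard basis functions and a single coordinate (pitch as MIDI numbers). The function names are illustrative, and a full implementation would treat pitch and time as coordinates of the same curve:

```python
def spans(values, degree=2):
    """Group control points into overlapping spans of degree + 1 points."""
    n = degree + 1
    return [tuple(values[i:i + n]) for i in range(len(values) - n + 1)]

def quadratic_span(p0, p1, p2, t):
    """Value of a uniform quadratic B-spline span at t in [0, 1]."""
    return 0.5 * ((1 - t) ** 2 * p0 + (-2 * t * t + 2 * t + 1) * p1 + t * t * p2)

def quadratic_span_derivative(p0, p1, p2, t):
    """First derivative of the span; depends only on differences between points."""
    return (1 - t) * (p1 - p0) + t * (p2 - p1)

pitches = [96, 100, 93, 96, 98]             # C7 E7 A6 C7 D7
transposed = [p + 5 for p in pitches]       # same melody, 5 semitones higher
for s, s_t in zip(spans(pitches), spans(transposed)):
    # The derivative is identical for the original and the transposed span
    print(quadratic_span_derivative(*s, 0.5), quadratic_span_derivative(*s_t, 0.5))
```

Each span depends on only three consecutive notes, which is why changes stay local and why spans play a role similar to n-grams. The derivative is a polynomial in the differences between successive control points, so it is cheap to compute and unaffected by transposition.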
Implementation
• Represent melodies A and B with relative
distances between two successive notes
• Compute spline spans
– Equivalent to computing n-grams and then fitting a small
spline to each one
• Align spline spans
– Weights computed from similarity in shape of spans
Two Approaches
• Based on geometric properties and
characteristics of the collection
– Reward matches depending on average similarity
between spline spans in the collection
– Penalize insertions and deletions based on the overall
geometric change with respect to the x-axis
– Penalize mismatches based on geometric similarity of
spline spans
• Again, many (not so boring) details…
– Normalization of dimensions and weights
Two Approaches
• How common spline spans are
– Reward matches and penalize insertions and deletions
based on how common the spline spans are
• The more common, the smaller the penalty and the
higher the reward
– Penalize mismatches based on geometric similarity of
spline spans
• Roughly compare the direction of the curves
• Compute the area between derivatives
• Pitch and time together or separately
– Time can be weighted differently, since it is perceived as less important
Alignment
• Local: don’t penalize changes at the beginning
and at the end of melodies
• Global: penalize changes everywhere
• Hybrid: penalize at the beginning, but don’t
penalize at the end
– Listeners pay attention to the beginning the most
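The three variants differ only in how the dynamic-programming table is initialised and where the final score is read. A minimal sketch with fixed scores (match +2, mismatch -1, gap -1, all chosen for illustration) rather than the geometric weights described earlier:

```python
def align(a, b, mode="global", match=2, mismatch=-1, gap=-1):
    """Alignment score of a and b; mode is 'global', 'local' or 'hybrid'."""
    rows, cols = len(a) + 1, len(b) + 1
    local = mode == "local"
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global and hybrid: skipping the beginning of a melody is penalized
        for i in range(rows):
            score[i][0] = i * gap
        for j in range(cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(score[i - 1][j - 1] + s,
                       score[i - 1][j] + gap,
                       score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)  # local: an alignment may start anywhere
            score[i][j] = cell
            best = max(best, cell)
    # Global reads the last cell; local and hybrid may end anywhere
    return score[-1][-1] if mode == "global" else best

same_start = (["A", "B", "C", "X", "Y"], ["A", "B", "C", "P", "Q", "R"])
same_end = (["X", "Y", "A", "B", "C"], ["P", "A", "B", "C"])
for mode in ("global", "local", "hybrid"):
    print(mode, align(*same_start, mode=mode), align(*same_end, mode=mode))
```

When two melodies share their beginning, hybrid behaves like local alignment and ignores the differing endings. When they share only their ending, hybrid behaves like global alignment and pays for the differing beginnings.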
Evaluation of Systems
• (Kind of) yearly at MIREX
– Collection with >2000 melodies
– Select query melodies
– Run systems
– Humans listen to the retrieved melodies and judge
how similar they are to the queries
– Systems are scored accordingly
• From 0 to 1 (perfect)
• And once again, many (not at all boring!) details…
Briefly…
• Alignment works better
– Hybrid alignment works much better
• Geometric representations work better
• Incorporating rhythm in the comparison does not
improve much, and it’s computationally more
expensive
– Slightly improves ranking though
• Compare curves very roughly, without computing
areas between them
– Additionally, it is computationally more efficient
To Run
• Just download everything to the same folder
– melodyshape-1.1.jar
– commons-math3-3.2.jar
– commons-cli-1.2.jar
• And double click
– melodyshape-1.1.jar
• Can also run from the command line
• All your melodies must be in MIDI format