Symbolic Melodic Similarity
(through Shape Similarity)
Julián Urbano

Barcelona, November 12th 2013

Spam
• @julian_urbano
• PhD, Computer Science
  – (Evaluation in) (Music) Information Retrieval
• Postdoctoral researcher
  – Music Technology Group
  – Universitat Pompeu Fabra

Outline
• Symbolic Melodic Similarity
• Representing Melodies
• Comparing Melodies
• Melodic Similarity through Shape Similarity
• Results
• Demo

SYMBOLIC MELODIC SIMILARITY

Symbolic Melodic Similarity
• Symbolic
  – No audio signals, just notes
• Melodic
  – No polyphony
  – Usually no harmony
  – Usually a single voice
• Similarity
  – Sound alike
  – In terms of… good question!
  – Very ill-defined: “whatever criteria you think make two melodies sound alike”

Typical Use Case
• A user has a (large) collection of melodies and a query melody
• An SMS (symbolic melodic similarity) system ranks all melodies in the collection according to their estimated melodic similarity to the query melody
  – Implicitly, similarity is considered non-binary
    • Two melodies are similar to each other to some degree
    • Similarity is not a binary yes-no relation
  – Usually just the top-k most similar are returned

Desired: Transposition Invariance
• Ignore “irrelevant” differences in pitch
  – Same note, different octave
  – Same degree, different key
  – Same pitch, different key

Desired: Time-scale Invariance
• Ignore “irrelevant” differences in rhythm
  – Same duration, different time signature
  – Same duration, different tempo

Desired: No Metadata
• Try not to rely on metadata, because it is not always available
  – Obvious: artist, genre, etc.
  – Less obvious: key, time signature, tempo, etc.

REPRESENTING AND COMPARING MELODIES

(Bad) Representations of Melodies
• Transposition Invariance
  – Pitches: from 0 to 127, in semitones
    • C4 A4 A#5 = 60 69 82
  – Directed interval with tonic, in semitones
    • Tonic = C4: D4 A4 E3 = +2 +9 -8
  – Ignore octaves
    • C4 D4 = 2, C4 E4 = 4, C4 E3 = 4
  – Parsons Code
    • C4 D4 D4 E3 = U R D

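The first and last representations on this slide can be sketched in a few lines of Python. This is only an illustration: the helper names `note_to_midi` and `parsons_code` are made up here, and the octave convention (C4 = 60) is taken from the slide's own example. Absolute pitch numbers change under transposition, while Parsons code survives transposition but discards the size of each interval.

```python
import re

# Semitone offset of each natural note within an octave.
_NATURALS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def note_to_midi(name):
    """Convert a name such as 'C4' or 'A#5' to a pitch number (C4 = 60)."""
    m = re.fullmatch(r"([A-G])([#b]?)(-?\d+)", name)
    if m is None:
        raise ValueError(f"not a note name: {name!r}")
    letter, accidental, octave = m.groups()
    shift = {"#": 1, "b": -1, "": 0}[accidental]
    return 12 * (int(octave) + 1) + _NATURALS[letter] + shift

def parsons_code(pitches):
    """Return U (up), D (down) or R (repeat) for each pair of successive pitches."""
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        steps.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(steps)

melody = [note_to_midi(n) for n in ["C4", "D4", "D4", "E3"]]
print(melody)                # [60, 62, 62, 52]
print(parsons_code(melody))  # URD
```
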
(Bad) Representations of Melodies
• Time-scale Invariance
  – Actual duration, in milliseconds
    • 250ms 187ms
  – Metric duration
    • Crotchet quaver = 1 1/2

(Good) Representation of Melodies
• Represent relation between successive notes
  – Directed pitch interval
    • C4 D4 E4 E3 = +2 +2 -12
  – Duration ratio
    • 250ms 500ms 190ms 570ms = 2 0.38 3
    • Crotchet quaver quaver whole = 0.5 1 8
• Allows us to detect and keep changes locally
• Example from the slide's score (figure not reproduced)
  – Pitch intervals: +3 +2 +3 -3 -2 +2 +7 -2 -3 -4 +2 -5 +3 +2 +3 -3 -2
  – Duration ratios: 1 1 1 1 1 6 2/3 1 1 1 1 1/2 1 1 1 1 1

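A minimal sketch of the relative representation described on this slide, assuming pitches are given as semitone numbers and durations in any consistent unit. The function names are illustrative, not taken from any library. The point is that transposing the melody or changing its tempo leaves both sequences unchanged.

```python
def pitch_intervals(pitches):
    """Directed interval, in semitones, between each pair of successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def duration_ratios(durations):
    """Ratio of each note's duration to the duration of the previous note."""
    return [b / a for a, b in zip(durations, durations[1:])]

pitches = [60, 62, 64, 52]            # C4 D4 E4 E3
durations = [250, 500, 190, 570]      # milliseconds

print(pitch_intervals(pitches))                            # [2, 2, -12]
print([round(r, 2) for r in duration_ratios(durations)])   # [2.0, 0.38, 3.0]
```
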
Models of Similarity
• Given two melodies A and B…
  – Represent them following these criteria
  – But how do we compute their similarity?
• Several types of models
  – Melody as text, classic retrieval
  – Melody as text, edit distance
  – Melody as graphs or trees, transition similarity
  – Melody as n-grams, classic retrieval
  – Melody as set of points, geometric similarities
  – Melody as orthogonal chains, area between chains
  – Melody as curves, alignment with geometric similarities

Melody as (sequence of) Text
• Melodies are just a string of text, much like sentences in a book
• Classic retrieval: compare frequencies of symbols
  – C4 appears 5% of the time, D6 0.38% of the time, etc.
  – Compare on a pitch-by-pitch basis
• Bag-of-words: loses sense of order in music

Melody as (sequence of) Text
• Melodies are just a string of text, much like sentences in a book
• Edit distance: how many transformations are needed to go from melody A to melody B?
  – C7 E7 A6 C7
  – C7 A6 D7 A6 C7

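The edit distance idea can be sketched with the classic Levenshtein recurrence, where every insertion, deletion and substitution costs 1. This is a generic textbook version, not necessarily the variant any particular system uses. The two sequences are the ones from the slide.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))        # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        cur = [i]                         # deleting the first i symbols of a
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # match or substitute
        prev = cur
    return prev[-1]

a = ["C7", "E7", "A6", "C7"]
b = ["C7", "A6", "D7", "A6", "C7"]
print(edit_distance(a, b))  # 2 (insert A6, substitute E7 with D7)
```
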
Melody as Graphs
• Melodies are a graph representing the probability of transitions from pitch to pitch
  – Compare on a transition-by-transition basis
• [Figure not reproduced: two transition graphs over the pitches D7, E7, C7, E4 and G3, with a probability on each edge]
• Sort-of sense of order

Melody as (sequence of) n-grams
• Overlapping sequences of n successive notes
  – C7 E7 A6 C7 D7
    • n=2 : (C7 E7) (E7 A6) (A6 C7) (C7 D7)
    • n=3 : (C7 E7 A6) (E7 A6 C7) (A6 C7 D7)
    • n=4 : (C7 E7 A6 C7) (E7 A6 C7 D7)
• Classic retrieval: compare frequencies of n-grams
• Motifs, though n is fixed
• Sort-of sense of order

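A short sketch of n-gram extraction, reproducing the slide's example. The overlap count at the end is just one simple, assumed way to "compare frequencies of n-grams"; real systems typically use weighted retrieval models instead.

```python
from collections import Counter

def ngrams(notes, n):
    """All overlapping runs of n successive notes, in order."""
    return [tuple(notes[i:i + n]) for i in range(len(notes) - n + 1)]

def ngram_overlap(a, b, n):
    """Number of n-grams the two melodies share, counting repetitions."""
    shared = Counter(ngrams(a, n)) & Counter(ngrams(b, n))
    return sum(shared.values())

melody = ["C7", "E7", "A6", "C7", "D7"]
for n in (2, 3, 4):
    print(n, ngrams(melody, n))

print(ngram_overlap(melody, ["E7", "A6", "C7"], 2))  # 2
```
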
Melody as (set of) Points
• Notes are points in the pitch-time plane, spaced according to the pitch and time representation
  – Compare disposition of points (e.g. Earth Mover's Distance, EMD)

Melody as Orthogonal Chain
• Represent melodies in a stairway fashion
  – Compare area between chains, looking for optimum

Alignment of Symbols
• Independent of representation
• Alignment: similar to edit distance, but weights insertions, deletions and mismatches differently
  – C7 E7 A6 C7
  – C7 A6 D7 A6 C7
  – (mismatch E7-D7 “costs” less than mismatch A6-E7)
• Problem: how to decide on weights
  – Usually just compare numbers (pitch, octave, etc.)
  – Some apply music theory

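To make the weighting idea concrete, here is a sketch in which the mismatch cost grows with the pitch difference. The specific weights are assumptions chosen for illustration (gap cost 1, mismatch cost equal to the semitone difference divided by 12 and capped at 1), and pitches are written as semitone numbers with C4 = 60, so C7 = 96, D7 = 98, E7 = 100 and A6 = 93.

```python
def substitution_cost(x, y, scale=12.0):
    """Assumed weighting: cost grows with pitch distance, capped at 1."""
    return min(1.0, abs(x - y) / scale)

def weighted_distance(a, b, gap=1.0):
    """Edit-distance recurrence with pitch-dependent mismatch costs."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, x in enumerate(a, start=1):
        cur = [i * gap]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + gap,                          # deletion
                           cur[j - 1] + gap,                       # insertion
                           prev[j - 1] + substitution_cost(x, y))) # (mis)match
        prev = cur
    return prev[-1]

a = [96, 100, 93, 96]        # C7 E7 A6 C7
b = [96, 93, 98, 93, 96]     # C7 A6 D7 A6 C7

print(substitution_cost(100, 98) < substitution_cost(93, 100))  # True
print(round(weighted_distance(a, b), 3))                        # 1.167
```

With these weights the cheapest alignment inserts one A6 and substitutes E7 with D7, so the total is one gap plus 2/12.
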
MELODIC SIMILARITY THROUGH SHAPE SIMILARITY

Melody as a Curve
• Distribute notes as points in the pitch-time plane, spaced according to their relative representation
• Calculate the curve interpolating those points

Intuition Behind
• Two melodies are as similar as their curves are
• First derivative measures how much change there is in pitch at any given point in time
  – Nicely models the “change in music”
• Area between derivatives measures how much the two melodies differ in their change of pitch

Intuition Behind
• Naturally meets all transposition and time-scale invariance requirements
• Naturally keeps sense of order in music

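The intuition can be written down as a formula. The notation is an assumption made here for illustration: A(t) and B(t) are the pitch curves of the two melodies over a common parameter t in [0, T], and d is the dissimilarity (smaller means more similar). The second line shows why transposition drops out: a constant shift in pitch vanishes under differentiation.

```latex
% Dissimilarity as the area between the first derivatives of the two curves
d(A, B) = \int_{0}^{T} \left| A'(t) - B'(t) \right| \, dt

% Transposing B by a constant c semitones does not change its derivative
\frac{d}{dt}\bigl( B(t) + c \bigr) = B'(t)
\quad\Longrightarrow\quad
d(A, B + c) = d(A, B)
```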
Interpolation
• Many (boring) details regarding type of interpolation
  – Lagrange polynomials
  – Splines
    • Type
    • Degree

Interpolation
• We chose Uniform B-Splines of various degrees
  – Changes are kept local
  – Easy to compute
  – Easy to differentiate (polynomials)
    • Degrees 2 and 3 seemed to work better
  – Computed on a span-by-span basis
    • Sense of motifs, like n-grams
  – Easily adaptable to differences in length
  – Easily incorporate several dimensions of music beyond pitch and time
  – Easily compute similarity per dimension (partial derivative)
  – Many other nice properties in terms of extreme behavior

Implementation
• Represent melodies A and B with relative distances between two successive notes
• Compute spline spans
  – Same as computing n-grams and then smaller splines
• Align spline spans
  – Weights computed from similarity in shape of spans

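As a rough sketch of how the shape of two spans might be compared, the code below takes one span of a uniform quadratic (degree 2) B-spline, defined by three successive control points, and computes the area between the derivatives of two such spans in closed form. It is a simplification under stated assumptions, not the MelodyShape implementation: only the pitch dimension is used, the integral is taken over the spline parameter t in [0, 1], and the function names are invented here.

```python
def span_derivative_ends(p0, p1, p2):
    """Derivative of a uniform quadratic B-spline span at t=0 and t=1.

    The span is B(t) = ((1-t)^2 p0 + (-2t^2 + 2t + 1) p1 + t^2 p2) / 2,
    so B'(t) = (1-t)(p1 - p0) + t(p2 - p1): linear in t and dependent
    only on the differences between successive control points.
    """
    return p1 - p0, p2 - p1

def area_between_derivatives(span_a, span_b):
    """Integrate |A'(t) - B'(t)| over t in [0, 1] for two quadratic spans."""
    a0, a1 = span_derivative_ends(*span_a)
    b0, b1 = span_derivative_ends(*span_b)
    d0, d1 = a0 - b0, a1 - b1          # the difference is also linear in t
    if d0 * d1 >= 0:                   # no sign change: a trapezoid
        return (abs(d0) + abs(d1)) / 2
    # Sign change inside the span: two triangles meeting at the zero crossing.
    return (d0 * d0 + d1 * d1) / (2 * (abs(d0) + abs(d1)))

rising = (60, 62, 64)
print(area_between_derivatives(rising, (65, 67, 69)))  # 0.0 (transposed copy)
print(area_between_derivatives(rising, (60, 62, 60)))  # 2.0
```

Because the derivative depends only on differences between control points, a transposed copy of a span has zero area against the original, which matches the invariance argument made earlier in the talk.
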
Two Approaches
• Based on geometric properties and characteristics of the collection
  – Reward matches depending on average similarity between spline spans in the collection
  – Penalize insertions and deletions based on geometric overall change with respect to x-axis
  – Penalize mismatches based on geometric similarity of spline spans
• Again, many (not so boring) details…
  – Normalization of dimensions and weights

Two Approaches
• How common spline spans are
  – Reward matches and penalize insertions and deletions based on how common the spline spans are
    • The more common, the smaller the penalty and the higher the reward
  – Penalize mismatches based on geometric similarity of spline spans
    • Roughly compare the direction of the curves
    • Compute the area between derivatives
• Pitch and time together or separately
  – Weight time differently, perceived as less important

Alignment
• Local: don’t penalize changes at the beginning and at the end of melodies
• Global: penalize changes everywhere
• Hybrid: penalize at the beginning, but don’t penalize at the end
  – Listeners pay attention to the beginning the most

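The three alignment modes can be sketched with one dynamic-programming routine. This is one plausible reading of the slide, not the author's code: global is Needleman-Wunsch, local is Smith-Waterman, and hybrid is assumed here to charge for gaps at the start (like global) while letting the alignment stop anywhere (like local). The scores of +1 for a match and -1 for a mismatch or gap are placeholders.

```python
def align(a, b, mode="global", match=1.0, mismatch=-1.0, gap=-1.0):
    """Alignment score of sequences a and b in 'global', 'local' or 'hybrid' mode."""
    if mode not in ("global", "local", "hybrid"):
        raise ValueError(f"unknown mode: {mode!r}")
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    if mode != "local":
        # Global and hybrid pay for skipping symbols at the beginning.
        for i in range(1, n + 1):
            H[i][0] = i * gap
        for j in range(1, m + 1):
            H[0][j] = j * gap
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score = max(H[i - 1][j - 1] + s, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if mode == "local":
                score = max(0.0, score)   # local may restart anywhere
            H[i][j] = score
            best = max(best, score)
    # Global must consume both sequences; local and hybrid may stop early.
    return H[n][m] if mode == "global" else best

query = [2, 2, -1, 3]                 # e.g. a sequence of pitch intervals
extra_tail = query + [5, 5, 5]        # same beginning, extra material at the end
extra_head = [5, 5, 5] + query        # extra material at the beginning

for mode in ("global", "local", "hybrid"):
    print(mode, align(query, extra_tail, mode), align(query, extra_head, mode))
# global 1.0 1.0
# local 4.0 4.0
# hybrid 4.0 1.0
```

In this toy example only the hybrid mode distinguishes the two candidates: extra notes at the end are free, extra notes at the beginning are penalized.
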
DOES ALL OF THIS EVEN WORK?

Evaluation of Systems
• (Kind of) yearly at MIREX
  – Collection with >2000 melodies
  – Select query melodies
  – Run systems
  – Humans listen to the retrieved melodies and judge how similar they are to the queries
  – Systems are scored accordingly
    • From 0 to 1 (perfect)
• And once again, many (not at all boring!) details…

2005
• Our algorithms didn’t participate, but anyway…
• [Figure not reproduced; the best participant is marked]

2010, 2011, 2012, 2013
• [One figure per year, not reproduced]

Briefly…
• Alignment works better
  – Hybrid alignment works much better
• Geometric representations work better
• Incorporating rhythm in the comparison does not improve much, and it’s computationally more expensive
  – Slightly improves ranking though
• Compare curves very roughly, without computing areas between them
  – Additionally, it is computationally more efficient

HANDS-ON DEMO

MelodyShape
• A Java library and tool implementing all the algorithms submitted to MIREX
• https://github.com/julian-urbano/melodyshape

Two Libraries Needed
• [Screenshots not reproduced: the two libraries can be downloaded “here ... or here”]

To Run
• Just download everything to the same folder
  – melodyshape-1.1.jar
  – commons-math3-3.2.jar
  – commons-cli-1.2.jar
• And double click
  – melodyshape-1.1.jar
• Can also run from the command line
• All your melodies must be in MIDI format

