This project involved the design and implementation of an automation tool that searches a database of sound models for a given audio input query. The algorithm points to the sound model that can best model, or produce sounds very similar to, the input sound query. This is a growing application in the games and music-production industries, where there is an increasing need for easy access to sounds without having to develop or code them every time they are needed: searching is always easier than developing from scratch. This project was performed at the Interactive Digital Media Institute, Singapore.
3. Background – Recorded Sound vs. Sound Model
[Figure: side-by-side comparison of a recorded sound and a sound model.]
4. Background - Objective
[Diagram: sounds and their metadata are stored as sound models in a DB.
An input sound query is passed to the automation tool's algorithm,
which returns a sound model as the result.]
5. Background - Application Areas
Application of the tool: enables access to shared sound-model resources
created by various "music production" communities on such a platform.
Game-production environments
Musicians: new, interesting sounds for games; design of new instruments
with interactive parameters
Interactive media enthusiasts: communicate with sounds in easier ways
6. Background – Concepts involved
Querying: Query by Example Music Retrieval (QEMR)
Storage: Sound Model Databases
Analysis: Feature Vectors, Audio Segmentation
Analysis, result: Data Clustering (search optimization)
Result: Nearest Neighbors, Euclidean Distance
7. Sound Models - Characteristics
Characteristics
Class of Variable
sounds durations
Lesser More
memory interactivity
Descriptive
of sounds
8. Sound Models - Challenges
Challenges
Delta-parameter problem
One-to-many, Many-to-one mapping problem
Hysteresis problem
Infinite versus finite sound problem
Silent sound problem
9. System design and Implementation - Architecture
[Diagram: system architecture.
Training phase: sound models -> sound generator -> sounds (S1–S4) ->
analysis -> n-dimensional feature space -> k-means data clustering ->
sound model database (snddb).
Query phase: sound queries (Q1–Q3) -> analysis -> nearest cluster
centroid -> n-nearest neighbors -> densest sub-cluster -> sound model
result.]
10. System design and Implementation – A.S
Audio Segmentation: handles detection of onsets and offsets in the
sound file, detecting the number of events (pitch, beat, amplitude
peaks) present in the audio file.
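The slides do not specify the onset-detection method used; as an illustration only, the sketch below flags onsets wherever the short-time energy of the signal jumps sharply between frames (the function names and thresholds are assumptions, not the project's actual code).

```python
import numpy as np

def detect_onsets(signal, frame_size=1024, hop=512, threshold=2.0):
    """Energy-based onset detector (illustrative sketch): flags a frame as
    an onset when its short-time energy exceeds `threshold` times the
    energy of the previous frame."""
    energies = []
    for start in range(0, len(signal) - frame_size, hop):
        frame = signal[start:start + frame_size]
        energies.append(np.sum(frame ** 2))
    energies = np.array(energies)
    onsets = []
    for i in range(1, len(energies)):
        prev = energies[i - 1] + 1e-10   # avoid division by zero in silence
        if energies[i] / prev > threshold:
            onsets.append(i * hop)        # sample index where the frame starts
    return onsets

# Synthetic example: one second of silence followed by a 440 Hz tone,
# so a single onset is expected near sample 8000.
sr = 8000
sig = np.concatenate([np.zeros(sr),
                      np.sin(2 * np.pi * 440 * np.arange(sr) / sr)])
onsets = detect_onsets(sig)
```

Counting the detected onsets gives the number of amplitude events; real systems typically also track pitch and beat changes, as the slide notes.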
11. System design and Implementation – F.E
Feature Extraction: passes the sound file through several algorithms
that compute timbre (spectral), rhythm and pitch features for the
sound. These features are normalized, and the resulting feature vector
is mapped onto the vector space (where the data are clustered by
k-means).
G. Tzanetakis and P. Cook, 2002
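As one concrete example of a timbre feature from the Tzanetakis and Cook feature set, the sketch below computes the spectral centroid (the magnitude-weighted mean of the FFT bin frequencies) and min-max normalizes feature vectors; the function names are illustrative, not the project's actual API.

```python
import numpy as np

def spectral_centroid(frame, sr):
    """Spectral centroid: magnitude-weighted mean of FFT bin frequencies.
    Brighter, higher-pitched sounds yield a higher centroid."""
    mags = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return np.sum(freqs * mags) / (np.sum(mags) + 1e-10)

def normalize(vectors):
    """Min-max normalize each feature dimension to [0, 1] before
    mapping the vectors onto the feature space."""
    vectors = np.asarray(vectors, dtype=float)
    lo, hi = vectors.min(axis=0), vectors.max(axis=0)
    return (vectors - lo) / (hi - lo + 1e-10)

# A 200 Hz tone should have a much lower centroid than a 2000 Hz tone.
sr = 8000
t = np.arange(sr) / sr
c_low = spectral_centroid(np.sin(2 * np.pi * 200 * t), sr)
c_high = spectral_centroid(np.sin(2 * np.pi * 2000 * t), sr)
```

Normalization matters here because features on very different scales (Hz, MFCC coefficients, RMS energy) would otherwise dominate the Euclidean distances used later.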
12. System design and Implementation - DB
Sound Model Database: the storage area for the metadata extracted from
the sound files in the previous step. The feature vectors are
clustered / classified and given symbols before being added to the
database.
Database name: snddb
Table name: soundanalysis
Columns: sndindex (primary key), sndmodel_ID, datacluster, dc centroid26,
SC, moments, mfcc, SEC1, SEC2, SCSFr, MomSCr, MomSFr, MFCCSCr, MFCCSFr,
REC1, REC2, PEC1, PEC2, RhythmPitchr, Maxpeak, RMSEC1, RMSEC2, SCRMSr,
SFRMSr, MomRMSr, MFCCRMSr, SCPeakr, SFPeakr, MomPeak, MFCCPeakr
13. System design and Implementation – KM
K-means Data Clustering: an algorithm that classifies data sets into k
different groups based on their features (attributes). The choice of k
is up to the developer and is arrived at by trial and error.
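A minimal k-means sketch, assuming nothing about the project's actual implementation: assign each feature vector to its nearest centroid, recompute the centroids, and repeat. A simple deterministic initialization is used here for reproducibility; practical k-means usually seeds centroids randomly or with k-means++.

```python
import numpy as np

def kmeans(data, k, iters=20):
    """Plain k-means: alternate nearest-centroid assignment and
    centroid recomputation for a fixed number of iterations."""
    # Deterministic initialization: k points spread across the data.
    idx = np.linspace(0, len(data) - 1, k, dtype=int)
    centroids = data[idx].copy()
    for _ in range(iters):
        # Distance of every point to every centroid, shape (n, k).
        d = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):       # skip empty clusters
                centroids[j] = data[labels == j].mean(axis=0)
    return labels, centroids

# Two well-separated blobs of 2-D feature vectors should land in
# two distinct clusters.
rng = np.random.default_rng(1)
a = rng.normal(loc=[0, 0], scale=0.1, size=(20, 2))
b = rng.normal(loc=[5, 5], scale=0.1, size=(20, 2))
labels, centroids = kmeans(np.vstack([a, b]), k=2)
```

The trial-and-error choice of k mentioned above amounts to rerunning this procedure for several k values and inspecting how well the clusters separate the sounds.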
14. System design and Implementation
n-Nearest Neighbors algorithm: compares the incoming sound query's
feature vector with the feature vectors in selected clusters of the
feature space and points to those with the n smallest Euclidean
distances. After a density check, it produces a result pointing to the
sound model that most closely resembles the query sound.
Euclidean distance: d(p, q) = sqrt(Σ_i (p_i − q_i)²)
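The nearest-neighbor step can be sketched as follows (illustrative names, not the project's code): compute the Euclidean distance from the query vector to each candidate feature vector and return the indices of the n closest.

```python
import numpy as np

def euclidean(p, q):
    """d(p, q) = sqrt(sum_i (p_i - q_i)^2)"""
    return np.sqrt(np.sum((np.asarray(p) - np.asarray(q)) ** 2))

def n_nearest(query, vectors, n=3):
    """Indices of the n feature vectors closest to the query,
    ordered by increasing Euclidean distance."""
    dists = [euclidean(query, v) for v in vectors]
    return np.argsort(dists)[:n].tolist()

# Toy 2-D feature vectors; the query sits nearest vectors 0 and 2.
feats = [[0.1, 0.2], [0.9, 0.8], [0.15, 0.25], [0.5, 0.5]]
result = n_nearest([0.12, 0.22], feats, n=2)
```

In the full system this search runs only within the clusters nearest the query, and the densest sub-cluster among the n hits determines the returned sound model.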
15. Results – Experiment 1
[Figure: Experiment 1 (training set and query) – plot of Cluster 10.
The legend labels points by file number; files 1, 9, 10, 12, 16, 22,
23, 27, 28, 31, 33, 40, 45, 52, 59, 90, 98, 99, 100 and 101 appear.]
16. Results
Inference 1: Each cluster contains sounds from different sound models,
owing to similarities in spectral shape and temporal changes, and to
proximity in parametric space. (Expt 1)
Inference 2: If the parametric differences between sounds cross a
threshold, the sounds, despite being generated by the same sound
model, may occupy multiple clusters. (Expt 1)
Inference 3: The resultant sound models for a sound may be selected
based on the influence of one of the features. (Expt 2)
17. Results – Inference 1
NoiseTickerFrequency0.75
Cluster 17 - Sound files 13, 41 – sounds
VIDrillDrill Speed0.25 from “BasicFM”, “NoiseTicker” models
Cluster 10
Experiment 1:
Training Set Query
20. Results – Inference 3
[Figure, four panels:
1. Risset Fixed model, type "Infinite" – Cluster 9
2. Risset Beats model, type "Infinite" – similar – Cluster 9
3. Drips model, type "Infinite" – less similar – Cluster 9
4. Vi Drill model sounds, type "Infinite" – dissimilar – Cluster 2]
21. Results – Inference 3
Q2a
Q2b
Q2c
Queries
Q2a’s result – NoiseComb model Q2b, Q2c’s result – BasicFM model
22. Conclusion
Use of sound models will become increasingly common in games and
digital media.
Querying sound-model databases has great potential to assist film
makers, game producers and media enthusiasts in accessing a vast DB of
sound models, and hence sounds.
This automation tool could well be the platform for future developers
in the music field to tap into a vast collection of sounds (generated
by sound models).
The time and experience needed to find sound models is far less than
that needed to develop them.
Editor's Notes
Recorded sound: duration 0:09 -> 1.74 MB. Sound model: class of sounds = 25 KB.
Hysteresis: specific to one-to-many. One-to-many, many-to-one: relationships between sounds and sound models.
Define some features – SC, MFCC, moments, SF, cross-correlation – dynamics of sound: features changing over time.
Risset Beats in different clusters – Cluster 3, Cluster 4, Cluster 6, Cluster 7, Cluster 9 and Cluster 10.
1, 2 – same cluster because of same spectral shapes. 3 – same cluster, "almost" similar shape but more event disturbances. 4 – different cluster because of a different attack point.
Q2a – "rain thumping hard on the ground" – perceptual similarities dominate. Q2b, Q2c – "pressure cooker" – spectral similarities.