Music Matrix, in simple terms, processes songs, sorts them by genre, and displays the results in a visually appealing, easy-to-operate way.
It will contain an NxN matrix in which each cell holds a list of songs (possibly just one) that belong to a single combination of two or more genres.
The intermediate steps are Preprocessing, Feature Extraction and Classification.
Preprocessing converts the continuous analog song input into a discrete digital signal vector (or a byte array). It also covers noise removal.
Feature Extraction extracts features from the preprocessed signal.
Classification assigns genres based on the extracted features, using fuzzy logic to increase accuracy, and passes the results to the Music Matrix for display.
Music Matrix - A Fuzzy Automated Genre Classification
3. *Automated genre classification is the process
by which a musical piece is associated with a
genre, through machine learning and advanced
algorithms, so that users can search, browse, and
organize their music catalogues.
*In simple terms, your songs are sorted,
according to genre, without any intervention
or effort on your part.
6. *Each digital audio file has some features.
These are extracted for the purpose of genre
identification.
*These features can be classified into three
categories, namely, timbre, pitch and rhythm.
7. *Timbre – the quality that distinguishes
different types of sound production, such as
voices and string, wind, and percussion
instruments.
*Pitch – the perception-based quality that
allows ordering of sound on a frequency-
related scale.
*Rhythm – the timing of musical sounds and
silences on a human scale.
9. *Some formulae and procedures used to calculate
features
*Zero Crossings Rate
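// Count sign changes between consecutive samples.
int count = 0;
for (int samp = 0; samp < samples.length - 1; samp++)
{
    if (samples[samp] > 0.0 && samples[samp + 1] < 0.0)
        count++; // positive to negative
    else if (samples[samp] < 0.0 && samples[samp + 1] > 0.0)
        count++; // negative to positive
    else if (samples[samp] == 0.0 && samples[samp + 1] != 0.0)
        count++; // leaving an exact zero
}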
10. *Beat Sum
// Sum all bins of the beat histogram.
double sum = 0.0;
for (int i = 0; i < beat_histogram.length; i++)
    sum += beat_histogram[i];
// Return the sum as a one-element feature vector.
double[] result = new double[1];
result[0] = sum;
return result;
11. *Strongest Frequency Via Zero Crossings
// Two zero crossings make one cycle; cycles per window duration gives Hz.
result = (zero_crossings / 2.0) * (sampling_rate / (double) samples.length);
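The zero-crossing count and the strongest-frequency formula can be checked together on a synthetic tone. The following is a minimal sketch, not the project's actual code: the class name `ZeroCrossingFrequency` and the `sine` helper are hypothetical and exist only for illustration, and the counting rule is the same one shown on the Zero Crossings Rate slide.

```java
/** Sketch: estimate a signal's dominant frequency from its zero crossings. */
final class ZeroCrossingFrequency {

    /** Counts sign changes between consecutive samples (same rule as the slide). */
    static int countZeroCrossings(double[] samples) {
        int count = 0;
        for (int i = 0; i < samples.length - 1; i++) {
            double a = samples[i];
            double b = samples[i + 1];
            if ((a > 0.0 && b < 0.0) || (a < 0.0 && b > 0.0) || (a == 0.0 && b != 0.0)) {
                count++;
            }
        }
        return count;
    }

    /** Half the crossing count is the cycle count; dividing by the window duration gives Hz. */
    static double strongestFrequency(double[] samples, double samplingRate) {
        int zeroCrossings = countZeroCrossings(samples);
        return (zeroCrossings / 2.0) * (samplingRate / (double) samples.length);
    }

    /** Hypothetical helper: a pure sine tone used only to exercise the formula. */
    static double[] sine(double frequencyHz, double samplingRate, int length) {
        double[] samples = new double[length];
        for (int i = 0; i < length; i++) {
            // A small phase offset keeps samples away from exact zeros.
            samples[i] = Math.sin(2.0 * Math.PI * frequencyHz * i / samplingRate + 0.1);
        }
        return samples;
    }

    public static void main(String[] args) {
        double rate = 44100.0;
        double[] tone = sine(440.0, rate, 44100); // one second of a 440 Hz tone
        System.out.println(strongestFrequency(tone, rate));
    }
}
```

For a one-second 440 Hz sine sampled at 44.1 kHz, this sketch should report a value close to 440 Hz. On real music the estimate is much rougher, because noise and overtones add extra crossings.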
12. *The extracted features are then used to
identify the genre using one or more clustering
or classification algorithms.
*Many approaches are used for this,
both unsupervised and supervised.
13. *Unsupervised approaches have no
knowledge about genres. The classifier can
observe the position of the data in the feature
space, but does not know which genre
cluster the data belongs to.
*Unsupervised classifiers:
K-means, Agglomerative hierarchical clustering,
Self-organizing Map (SOM), Growing hierarchical Self-
organizing Map (GHSOM).
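As an illustration of the unsupervised idea, here is a minimal k-means sketch over small feature vectors. It is not taken from the project: the class name `KMeansSketch`, the fixed iteration count, and the choice of the first k points as initial centroids are all assumptions made to keep the example short.

```java
/** Minimal k-means sketch over small feature vectors (illustrative only). */
final class KMeansSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Returns the index of the centroid closest to the point. */
    static int nearest(double[] point, double[][] centroids) {
        int best = 0;
        for (int c = 1; c < centroids.length; c++) {
            if (squaredDistance(point, centroids[c]) < squaredDistance(point, centroids[best])) {
                best = c;
            }
        }
        return best;
    }

    /**
     * Runs a fixed number of assign/update rounds and returns the cluster
     * index assigned to each point in the last assignment step.
     * Assumptions: k <= points.length, and the first k points seed the centroids.
     */
    static int[] cluster(double[][] points, int k, int iterations) {
        int dims = points[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = points[c].clone();
        }
        int[] labels = new int[points.length];
        for (int round = 0; round < iterations; round++) {
            // Assignment step: attach every point to its nearest centroid.
            for (int p = 0; p < points.length; p++) {
                labels[p] = nearest(points[p], centroids);
            }
            // Update step: move each centroid to the mean of its points.
            double[][] sums = new double[k][dims];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[labels[p]]++;
                for (int d = 0; d < dims; d++) {
                    sums[labels[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // keep an empty cluster's centroid unchanged
                for (int d = 0; d < dims; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return labels;
    }
}
```

The returned indices are only cluster numbers. Nothing in the algorithm says which cluster corresponds to which genre, which is exactly the limitation described above.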
14. *In Supervised Approaches, the system is
first trained on manually labeled data.
Then, when new, unlabeled data arrives,
the trained system is used to classify it
into a known genre.
*Supervised classifiers:
K-nearest neighbor (KNN), Gaussian Mixture Model
(GMM), Linear Discriminant Analysis (LDA), Support Vector
Machines (SVMs), Artificial Neural Networks (ANNs).
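The supervised idea can be sketched with the simplest classifier in the list, k-nearest neighbor. This is a minimal illustration under stated assumptions, not the project's classifier: the class name `KnnSketch`, the squared Euclidean distance, and the tie-breaking rule (the label of the nearer neighbor wins) are choices made for this example.

```java
/** Minimal k-nearest-neighbor sketch: labeled feature vectors vote on a query. */
final class KnnSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /**
     * Returns the majority genre label among the k training vectors nearest
     * to the query. Ties go to the label of the nearer neighbor.
     */
    static String classify(double[][] train, String[] labels, double[] query, int k) {
        int n = train.length;
        int neighbours = Math.min(k, n);
        boolean[] used = new boolean[n];
        int[] chosen = new int[neighbours];

        // Select the nearest unused training vector, 'neighbours' times.
        for (int r = 0; r < neighbours; r++) {
            int best = -1;
            for (int i = 0; i < n; i++) {
                if (used[i]) continue;
                if (best < 0
                        || squaredDistance(train[i], query) < squaredDistance(train[best], query)) {
                    best = i;
                }
            }
            used[best] = true;
            chosen[r] = best;
        }

        // Majority vote among the selected neighbors.
        String winner = null;
        int winnerVotes = 0;
        for (int a = 0; a < neighbours; a++) {
            int votes = 0;
            for (int b = 0; b < neighbours; b++) {
                if (labels[chosen[a]].equals(labels[chosen[b]])) votes++;
            }
            if (votes > winnerVotes) {
                winnerVotes = votes;
                winner = labels[chosen[a]];
            }
        }
        return winner;
    }
}
```

Here the "training" step is simply storing the manually labeled vectors; a new song's feature vector is then assigned the genre that most of its nearest labeled neighbors carry.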
15. *A fuzzy inference system is implemented.
*It is a supervised classifier.
*Rules are manually created.
*The rules are then applied to two
feature sets, and the output is evaluated.
*Feature set 1 = (Zero Crossings, Beat Sum,
Strongest Frequency)
*Feature set 2 = (MFCC)
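A rule-based fuzzy classifier of this kind can be sketched as follows. The slides do not list the actual rules or membership functions, so everything specific here is an assumption made for illustration: the class name `FuzzyGenreSketch`, the "low"/"high" triangular membership functions, the three rules, and the premise that both features have been normalized to the range 0 to 1.

```java
/** Sketch of a tiny fuzzy inference step with hand-written (hypothetical) rules. */
final class FuzzyGenreSketch {

    /** Triangular membership function with feet at a and c and its peak at b. */
    static double triangle(double x, double a, double b, double c) {
        if (x <= a || x >= c) return 0.0;
        return x < b ? (x - a) / (b - a) : (c - x) / (c - b);
    }

    // Assumed linguistic terms for a feature normalized to [0, 1].
    static double low(double x)  { return triangle(x, -1.0, 0.0, 1.0); }
    static double high(double x) { return triangle(x, 0.0, 1.0, 2.0); }

    /**
     * Returns membership degrees for {classical, pop, rock}.
     * Hypothetical rules (AND is modeled as min):
     *   IF zero crossings LOW  AND beat sum LOW  THEN classical
     *   IF zero crossings LOW  AND beat sum HIGH THEN pop
     *   IF zero crossings HIGH AND beat sum HIGH THEN rock
     */
    static double[] infer(double zeroCrossings, double beatSum) {
        double classical = Math.min(low(zeroCrossings), low(beatSum));
        double pop = Math.min(low(zeroCrossings), high(beatSum));
        double rock = Math.min(high(zeroCrossings), high(beatSum));
        return new double[] {classical, pop, rock};
    }

    public static void main(String[] args) {
        double[] degrees = infer(0.2, 0.3);
        System.out.println("classical=" + degrees[0] + " pop=" + degrees[1] + " rock=" + degrees[2]);
    }
}
```

The point of the sketch is that the output is a degree of membership per genre rather than a single label, so one song can belong to several genres at once.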
17. *The “front-end” of my project.
*The Music Matrix is an NxN matrix where each
cell represents a list of songs that are
placed in one or more genres, in a fuzzy
manner.
*This system clearly demonstrates multi-label
songs.
18. *For example, choosing a cell in the following
matrix may play a list of songs
that are 60%-70% classical and 10%-15% pop.
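One way such a matrix could be populated from fuzzy memberships is sketched below. The slides do not specify how songs map to cells, so the layout is an assumption: the row is taken to be a song's strongest genre and the column its second-strongest genre, with single-genre songs placed on the diagonal. The class name `MusicMatrixSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: place songs into an NxN grid keyed by their two strongest genres. */
final class MusicMatrixSketch {

    /** Returns {row, column}: indices of the strongest and second-strongest genre. */
    static int[] cellFor(double[] memberships) {
        int first = 0;
        for (int g = 1; g < memberships.length; g++) {
            if (memberships[g] > memberships[first]) first = g;
        }
        int second = -1;
        for (int g = 0; g < memberships.length; g++) {
            if (g == first || memberships[g] <= 0.0) continue;
            if (second < 0 || memberships[g] > memberships[second]) second = g;
        }
        // Songs with only one non-zero genre land on the diagonal.
        return new int[] {first, second < 0 ? first : second};
    }

    /** Builds the grid; matrix.get(row).get(column) is the song list of that cell. */
    static List<List<List<String>>> build(String[] titles, double[][] memberships, int genreCount) {
        List<List<List<String>>> matrix = new ArrayList<>();
        for (int r = 0; r < genreCount; r++) {
            List<List<String>> row = new ArrayList<>();
            for (int c = 0; c < genreCount; c++) {
                row.add(new ArrayList<>());
            }
            matrix.add(row);
        }
        for (int s = 0; s < titles.length; s++) {
            int[] cell = cellFor(memberships[s]);
            matrix.get(cell[0]).get(cell[1]).add(titles[s]);
        }
        return matrix;
    }
}
```

With genres ordered as (classical, pop, rock), a song with memberships (0.65, 0.12, 0.0) would land in the classical row and pop column under this assumed layout, which matches the spirit of the example on the slide.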
19. *Huge size of genre (and sub-genre) list.
*Non-agreement on taxonomies – well-known
websites use very different genre sets: Allmusic
(http://www.allmusic.com, 531 genres), Amazon
(http://www.amazon.com, 719 genres), and Mp3
(http://www.mp3.com, 430 genres).
*Trivialization of music art.
*Classification Basis
20. *Fuzzy definition of genres
*Differences in human perception
*Scalability of any AMC system
22. *Automated Genre Classification is a non-trivial
task.
*Emotion and music-matching is subjective.
*The problems of genre taxonomy carry
over into Automated Genre Classification.
23. *Extraction of all features of an audio file is not only
unnecessary, but also counterproductive.
*Different combinations of extracted features and
classification algorithms yield different
results with different accuracy.
*A combination of low-level signal properties such as
zero-crossing rate, spectral centroid and skewness,
mean energy, etc. and perception-based features
such as MFCCs, beat histograms, etc. may be the
most appropriate set.
24. *Multi-label classification is the most
appropriate for the real world.
*A fuzzy classification algorithm must be used to
allow for multi-label songs.
*Many novelty functions have been created,
but, sadly, they return less accurate
results.
25. *Practices used for Automated Genre
Classification can also be used to find similar
songs. This may help with copyright and IPR
protection.
Ref: http://www.thatsongsoundslike.com/