Music Data Mining
Name :- Kaushik Samant
Roll no :- 05
Course :- Msc(I.T)Part 1
The overall goal of the data mining process is to extract
information from a data set and transform it into an
understandable structure for further use.
Data mining uses information from past data to analyze the
outcome of a particular problem or situation that may arise. Data
mining works to analyze data stored in data warehouses that are
used to store that data that is being analyzed.
The actual data mining task is the automatic or semi-automatic
analysis of large quantities of data to extract previously unknown
interesting patterns such as groups of data records unusual records
MUSIC DATA MINING
Music plays an important role in the everyday life for many people,
and with the digitalization, large music data collections are formed
and tend to be accumulated further by music enthusiasts.
This has lead to music collections -not on the shelf in form of audio
or video records and CD’s - but on the hard drive and on the
internet, to grow beyond what previously was physical possible.
It has become impossible for humans to keep track of music and the
relations between songs and pieces, and this fact naturally calls for
data mining and machine learning techniques to assist in the
navigation within the music world .
The objective of this presentation is to give you overview of some
music data mining tasks and navigate various data mining
approaches with respect to such tasks.
In case of a music data collection, the data sources can be many
things such as, it can be the music itself, or metadata such as the
title of the songs, or some acoustic features underlying audio
pieces,or even statistics of how many people have listened to a
On What Kind of Tasks?
Music data mining involves methods for various tasks, e.g., genre
classification,artist/singer identification, mode/motion , detection
of instrument recognition,music similarity search, music
summarization and visualization, and so forth.
Different music data mining tasks focus on different data
sources, and try to explore different aspects of data sources.
On What Kind of Data?
Music Similarity Search
In the field of music data mining, similarity search refers to
searching for music sound files similar to a given music sound file.
In principle, searching can be carried out on any dimension.
The similarity search processes can be divided into feature
extraction and query the main task is to employ a proper
similarity measure to calculate the similarity between the given
music work and the candidate music work system processing.
The objective of similarity search is to find music sound files
similar to a music sound file given as input.
Music classification based on genre and style is naturally the form
of a hierarchy.
Similarity can be used to group sounds together at any node
in the hierarchies. The use of sound signals for similarity is
justified by an observation that audio signals (digital or
analog) of music belonging to the same genre share certain
characteristic ,because they are composed of similar types of
instruments, having similar rhythmic patterns, and similar
Clustering is a technique where data are divided into different
Cluster are small different sets of data which have a same
behavior within in a group.
Clustering is the task of separating a collection of data into
different groups based on some criteria.
Specifically in music data mining, clustering aims at dividing a
collection of music data into groups of similar objects based on
their pairwise similarities without predefined class labels.
It has been shown that the characteristics of a singer’s voice can be extracted
from music via vocal segment detection followed by solo vocal signal modeling
which propose a clustering algorithm that integrates features from both lyrics
and acoustic data sources to perform bimodal learning
Music Sequence Mining
Sequence mining is the task to find patterns which are
presented in a certain number of data instances. The
instances consist of sequences of elements.
The detected patterns are expressed in terms of sub-
sequences of the data sequences and impose an order, i.e.,
the order of the elements of the pattern should be respected
in all instances where it appears.
The pattern is considered to be frequent if it appears in a
number of instances above a given threshold value ,usually
defined by the user.
These patterns may represent valuable information.
For ex– maximum number of passionate fans.