SlideShare a Scribd company logo
MUSIC GENRE
DETECTION
Literature Survey
WAV File format
■ Waveform Audio File Format (WAVE, or more commonly known asWAV due to
its filename extension) is a Microsoft and IBM audio file format standard for
storing an audio bitstream on PCs.
■ It is the main format used onWindows systems for raw and typically
uncompressed audio.The usual bitstream encoding is the linear pulse-code
modulation (LPCM) format.
Mel Frequency Cepstral Coefficient (MFCC)
■ In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short-
term power spectrum of a sound
■ Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an
MFC
■ MFCCs are commonly derived as follows:
– Take the Fourier transform of (a windowed excerpt of) a signal.
– Map the powers of the spectrum obtained above onto the mel scale, using triangular
overlapping windows.
– Take the logs of the powers at each of the mel frequencies.
– Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
– The MFCCs are the amplitudes of the resulting spectrum.
Audio genres
■ BLUES:
– The blues form is a cyclic musical form in which a repeating progression of chords
mirrors the call and response scheme commonly found in African and African-
American music.Typical instruments are Guitar, bass guitar, piano, harmonica,
upright bass, drums, saxophone, vocals, trumpet, cornet, trombone
■ CLASSICAL:
– The Classical era stringed instruments were the four instruments which form the
string section of the orchestra: the violin, viola, cello and contrabass.Woodwinds
included the basset clarinet, basset horn, the Classical clarinet, the chalumeau, the
flute, oboe and bassoon. Keyboard instruments included the clavichord and the
fortepiano
■ JAZZ:
– It is music that includes qualities such as swing, improvising, group interaction,
developing an 'individual voice', and being open to different musical possibilities.
Typical instruments are Double bass, drums, guitar (typically electric guitar), piano,
saxophone, trumpet, clarinet, trombone, vocals, vibraphone, Hammond organ and
harmonica. In jazz fusion of the 1970s, electric bass, electric piano and synthesizer
were common.
■ ROCK:
– Rock music is a genre of popular music that originated as "rock and roll" in the
United States in the 1950s, and developed into a range of different styles in the
1960s and later, particularly in the United Kingdom and the United States.Typical
instruments are Electric guitar, bass guitar, acoustic guitar, drums, piano,
synthesizer, keyboards
Collection of dataset
■ We have considered four genres for this project
– Jazz
– Rock
– Classical
– blues
HIDDEN MARKOV MODEL (HMM)
■ A hidden Markov model (HMM) is a statistical Markov model in which the system
being modeled is assumed to be a Markov process with unobserved (hidden) states.A
HMM can be presented as the simplest dynamic Bayesian network.
■ In a hidden Markov model, the state is not directly visible, but output, dependent on
the state, is visible. Each state has a probability distribution over the possible output
tokens.
■ The sequence of tokens generated by an HMM gives some information about the
sequence of states.
■ The adjective 'hidden' refers to the state sequence through which the model passes,
not to the parameters of the model; the model is still referred to as a 'hidden' Markov
model even if these parameters are known exactly.
HMM
■ if the n'th observation in a chain of observations is influenced by a corresponding
latent (i.e. hidden) variable ie.., If the latent variables are discrete and form a Markov
chain, then it is a hidden Markov model (HMM)
PROBLEM STATEMENT
■ Music can be classified into genres based on the style it follows
■ The classification of music is an important part as a person’s taste in music is limited to
a few genres, personalized to them
■ Due to the boom in information, huge amount of music data can be found without
proper description or classification
■ This project aims at categorizing music samples into four genres – jazz, blues, rock,
classic
Steps involved
■ Collection of dataset
■ Preprocessing of dataset
■ Training of classifier
■ Classification of music samples
Collection of dataset
■ Each of these genres have distinctive characteristics
■ Music samples for each of these genres have been collected
■ Each genre has 100 music samples
■ This number of samples have been considered for a better efficiency of the classifier
■ The ratio considered for training and testing is 3:1 ie.., 75% samples will be used for
training and 25%, for testing
■ Out of 400 samples, 300 samples will be used for training and 100 samples will be used
for testing
DESIGN Unprocessed
audio files
Processed audio
files
Audio PreprocessorFeature Extractor
Feature files
Classifier
(HMM)
Output
IMPLEMENTATION
■ PRE-PROCESSING STAGE
– This module automates the audio conversion process.This module scans the current
directory to find all the mp3 files.Then, each file is loaded one by one. An audio clip
may be one-channeled or two channeled.The final step in this automated process is
to decompress the audio and store it in wave format, which is a lossless compression
format. After the process is complete, the current directory contains all wave files.
This process has to be repeated for every genre tag being considered.
start
get list of mp3 files in pwd
for every item in the list do :
load file
set # of channels to 1
export as wave
end
■ FEATURE EXTRACTION STAGE
– The python_speech_features library provides common speech features for ASR
including MFCCs and filterbank energies.
■ features.mfcc() - Mel Frequency Cepstral Coefficients
■ features.logfbank() - Log Filterbank Energies
■ CLASSIFICATION STAGE
– Classifier used is Hidden markov model
– Four HMMs used to classify into 4 genres
– Maximum probability predicts genre of music
TESTINGAND SCREEN SHOTS
■ TESTING STAGE
– Input sample gives probability for each genre when it is tested with HMM classifier
– 30% of dataset is used for testing classifier
– Confusion matrix is as given
RESULTSAND DISCUSSION
■ Training was done with two sizes of datasets-
– Containing 400 audio files
– Containing 100 audio files
■ For bigger dataset, higher efficiency was achieved
■ For smaller dataset, lesser efficiency was achieved.
■ The genres of blues and rock have similar elements, and therefore some of the
samples were classified into the other category, and hence the overall efficiency of
smaller datasets were reduced
CONCLUSION
■ The Hidden markov model approach for music genre classification has an efficiency of
around 72% when the dataset size considered for training is huge.
FUTUREWORK
■ GUI for the application
■ Input file from user
■ Include more classes
■ Display classification and find if audio sample belong to more than one genre
REFERENCES
[1] Molau, Pitz, Schlueter, Ney :Computing Mel- Frequency Cepstral Coefficients
on the Power Spectrum, RWTH Aachen – University ofTechnology, 52056
Aachen,Germany. IEEETransactions on Communications 26 (6)
[2] Frigo, M.; Johnson, S. G. (2005). "The Design and Implementation of FFTW3"
(PDF). Proceedings of the IEEE 93: 216–231. 10.1109/jproc.2004.840301
[3] Narasimha, M.; Peterson, A. (June 1978). "On the Computation of the
Discrete CosineTransform". IEEETransactions on Communications 26 (6): 934–
936. 10.1109/TCOM.1978.1094144
[4] S. B. Kotsiantis : Supervised Machine Learning:A Review of Classification
Techniques, 2007, University of Peloponnesse,Greece
[5] DavidA. Freedman (2009). Statistical Models:Theory and Practice.
Cambridge University Press. p. 128.
[6] Musical genre classification of audio signals, Speech and Audio Processing,
IEEETransactions on (Volume:10 , Issue: 5 ), 293-302, November 2002,
http://dspace.library.uvic.ca:8080/bitstream/handle/1828/1344/tsap02gtzan.pdf
■ WEB LINKSAND OTHER RESOURCES
[7] http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/
[8] en.wikipedia.org/wiki/Hidden_Markov_Model
[9] en.wikipedia.org/wiki/Mel_Frequency_Ceptral
[10] en.wikipedia.org/wiki/Window_function
[11] http://practicalcryptography.com/miscellaneous/machine-learning/guide-
mel-frequency-cepstral-coefficients-mfccs/
THANKYOU

More Related Content

What's hot

Cnn
CnnCnn
Grid computing , Utility computing الحوسبة الشبكية و الخدمية
Grid computing , Utility computing  الحوسبة الشبكية و الخدميةGrid computing , Utility computing  الحوسبة الشبكية و الخدمية
Grid computing , Utility computing الحوسبة الشبكية و الخدمية
Sally Jarkas
 
Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...
Universitat Politècnica de Catalunya
 
Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?
Ian McCarthy
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
남주 김
 
Federated learning in brief
Federated learning in briefFederated learning in brief
Federated learning in brief
Shashi Perera
 
AI and Deep Learning
AI and Deep LearningAI and Deep Learning
AI and Deep Learning
Manoj Kumar
 
Moving object detection
Moving object detectionMoving object detection
Moving object detection
Raviraj singh shekhawat
 
Digital image processing
Digital image processingDigital image processing
Digital image processingAvisek Roy
 
Intro to Object Detection with SSD
Intro to Object Detection with SSDIntro to Object Detection with SSD
Intro to Object Detection with SSD
Thomas Delteil
 
Mathematics Foundation Course for Machine Learning & AI By Eduonix
Mathematics Foundation Course for Machine Learning & AI By Eduonix Mathematics Foundation Course for Machine Learning & AI By Eduonix
Mathematics Foundation Course for Machine Learning & AI By Eduonix
Nick Trott
 
Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)
Manohar Mukku
 
Anomaly Detection using Deep Auto-Encoders
Anomaly Detection using Deep Auto-EncodersAnomaly Detection using Deep Auto-Encoders
Anomaly Detection using Deep Auto-Encoders
Gianmario Spacagna
 
Autoencoder
AutoencoderAutoencoder
Autoencoder
HARISH R
 
Face Mask Detection PPT.pptx
Face Mask Detection PPT.pptxFace Mask Detection PPT.pptx
Face Mask Detection PPT.pptx
Srikar Dasharadhi
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative Models
MLReview
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
Oswald Campesato
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
Ankit Gupta
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
Brodmann17
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016
Grigory Sapunov
 

What's hot (20)

Cnn
CnnCnn
Cnn
 
Grid computing , Utility computing الحوسبة الشبكية و الخدمية
Grid computing , Utility computing  الحوسبة الشبكية و الخدميةGrid computing , Utility computing  الحوسبة الشبكية و الخدمية
Grid computing , Utility computing الحوسبة الشبكية و الخدمية
 
Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...Deep Learning for Computer Vision: Generative models and adversarial training...
Deep Learning for Computer Vision: Generative models and adversarial training...
 
Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?Deepfakes: Trick or Treat?
Deepfakes: Trick or Treat?
 
Generative adversarial networks
Generative adversarial networksGenerative adversarial networks
Generative adversarial networks
 
Federated learning in brief
Federated learning in briefFederated learning in brief
Federated learning in brief
 
AI and Deep Learning
AI and Deep LearningAI and Deep Learning
AI and Deep Learning
 
Moving object detection
Moving object detectionMoving object detection
Moving object detection
 
Digital image processing
Digital image processingDigital image processing
Digital image processing
 
Intro to Object Detection with SSD
Intro to Object Detection with SSDIntro to Object Detection with SSD
Intro to Object Detection with SSD
 
Mathematics Foundation Course for Machine Learning & AI By Eduonix
Mathematics Foundation Course for Machine Learning & AI By Eduonix Mathematics Foundation Course for Machine Learning & AI By Eduonix
Mathematics Foundation Course for Machine Learning & AI By Eduonix
 
Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)
 
Anomaly Detection using Deep Auto-Encoders
Anomaly Detection using Deep Auto-EncodersAnomaly Detection using Deep Auto-Encoders
Anomaly Detection using Deep Auto-Encoders
 
Autoencoder
AutoencoderAutoencoder
Autoencoder
 
Face Mask Detection PPT.pptx
Face Mask Detection PPT.pptxFace Mask Detection PPT.pptx
Face Mask Detection PPT.pptx
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative Models
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Intro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning PresentationIntro/Overview on Machine Learning Presentation
Intro/Overview on Machine Learning Presentation
 
Introduction to object detection
Introduction to object detectionIntroduction to object detection
Introduction to object detection
 
Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016Deep Learning and the state of AI / 2016
Deep Learning and the state of AI / 2016
 

Similar to Music genre detection using hidden markov models

Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Daichi Kitamura
 
Music similarity: what for?
Music similarity: what for?Music similarity: what for?
Music similarity: what for?
Emilia Gómez
 
Teaching Computers to Listen to Music
Teaching Computers to Listen to MusicTeaching Computers to Listen to Music
Teaching Computers to Listen to Music
Eric Battenberg
 
Analysis Synthesis Comparison
Analysis Synthesis ComparisonAnalysis Synthesis Comparison
Analysis Synthesis Comparison
Jim Webb
 
AMT overview
AMT overviewAMT overview
AMT overview
WarNik Chow
 
SYLLABUS (1).pptx
SYLLABUS (1).pptxSYLLABUS (1).pptx
SYLLABUS (1).pptx
ShafeeqMuringath2
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016
SaruwatariLabUTokyo
 
The synthesizer for A2 music tech students
The synthesizer for A2 music tech studentsThe synthesizer for A2 music tech students
The synthesizer for A2 music tech students
music_hayes
 
Mit21 m 380s12_complecnot
Mit21 m 380s12_complecnotMit21 m 380s12_complecnot
Mit21 m 380s12_complecnot
VenkateshKumar708402
 
MLConf2013: Teaching Computer to Listen to Music
MLConf2013: Teaching Computer to Listen to MusicMLConf2013: Teaching Computer to Listen to Music
MLConf2013: Teaching Computer to Listen to Music
Eric Battenberg
 
Ml conf2013 teaching_computers_share
Ml conf2013 teaching_computers_shareMl conf2013 teaching_computers_share
Ml conf2013 teaching_computers_share
MLconf
 
Human Perception and Recognition of Musical Instruments: A Review
Human Perception and Recognition of Musical Instruments: A ReviewHuman Perception and Recognition of Musical Instruments: A Review
Human Perception and Recognition of Musical Instruments: A Review
Editor IJCATR
 
SoundSense
SoundSenseSoundSense
SoundSensebutest
 
Mp3
Mp3Mp3
MIR
MIRMIR
Indexing and Retrieval of Audio
Indexing and Retrieval of AudioIndexing and Retrieval of Audio
Indexing and Retrieval of Audio
Rachmat Wahid Saleh Insani
 
Mastering
MasteringMastering
Query By humming - Music retrieval technology
Query By humming - Music retrieval technologyQuery By humming - Music retrieval technology
Query By humming - Music retrieval technology
Shital Kat
 
Recognition of music genres using deep learning.
Recognition of music genres using deep learning.Recognition of music genres using deep learning.
Recognition of music genres using deep learning.
IRJET Journal
 

Similar to Music genre detection using hidden markov models (20)

Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
 
Music similarity: what for?
Music similarity: what for?Music similarity: what for?
Music similarity: what for?
 
Teaching Computers to Listen to Music
Teaching Computers to Listen to MusicTeaching Computers to Listen to Music
Teaching Computers to Listen to Music
 
Analysis Synthesis Comparison
Analysis Synthesis ComparisonAnalysis Synthesis Comparison
Analysis Synthesis Comparison
 
AMT overview
AMT overviewAMT overview
AMT overview
 
Asr
AsrAsr
Asr
 
SYLLABUS (1).pptx
SYLLABUS (1).pptxSYLLABUS (1).pptx
SYLLABUS (1).pptx
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016
 
The synthesizer for A2 music tech students
The synthesizer for A2 music tech studentsThe synthesizer for A2 music tech students
The synthesizer for A2 music tech students
 
Mit21 m 380s12_complecnot
Mit21 m 380s12_complecnotMit21 m 380s12_complecnot
Mit21 m 380s12_complecnot
 
MLConf2013: Teaching Computer to Listen to Music
MLConf2013: Teaching Computer to Listen to MusicMLConf2013: Teaching Computer to Listen to Music
MLConf2013: Teaching Computer to Listen to Music
 
Ml conf2013 teaching_computers_share
Ml conf2013 teaching_computers_shareMl conf2013 teaching_computers_share
Ml conf2013 teaching_computers_share
 
Human Perception and Recognition of Musical Instruments: A Review
Human Perception and Recognition of Musical Instruments: A ReviewHuman Perception and Recognition of Musical Instruments: A Review
Human Perception and Recognition of Musical Instruments: A Review
 
SoundSense
SoundSenseSoundSense
SoundSense
 
Mp3
Mp3Mp3
Mp3
 
MIR
MIRMIR
MIR
 
Indexing and Retrieval of Audio
Indexing and Retrieval of AudioIndexing and Retrieval of Audio
Indexing and Retrieval of Audio
 
Mastering
MasteringMastering
Mastering
 
Query By humming - Music retrieval technology
Query By humming - Music retrieval technologyQuery By humming - Music retrieval technology
Query By humming - Music retrieval technology
 
Recognition of music genres using deep learning.
Recognition of music genres using deep learning.Recognition of music genres using deep learning.
Recognition of music genres using deep learning.
 

Recently uploaded

State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
jerlynmaetalle
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
mbawufebxi
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
slg6lamcq
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
AnirbanRoy608946
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
javier ramirez
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
GetInData
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
74nqk8xf
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
eddie19851
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
roli9797
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Enterprise Wired
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
g4dpvqap0
 

Recently uploaded (20)

State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
 
Influence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business PlanInfluence of Marketing Strategy and Market Competition on Business Plan
Influence of Marketing Strategy and Market Competition on Business Plan
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptxData_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
Data_and_Analytics_Essentials_Architect_an_Analytics_Platform.pptx
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
The Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series DatabaseThe Building Blocks of QuestDB, a Time Series Database
The Building Blocks of QuestDB, a Time Series Database
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
 
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdfEnhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
Nanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdfNanandann Nilekani's ppt On India's .pdf
Nanandann Nilekani's ppt On India's .pdf
 
Analysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performanceAnalysis insight about a Flyball dog competition team's performance
Analysis insight about a Flyball dog competition team's performance
 
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdfUnleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
一比一原版(爱大毕业证书)爱丁堡大学毕业证如何办理
 

Music genre detection using hidden markov models

  • 2. Literature Survey WAV File format ■ Waveform Audio File Format (WAVE, or more commonly known asWAV due to its filename extension) is a Microsoft and IBM audio file format standard for storing an audio bitstream on PCs. ■ It is the main format used onWindows systems for raw and typically uncompressed audio.The usual bitstream encoding is the linear pulse-code modulation (LPCM) format.
  • 3. Mel Frequency Cepstral Coefficient (MFCC) ■ In sound processing, the mel-frequency cepstrum (MFC) is a representation of the short- term power spectrum of a sound ■ Mel-frequency cepstral coefficients (MFCCs) are coefficients that collectively make up an MFC ■ MFCCs are commonly derived as follows: – Take the Fourier transform of (a windowed excerpt of) a signal. – Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows. – Take the logs of the powers at each of the mel frequencies. – Take the discrete cosine transform of the list of mel log powers, as if it were a signal. – The MFCCs are the amplitudes of the resulting spectrum.
  • 4. Audio genres ■ BLUES: – The blues form is a cyclic musical form in which a repeating progression of chords mirrors the call and response scheme commonly found in African and African- American music.Typical instruments are Guitar, bass guitar, piano, harmonica, upright bass, drums, saxophone, vocals, trumpet, cornet, trombone ■ CLASSICAL: – The Classical era stringed instruments were the four instruments which form the string section of the orchestra: the violin, viola, cello and contrabass.Woodwinds included the basset clarinet, basset horn, the Classical clarinet, the chalumeau, the flute, oboe and bassoon. Keyboard instruments included the clavichord and the fortepiano
  • 5. ■ JAZZ: – It is music that includes qualities such as swing, improvising, group interaction, developing an 'individual voice', and being open to different musical possibilities. Typical instruments are Double bass, drums, guitar (typically electric guitar), piano, saxophone, trumpet, clarinet, trombone, vocals, vibraphone, Hammond organ and harmonica. In jazz fusion of the 1970s, electric bass, electric piano and synthesizer were common. ■ ROCK: – Rock music is a genre of popular music that originated as "rock and roll" in the United States in the 1950s, and developed into a range of different styles in the 1960s and later, particularly in the United Kingdom and the United States.Typical instruments are Electric guitar, bass guitar, acoustic guitar, drums, piano, synthesizer, keyboards
  • 6. Collection of dataset ■ We have considered four genres for this project – Jazz – Rock – Classical – blues
  • 7. HIDDEN MARKOV MODEL (HMM) ■ A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states.A HMM can be presented as the simplest dynamic Bayesian network. ■ In a hidden Markov model, the state is not directly visible, but output, dependent on the state, is visible. Each state has a probability distribution over the possible output tokens. ■ The sequence of tokens generated by an HMM gives some information about the sequence of states. ■ The adjective 'hidden' refers to the state sequence through which the model passes, not to the parameters of the model; the model is still referred to as a 'hidden' Markov model even if these parameters are known exactly.
  • 8. HMM ■ if the n'th observation in a chain of observations is influenced by a corresponding latent (i.e. hidden) variable ie.., If the latent variables are discrete and form a Markov chain, then it is a hidden Markov model (HMM)
  • 9. PROBLEM STATEMENT ■ Music can be classified into genres based on the style it follows ■ The classification of music is an important part as a person’s taste in music is limited to a few genres, personalized to them ■ Due to the boom in information, huge amount of music data can be found without proper description or classification ■ This project aims at categorizing music samples into four genres – jazz, blues, rock, classic
  • 10. Steps involved ■ Collection of dataset ■ Preprocessing of dataset ■ Training of classifier ■ Classification of music samples
  • 11. Collection of dataset ■ Each of these genres have distinctive characteristics ■ Music samples for each of these genres have been collected ■ Each genre has 100 music samples ■ This number of samples have been considered for a better efficiency of the classifier ■ The ratio considered for training and testing is 3:1 ie.., 75% samples will be used for training and 25%, for testing ■ Out of 400 samples, 300 samples will be used for training and 100 samples will be used for testing
  • 12. DESIGN Unprocessed audio files Processed audio files Audio PreprocessorFeature Extractor Feature files Classifier (HMM) Output
  • 13. IMPLEMENTATION ■ PRE-PROCESSING STAGE – This module automates the audio conversion process.This module scans the current directory to find all the mp3 files.Then, each file is loaded one by one. An audio clip may be one-channeled or two channeled.The final step in this automated process is to decompress the audio and store it in wave format, which is a lossless compression format. After the process is complete, the current directory contains all wave files. This process has to be repeated for every genre tag being considered. start get list of mp3 files in pwd for every item in the list do : load file set # of channels to 1 export as wave end
  • 14. ■ FEATURE EXTRACTION STAGE – The python_speech_features library provides common speech features for ASR including MFCCs and filterbank energies. ■ features.mfcc() - Mel Frequency Cepstral Coefficients ■ features.logfbank() - Log Filterbank Energies ■ CLASSIFICATION STAGE – Classifier used is Hidden markov model – Four HMMs used to classify into 4 genres – Maximum probability predicts genre of music
  • 15. TESTINGAND SCREEN SHOTS ■ TESTING STAGE – Input sample gives probability for each genre when it is tested with HMM classifier – 30% of dataset is used for testing classifier – Confusion matrix is as given
  • 16. RESULTSAND DISCUSSION ■ Training was done with two sizes of datasets- – Containing 400 audio files – Containing 100 audio files ■ For bigger dataset, higher efficiency was achieved ■ For smaller dataset, lesser efficiency was achieved. ■ The genres of blues and rock have similar elements, and therefore some of the samples were classified into the other category, and hence the overall efficiency of smaller datasets were reduced
  • 17. CONCLUSION ■ The Hidden markov model approach for music genre classification has an efficiency of around 72% when the dataset size considered for training is huge.
  • 18. FUTUREWORK ■ GUI for the application ■ Input file from user ■ Include more classes ■ Display classification and find if audio sample belong to more than one genre
  • 19. REFERENCES [1] Molau, Pitz, Schlueter, Ney :Computing Mel- Frequency Cepstral Coefficients on the Power Spectrum, RWTH Aachen – University ofTechnology, 52056 Aachen,Germany. IEEETransactions on Communications 26 (6) [2] Frigo, M.; Johnson, S. G. (2005). "The Design and Implementation of FFTW3" (PDF). Proceedings of the IEEE 93: 216–231. 10.1109/jproc.2004.840301 [3] Narasimha, M.; Peterson, A. (June 1978). "On the Computation of the Discrete CosineTransform". IEEETransactions on Communications 26 (6): 934– 936. 10.1109/TCOM.1978.1094144
  • 20. [4] S. B. Kotsiantis : Supervised Machine Learning:A Review of Classification Techniques, 2007, University of Peloponnesse,Greece [5] DavidA. Freedman (2009). Statistical Models:Theory and Practice. Cambridge University Press. p. 128. [6] Musical genre classification of audio signals, Speech and Audio Processing, IEEETransactions on (Volume:10 , Issue: 5 ), 293-302, November 2002, http://dspace.library.uvic.ca:8080/bitstream/handle/1828/1344/tsap02gtzan.pdf
  • 21. ■ WEB LINKSAND OTHER RESOURCES [7] http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/ [8] en.wikipedia.org/wiki/Hidden_Markov_Model [9] en.wikipedia.org/wiki/Mel_Frequency_Ceptral [10] en.wikipedia.org/wiki/Window_function [11] http://practicalcryptography.com/miscellaneous/machine-learning/guide- mel-frequency-cepstral-coefficients-mfccs/