SlideShare a Scribd company logo
1 of 23
Download to read offline
2
General Speaker Diarization - a “Who spoke when?” problem:
- Crucial for speech understanding in a multi-speaker environment
- Many practical applications
- Challenging, the general solution does not exist
3
Problem: Speaker Classification
Research goals:
- Provide overview of the related work in the scientific field of the
speaker classification / diarization problem
- Explore existing solution with regard to the research hypothesis
- Propose and evaluate original solution to the speaker classification
problem
Industry related goals:
- First step towards product level diarization system
4
Sound is multidimensional unstructured data.
Therefore modern neural networks models should
perform comparatively well for the classification task.
5
●
●
●
6
CNNs in sound and voice processing
●
●
●
7
●
●
●
●
●
8
- LibriSpeech ASR corpus - derived from
audiobooks, 251 speakers, 100 hours of speech.
- ICSI Meeting Speech corpus - derived from real
meetings, 53 speakers, 72 hours of speech.
9
Stage 1 Front-end:
● MFCC 16
● MFCC 32
● Mel-Spectrogram 256
● Mel-Spectrogram 512
● Spectrogram 128
● Spectrogram 256
● Spectrogram 512
10
Stage 1 Back-end:
● CNN with 6 layers
Stage 1 Samples:
● 0.5 sec
● 1 sec
● 2 sec
Back-end example for 128x128 input
11
Mel Frequency Cepstral Coefficients (MFCC) process.
12
MFCC with 16 coefficients for 2 seconds audio.
13
Mel sprectrogram with window size = 512 and cepstral coefficients =
128 for 1 sec audio.
14
Spectrogram of 1 sec audio fragment, window length = 256.
15
16
● Spectrograms perform better then MFCC
● Longer fragments perform better
● Inputs with equal dimensions perform better
17
18
19
● Accuracy of 0.94 gives strong promise for future
development
● Model is context independent
● Data preprocessing is more important than
classifier architecture
20
Extended Research
● LSTMs
● TDNN
● D-vector
Bridging the Diarization Gap
● Ensemble models
● End-2-End models
21
● In-depth overview of Speaker diarization
scientific field
● Performance comparison of different sound
representation with CNN classifier
● Proposed original approach for speaker
classification task
22
● Model was trained for each dataset
● Only activation sizes are depicted on Figures.
● The model can not be directly compared to other
models because they use different proprietary or
costly datasets
● Split ratio for test and train data was 0.2 and 0.8
23

More Related Content

Similar to Master defence 2020 - Borys Olshanetskyi -Context Independent Speaker Classification

MDX challenge 2021 town hall presentation
MDX challenge 2021 town hall presentationMDX challenge 2021 town hall presentation
MDX challenge 2021 town hall presentationChin Yun Yu
 
Finding the best solution for Image Processing
Finding the best solution for Image ProcessingFinding the best solution for Image Processing
Finding the best solution for Image ProcessingTech Triveni
 
Transformer-based SE.pptx
Transformer-based SE.pptxTransformer-based SE.pptx
Transformer-based SE.pptxssuser849b73
 
Translatotron review
Translatotron reviewTranslatotron review
Translatotron reviewJune-Woo Kim
 
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsHPCC Systems
 
Seminar on Parallel and Concurrent Programming
Seminar on Parallel and Concurrent ProgrammingSeminar on Parallel and Concurrent Programming
Seminar on Parallel and Concurrent ProgrammingStefan Marr
 
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...Intel® Software
 
Efficient video perception through AI
Efficient video perception through AIEfficient video perception through AI
Efficient video perception through AIQualcomm Research
 
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 Once-for-All: Train One Network and Specialize it for Efficient Deployment Once-for-All: Train One Network and Specialize it for Efficient Deployment
Once-for-All: Train One Network and Specialize it for Efficient Deploymenttaeseon ryu
 
State-of-the-art Image Processing across all domains
State-of-the-art Image Processing across all domainsState-of-the-art Image Processing across all domains
State-of-the-art Image Processing across all domainsKnoldus Inc.
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyQualcomm Research
 
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...Unai Lopez-Novoa
 

Similar to Master defence 2020 - Borys Olshanetskyi -Context Independent Speaker Classification (20)

MDX challenge 2021 town hall presentation
MDX challenge 2021 town hall presentationMDX challenge 2021 town hall presentation
MDX challenge 2021 town hall presentation
 
Finding the best solution for Image Processing
Finding the best solution for Image ProcessingFinding the best solution for Image Processing
Finding the best solution for Image Processing
 
CNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAsCNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAs
 
Mini Project- Audio Enhancement
Mini Project-  Audio EnhancementMini Project-  Audio Enhancement
Mini Project- Audio Enhancement
 
CNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAsCNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAs
 
Transformer-based SE.pptx
Transformer-based SE.pptxTransformer-based SE.pptx
Transformer-based SE.pptx
 
Translatotron review
Translatotron reviewTranslatotron review
Translatotron review
 
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC SystemsImproving Efficiency of Machine Learning Algorithms using HPCC Systems
Improving Efficiency of Machine Learning Algorithms using HPCC Systems
 
WaveNet
WaveNetWaveNet
WaveNet
 
Seminar on Parallel and Concurrent Programming
Seminar on Parallel and Concurrent ProgrammingSeminar on Parallel and Concurrent Programming
Seminar on Parallel and Concurrent Programming
 
Icon18revrec sudeshna
Icon18revrec sudeshnaIcon18revrec sudeshna
Icon18revrec sudeshna
 
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
Performance Optimization of Deep Learning Frameworks Caffe* and Tensorflow* f...
 
Efficient video perception through AI
Efficient video perception through AIEfficient video perception through AI
Efficient video perception through AI
 
CNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAsCNN Dataflow Implementation on FPGAs
CNN Dataflow Implementation on FPGAs
 
CNN Dataflow implementation on FPGAs
CNN Dataflow implementation on FPGAsCNN Dataflow implementation on FPGAs
CNN Dataflow implementation on FPGAs
 
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 Once-for-All: Train One Network and Specialize it for Efficient Deployment Once-for-All: Train One Network and Specialize it for Efficient Deployment
Once-for-All: Train One Network and Specialize it for Efficient Deployment
 
Manycores for the Masses
Manycores for the MassesManycores for the Masses
Manycores for the Masses
 
State-of-the-art Image Processing across all domains
State-of-the-art Image Processing across all domainsState-of-the-art Image Processing across all domains
State-of-the-art Image Processing across all domains
 
Intelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiencyIntelligence at scale through AI model efficiency
Intelligence at scale through AI model efficiency
 
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
Contributions to the Efficient Use of General Purpose Coprocessors: KDE as Ca...
 

More from Lviv Data Science Summer School

Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...Lviv Data Science Summer School
 
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...Lviv Data Science Summer School
 
Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...
Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...
Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...Lviv Data Science Summer School
 
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...Lviv Data Science Summer School
 
Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...
Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...
Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...Lviv Data Science Summer School
 
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
 Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata ItemsLviv Data Science Summer School
 
Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...
Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...
Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...Lviv Data Science Summer School
 
Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...
Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...
Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...Lviv Data Science Summer School
 
Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...
Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...
Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...Lviv Data Science Summer School
 
Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...
Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...
Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...Lviv Data Science Summer School
 
Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...
Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...
Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...Lviv Data Science Summer School
 
Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...
Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...
Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...Lviv Data Science Summer School
 
Master defence 2020 - Yevhen Pozdniakov - Changing Clothing on People Images...
Master defence 2020 - Yevhen Pozdniakov -  Changing Clothing on People Images...Master defence 2020 - Yevhen Pozdniakov -  Changing Clothing on People Images...
Master defence 2020 - Yevhen Pozdniakov - Changing Clothing on People Images...Lviv Data Science Summer School
 
Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar...
 Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar... Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar...
Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar...Lviv Data Science Summer School
 
Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...
Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...
Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...Lviv Data Science Summer School
 
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...Lviv Data Science Summer School
 
Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...
Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...
Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...Lviv Data Science Summer School
 
Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...
Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...
Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...Lviv Data Science Summer School
 
Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...
Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...
Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...Lviv Data Science Summer School
 
Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...
Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...
Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...Lviv Data Science Summer School
 

More from Lviv Data Science Summer School (20)

Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
 
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
Master defence 2020 - Andrew Kurochkin - Meme Generation for Social Media Aud...
 
Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...
Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...
Master defence 2020 - Nazariy Perepichka - Parameterizing of Human Speech Gen...
 
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
Master defence 2020 - Anastasiia Khaburska - Statistical and Neural Language ...
 
Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...
Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...
Master defence 2020 - Serhii Tiutiunnyk - Context-based Question-answering Sy...
 
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
 Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
Master defence 2020 - Kateryna Liubonko - Matching Red Links to Wikidata Items
 
Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...
Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...
Master defence 2020 - Dmytro Babenko - Determining Sentiment and Important Pr...
 
Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...
Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...
Master defence 2020 - Oleh Lukianykhin - Reinforcement Learning for Voltage C...
 
Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...
Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...
Master defence 2020 - Philipp Kofman - Efficient Generation of Complex Data D...
 
Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...
Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...
Master defence 2020 - Anastasiia Kasprova - Customer Lifetime Value for Retai...
 
Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...
Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...
Master defence 2020 - Dmitri Glusco - Replica Exchange For Multiple-Environme...
 
Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...
Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...
Master defence 2020 - Ivan Prodaiko - Person Re-identification in a Top-view ...
 
Master defence 2020 - Yevhen Pozdniakov - Changing Clothing on People Images...
Master defence 2020 - Yevhen Pozdniakov -  Changing Clothing on People Images...Master defence 2020 - Yevhen Pozdniakov -  Changing Clothing on People Images...
Master defence 2020 - Yevhen Pozdniakov - Changing Clothing on People Images...
 
Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar...
 Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar... Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar...
Master defence 2020 - Oleh Onyshchak - Image Recommendation for Wikipedia Ar...
 
Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...
Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...
Master defence 2020 - Oleh Misko - Ensembling and Transfer Learning for Multi...
 
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...Master defence 2020 -  Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
Master defence 2020 - Roman Riazantsev - 3D Reconstruction of Video Sign Lan...
 
Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...
Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...
Master defence 2020 - Vadym Korshunov - Region-Selected Image Generation with...
 
Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...
Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...
Master defence 2020 -Roman Moiseiev - Stock Market Prediction Utilizing Centr...
 
Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...
Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...
Master defence 2020 - Maksym Opirskyi -Topological Approach to Wikipedia Arti...
 
Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...
Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...
Master defence 2020 - Oleksandr Smyrnov - A Multifactorial Optimization of Pe...
 

Recently uploaded

Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)DHURKADEVIBASKAR
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxkessiyaTpeter
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxFarihaAbdulRasheed
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...lizamodels9
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real timeSatoshi NAKAHIRA
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentationtahreemzahra82
 
‏‏VIRUS - 123455555555555555555555555555555555555555
‏‏VIRUS -  123455555555555555555555555555555555555555‏‏VIRUS -  123455555555555555555555555555555555555555
‏‏VIRUS - 123455555555555555555555555555555555555555kikilily0909
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |aasikanpl
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxpriyankatabhane
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfSwapnil Therkar
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxmalonesandreagweneth
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10ROLANARIBATO3
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.aasikanpl
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫qfactory1
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxdharshini369nike
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsssuserddc89b
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |aasikanpl
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayZachary Labe
 

Recently uploaded (20)

Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)Recombinant DNA technology( Transgenic plant and animal)
Recombinant DNA technology( Transgenic plant and animal)
 
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptxSOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
SOLUBLE PATTERN RECOGNITION RECEPTORS.pptx
 
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptxRESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
RESPIRATORY ADAPTATIONS TO HYPOXIA IN HUMNAS.pptx
 
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
Best Call Girls In Sector 29 Gurgaon❤️8860477959 EscorTs Service In 24/7 Delh...
 
Grafana in space: Monitoring Japan's SLIM moon lander in real time
Grafana in space: Monitoring Japan's SLIM moon lander  in real timeGrafana in space: Monitoring Japan's SLIM moon lander  in real time
Grafana in space: Monitoring Japan's SLIM moon lander in real time
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Harmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms PresentationHarmful and Useful Microorganisms Presentation
Harmful and Useful Microorganisms Presentation
 
‏‏VIRUS - 123455555555555555555555555555555555555555
‏‏VIRUS -  123455555555555555555555555555555555555555‏‏VIRUS -  123455555555555555555555555555555555555555
‏‏VIRUS - 123455555555555555555555555555555555555555
 
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Mukherjee Nagar(Delhi) |
 
Speech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptxSpeech, hearing, noise, intelligibility.pptx
Speech, hearing, noise, intelligibility.pptx
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptxLIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
LIGHT-PHENOMENA-BY-CABUALDIONALDOPANOGANCADIENTE-CONDEZA (1).pptx
 
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Hauz Khas Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10Gas_Laws_powerpoint_notes.ppt for grade 10
Gas_Laws_powerpoint_notes.ppt for grade 10
 
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
Call Girls in Aiims Metro Delhi 💯Call Us 🔝9953322196🔝 💯Escort.
 
Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫Manassas R - Parkside Middle School 🌎🏫
Manassas R - Parkside Middle School 🌎🏫
 
TOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptxTOTAL CHOLESTEROL (lipid profile test).pptx
TOTAL CHOLESTEROL (lipid profile test).pptx
 
TOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physicsTOPIC 8 Temperature and Heat.pdf physics
TOPIC 8 Temperature and Heat.pdf physics
 
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
Call Us ≽ 9953322196 ≼ Call Girls In Lajpat Nagar (Delhi) |
 
Welcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work DayWelcome to GFDL for Take Your Child To Work Day
Welcome to GFDL for Take Your Child To Work Day
 

Master defence 2020 - Borys Olshanetskyi -Context Independent Speaker Classification

  • 1.
  • 2. 2
  • 3. General Speaker Diarization - a “Who spoke when?” problem: - Crucial for speech understanding in a multi-speaker environment - Many practical applications - Challenging, the general solution does not exist 3
  • 4. Problem: Speaker Classification Research goals: - Provide overview of the related work in the scientific field of the speaker classification / diarization problem - Explore existing solution with regard to the research hypothesis - Propose and evaluate original solution to the speaker classification problem Industry related goals: - First step towards product level diarization system 4
  • 5. Sound is multidimensional unstructured data. Therefore modern neural networks models should perform comparatively well for the classification task. 5
  • 7. CNNs in sound and voice processing ● ● ● 7
  • 9. - LibriSpeech ASR corpus - derived from audiobooks, 251 speakers, 100 hours of speech. - ICSI Meeting Speech corpus - derived from real meetings, 53 speakers, 72 hours of speech. 9
  • 10. Stage 1 Front-end: ● MFCC 16 ● MFCC 32 ● Mel-Spectrogram 256 ● Mel-Spectrogram 512 ● Spectrogram 128 ● Spectrogram 256 ● Spectrogram 512 10 Stage 1 Back-end: ● CNN with 6 layers Stage 1 Samples: ● 0.5 sec ● 1 sec ● 2 sec
  • 11. Back-end example for 128x128 input 11
  • 12. Mel Frequency Cepstral Coefficients (MFCC) process. 12
  • 13. MFCC with 16 coefficients for 2 seconds audio. 13
  • 14. Mel sprectrogram with window size = 512 and cepstral coefficients = 128 for 1 sec audio. 14
  • 15. Spectrogram of 1 sec audio fragment, window length = 256. 15
  • 16. 16
  • 17. ● Spectrograms perform better then MFCC ● Longer fragments perform better ● Inputs with equal dimensions perform better 17
  • 18. 18
  • 19. 19
  • 20. ● Accuracy of 0.94 gives strong promise for future development ● Model is context independent ● Data preprocessing is more important than classifier architecture 20
  • 21. Extended Research ● LSTMs ● TDNN ● D-vector Bridging the Diarization Gap ● Ensemble models ● End-2-End models 21
  • 22. ● In-depth overview of Speaker diarization scientific field ● Performance comparison of different sound representation with CNN classifier ● Proposed original approach for speaker classification task 22
  • 23. ● Model was trained for each dataset ● Only activation sizes are depicted on Figures. ● The model can not be directly compared to other models because they use different proprietary or costly datasets ● Split ratio for test and train data was 0.2 and 0.8 23