Dynamic Music Emotion Recognition 
using State-Space Models 
Team UoA 
Konstantin Markov*, Tomoko Matsui** 
*The University of Aizu, Japan 
**Institute of Statistical Mathematics, Japan
Subtask focus 
 Subtask 1: Feature development 
 Static affect prediction 
 New features 
 Signal Processing challenge 
 Subtask 2: Dynamic estimation 
 Dynamic affect prediction 
 New modeling approaches 
 Machine learning challenge
Affect trajectory 
 Dynamic emotion recognition is the estimation of the 
affect trajectory over time. 
 Apply time series analysis tools: 
 Trajectory estimation is a time series 
filtering/smoothing task. 
 State-Space Models (SSM) are well suited to it.
State-Space Models 
 Also known as dynamic (temporal) systems: 
$x_t = f(x_{t-1}, A) + \upsilon_t, \quad \upsilon_t \sim \mathcal{N}(0, R)$
$y_t = g(x_t, B) + \nu_t, \quad \nu_t \sim \mathcal{N}(0, Q)$
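To make the two equations concrete, here is a minimal sketch that draws one trajectory from a scalar state-space model. The function name `simulate_ssm`, the linear choices for f and g, and all numeric values are illustrative assumptions, not taken from the slides; the parameters A and B are simply folded into the two callables.

```python
import random


def simulate_ssm(f, g, r_var, q_var, x0, steps, seed=0):
    """Draw one trajectory from x_t = f(x_{t-1}) + v_t, y_t = g(x_t) + n_t.

    Following the slide's notation, r_var (R) is the variance of the
    state noise and q_var (Q) is the variance of the observation noise.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    x = x0
    for _ in range(steps):
        # State transition with noise v_t ~ N(0, R).
        x = f(x) + rng.gauss(0.0, r_var ** 0.5)
        # Observation with noise n_t ~ N(0, Q).
        y = g(x) + rng.gauss(0.0, q_var ** 0.5)
        xs.append(x)
        ys.append(y)
    return xs, ys


# Hypothetical linear f and g, chosen only for illustration.
states, observations = simulate_ssm(
    f=lambda x: 0.9 * x,
    g=lambda x: 2.0 * x,
    r_var=0.01,
    q_var=0.04,
    x0=0.5,
    steps=50,
)
```

In the emotion recognition setting the hidden states would presumably play the role of the affect values and the observations the role of the audio features, but that mapping is an assumption of this sketch.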
Gaussian Filtering/Smoothing 
 Given a State-Space Model: 
 The task of filtering is to approximate 
$p(x_t \mid y_{1:t}), \quad t = 1, \dots, T$
 The task of smoothing is to approximate 
$p(x_t \mid y_{1:T}), \quad t = 1, \dots, T$
 When they are approximated by 
Gaussian distributions, the task is called 
Gaussian filtering/smoothing.
Gaussian Filtering 
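 It can be shown that (Deisenroth, 2011): 
$p(x_t \mid y_{1:t}) = \mathcal{N}(x_t \mid \mu_t^{x}, \Sigma_t^{x})$, where 
$\mu_t^{x} = \mu_t^{x_{\mathrm{pred}}} + \Sigma_t^{x,y} (\Sigma_t^{y})^{-1} (y_t - \mu_t^{y})$
$\Sigma_t^{x} = \Sigma_t^{x_{\mathrm{pred}}} - \Sigma_t^{x,y} (\Sigma_t^{y})^{-1} (\Sigma_t^{x,y})^{T}$
$p(x_t \mid y_{1:t-1}) = \mathcal{N}(x_t \mid \mu_t^{x_{\mathrm{pred}}}, \Sigma_t^{x_{\mathrm{pred}}})$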
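The measurement update of a Gaussian filter can be sketched for the scalar case as follows. The function name `gaussian_update` and its argument names are assumptions made for illustration; the arithmetic is the standard Gaussian conditioning step (predicted mean plus gain times innovation, predicted variance minus gain times cross-covariance).

```python
def gaussian_update(mu_pred, var_pred, mu_y, var_y, cov_xy, y):
    """Scalar Gaussian filter update.

    mu_pred, var_pred: mean and variance of p(x_t | y_{1:t-1})
    mu_y, var_y:       predicted mean and variance of the observation y_t
    cov_xy:            cross-covariance between x_t and y_t
    y:                 the observation actually received at time t
    Returns the mean and variance of p(x_t | y_{1:t}).
    """
    gain = cov_xy / var_y                  # Sigma^{x,y} (Sigma^y)^{-1}
    mu = mu_pred + gain * (y - mu_y)       # corrected mean
    var = var_pred - gain * cov_xy         # corrected variance
    return mu, var


# Illustrative numbers only.
mu, var = gaussian_update(mu_pred=0.0, var_pred=1.0,
                          mu_y=0.0, var_y=2.0, cov_xy=1.0, y=2.0)
```

Different Gaussian filters differ mainly in how they obtain `mu_y`, `var_y` and `cov_xy`; the update itself keeps this form.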
Kalman Filter (KF) 
 Linear state-space model: 
$x_t = F x_{t-1} + \upsilon_t$
$y_t = G x_t + \nu_t$
 Advantages: 
 Analytic approximation to $p(x_t \mid y_{1:t})$
 Fast 
 Disadvantages: 
 Linearity assumption
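A scalar version of the Kalman filter, together with the Rauch-Tung-Striebel (RTS) smoother that appears in the results, might look like the sketch below. It assumes one-dimensional states and observations and a prior N(mu0, var0) on the initial state; those choices and the function names are illustrative, not the presenters' implementation.

```python
def kalman_filter(ys, F, G, R, Q, mu0=0.0, var0=1.0):
    """Scalar Kalman filter for x_t = F x_{t-1} + v_t, y_t = G x_t + n_t.

    As on the slides, R is the state-noise variance and Q the
    observation-noise variance.
    """
    mus, variances, preds = [], [], []
    mu, var = mu0, var0
    for y in ys:
        # Prediction step: p(x_t | y_{1:t-1}).
        mu_pred = F * mu
        var_pred = F * var * F + R
        # Update step: condition on the new observation y_t.
        var_y = G * var_pred * G + Q
        cov_xy = var_pred * G
        gain = cov_xy / var_y
        mu = mu_pred + gain * (y - G * mu_pred)
        var = var_pred - gain * cov_xy
        mus.append(mu)
        variances.append(var)
        preds.append((mu_pred, var_pred))
    return mus, variances, preds


def rts_smoother(mus, variances, preds, F):
    """Backward RTS pass that turns filtered estimates into smoothed ones."""
    s_mus, s_vars = mus[:], variances[:]
    for t in range(len(mus) - 2, -1, -1):
        mu_pred_next, var_pred_next = preds[t + 1]
        J = variances[t] * F / var_pred_next
        s_mus[t] = mus[t] + J * (s_mus[t + 1] - mu_pred_next)
        s_vars[t] = variances[t] + J * (s_vars[t + 1] - var_pred_next) * J
    return s_mus, s_vars


# Illustrative run on made-up observations.
f_mus, f_vars, f_preds = kalman_filter([1.0, 1.2, 0.9], F=1.0, G=1.0, R=0.1, Q=1.0)
smoothed_mus, smoothed_vars = rts_smoother(f_mus, f_vars, f_preds, F=1.0)
```

The filter uses only observations up to time t, while the smoother revisits each estimate using the whole sequence, which is why its variances are never larger than the filtered ones.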
Gaussian Process SSM (GP-SSM) 
 Gaussian Processes based state-space model: 
$x_t = f(x_{t-1}) + \upsilon_t, \quad f(x) \sim \mathcal{GP}(0, K_f)$
$y_t = g(x_t) + \nu_t, \quad g(x) \sim \mathcal{GP}(0, K_g)$
 Advantages: 
 Non-linear, Non-parametric 
 Flexible. 
 Disadvantages: 
 No standard algorithms for training and inference. 
 Analytic moment matching approximation (Deisenroth, 2012) 
 Computationally expensive.
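As a rough illustration of the building block behind a GP-SSM, the sketch below runs plain Gaussian process regression on pairs (x_{t-1}, x_t) to predict the next state. It is not the moment matching filter cited above: the squared-exponential kernel, its fixed hyperparameters, the toy nonlinear map and all function names are assumptions made for the example.

```python
import math


def rbf(a, b, length=1.0, signal_var=1.0):
    """Squared-exponential kernel for scalar inputs (an assumed choice of K_f)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def gp_predict(x_train, y_train, x_new, noise_var=1e-2):
    """GP posterior mean and variance of the latent function at x_new."""
    n = len(x_train)
    K = [[rbf(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x_new, xi) for xi in x_train]
    alpha = solve(K, y_train)            # (K + noise I)^{-1} y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                 # (K + noise I)^{-1} k_*
    var = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var


# Hypothetical training pairs (x_{t-1}, x_t) from a made-up nonlinear map.
xs = [1.5]
for _ in range(15):
    xs.append(0.9 * xs[-1] - 0.2 * xs[-1] ** 3)
next_mean, next_var = gp_predict(xs[:-1], xs[1:], x_new=1.0)
```

Each prediction needs linear solves against the full kernel matrix, which hints at why the slides list computational cost as a drawback.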
Experiments 
 Feature extraction. 
 Marsyas tool. 
 mfcc – Mel-frequency cepstral coefficients. 
 spfe – zero crossings, spectral flux, spectral centroid, spectral rolloff. 
 scf – spectral crest factor. 
 baseline – Features used in the official baseline 
system. 
 Independent state and observation model learning. 
 Multivariate linear regression for the KF. 
 GP regression learning for the GP-SSM.
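Learning the state and observation models independently by linear regression can be sketched for the scalar case as below. The sketch assumes that annotated state sequences are available at training time, uses regression without an intercept, and takes the residual variance as the noise variance; the function names and data are hypothetical, and the multivariate case would replace the scalar ratios with matrix least squares.

```python
def fit_linear(inputs, targets):
    """Least-squares slope (no intercept) and residual variance
    for the model targets ~ slope * inputs."""
    slope = (sum(a * b for a, b in zip(inputs, targets))
             / sum(a * a for a in inputs))
    residuals = [b - slope * a for a, b in zip(inputs, targets)]
    noise_var = sum(e * e for e in residuals) / len(residuals)
    return slope, noise_var


def fit_kalman_parameters(states, observations):
    """Estimate F, G, R, Q of a scalar linear state-space model.

    The state model regresses x_t on x_{t-1}; the observation model
    regresses y_t on x_t. The two fits do not depend on each other.
    """
    F, R = fit_linear(states[:-1], states[1:])
    G, Q = fit_linear(states, observations)
    return F, G, R, Q


# Made-up, noise-free training sequence for illustration.
train_states = [1.0, 0.5, 0.25, 0.125]
train_observations = [2.0, 1.0, 0.5, 0.25]
F, G, R, Q = fit_kalman_parameters(train_states, train_observations)
```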
Experiments 
 Development data. 
 Training set - 600 clips. 
 Validation set – 144 clips. 
 Training set clustering. 
 Four clusters based on the clips' static arousal-valence (A-V) vectors. 
 Separate SSM trained for each 
cluster. 
 Maximum likelihood based model 
selection.
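One way to realise maximum likelihood model selection among per-cluster models is to score a clip's observation sequence under each model and keep the best one. The sketch below does this for scalar Kalman filter models using the prediction error decomposition of the log-likelihood; whether the presenters scored clips exactly this way is an assumption, and the model names and parameter values are made up.

```python
import math


def kalman_log_likelihood(ys, F, G, R, Q, mu0=0.0, var0=1.0):
    """Log-likelihood of observations ys under a scalar linear-Gaussian SSM."""
    mu, var, total = mu0, var0, 0.0
    for y in ys:
        # Predict the state and the observation.
        mu_pred = F * mu
        var_pred = F * var * F + R
        innovation = y - G * mu_pred
        var_y = G * var_pred * G + Q
        # Add log N(y_t | G * mu_pred, var_y).
        total += -0.5 * (math.log(2.0 * math.pi * var_y)
                         + innovation ** 2 / var_y)
        # Kalman update.
        gain = var_pred * G / var_y
        mu = mu_pred + gain * innovation
        var = var_pred - gain * G * var_pred
    return total


def select_model(ys, models):
    """Return the name of the highest-scoring model and all scores."""
    scores = {name: kalman_log_likelihood(ys, **params)
              for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Two hypothetical cluster models and a made-up observation sequence.
cluster_models = {
    "tight": dict(F=1.0, G=1.0, R=0.01, Q=0.01),
    "loose": dict(F=1.0, G=1.0, R=1.0, Q=1.0),
}
best, scores = select_model([0.0, 0.0, 0.0, 0.0], cluster_models)
```

The selected model would then be used to filter or smooth that clip.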
Results on Development Data 
Feature      Kalman filter        RTS smoother
             R         RMSE       R         RMSE
AROUSAL – SINGLE MODEL
mfcc         0.2062    0.2894     0.1070    0.3008
mfcc+spfe    0.2326    0.2378     0.0894    0.2291
mfcc+scf     0.1171    0.2288     0.1611    0.2188
baseline     0.2791    0.3631     0.1898    0.4027
AROUSAL – MULTIPLE MODELS
mfcc         0.1698    0.1384     0.0991    0.1284
mfcc+spfe    0.2022    0.1290     0.1246    0.1277
mfcc+scf     0.0059    0.1613     0.0253    0.1615
baseline     0.0212    0.2276     0.0236    0.2259
Results on Development Data 
Feature      Kalman filter        RTS smoother
             R         RMSE       R         RMSE
VALENCE – SINGLE MODEL
mfcc         0.0411    0.3131     0.0598    0.3542
mfcc+spfe    0.0304    0.3100     0.0725    0.3495
mfcc+scf     0.1545    0.3346     0.1401    0.3616
baseline     0.0753    0.1341     0.0779    0.1499
VALENCE – MULTIPLE MODELS
mfcc         -0.082    0.1847     -0.042    0.1915
mfcc+spfe    -0.054    0.1866     -0.068    0.1914
mfcc+scf     0.0149    0.1688     -0.008    0.1703
baseline     -0.080    0.2425     -0.058    0.2497
Results on Development Data 
Feature      GP-SSM filter        GP-SSM smoother
             R         RMSE       R         RMSE
AROUSAL – MULTIPLE MODELS
mfcc         0.0436    0.3088     0.0743    0.3207
spfe         0.0582    0.3046     0.0714    0.3486
baseline     -0.007    0.3025     0.0393    0.3444
VALENCE – MULTIPLE MODELS
mfcc         0.0217    0.2766     0.0313    0.3083
spfe         0.0283    0.3297     -0.003    0.3515
baseline     -0.011    0.3891     -0.020    0.4431
Results on Test Data 
Feature      Kalman filter        GP-SSM filter
             R         RMSE       R         RMSE
AROUSAL – MULTIPLE MODELS
mfcc         0.1914    0.0852     0.044     0.1089
spfe         0.0526    0.1986     0.1015    0.2066
baseline     -0.1520   0.3824     0.0301    0.2073
VALENCE – MULTIPLE MODELS
mfcc         -0.065    0.1590     -0.017    0.1096
spfe         -0.075    0.2679     -0.023    0.1920
baseline     -0.099    0.2325     -0.049    0.2267
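For reference, the two scores reported in the results tables could be computed per clip as in the sketch below. The slides do not define R; this sketch assumes it is the Pearson correlation coefficient between the predicted and annotated trajectories. The function names and sample values are illustrative.

```python
import math


def pearson_r(pred, truth):
    """Pearson correlation between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in truth))
    return cov / (norm_p * norm_t)


def rmse(pred, truth):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


# Made-up predicted and annotated arousal trajectories.
predicted = [0.10, 0.15, 0.25, 0.20]
annotated = [0.05, 0.20, 0.30, 0.25]
r_value = pearson_r(predicted, annotated)
rmse_value = rmse(predicted, annotated)
```

A higher R means the estimate follows the shape of the annotated trajectory more closely, while a lower RMSE means it is closer in absolute value; the two can disagree, as several rows of the tables show.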
Conclusions 
 Training data clustering. 
 Did not improve the Kalman Filter performance. 
 The only way GP-SSM could be trained. 
 KF vs. GP-SSM 
 Similar performance under similar conditions. 
 GP-SSM a bit better for Valence estimation. 
 Feature types. 
 Baseline features seem to perform well. 
 No definite winner.
The end 
Thank you for your attention! 
Questions?
