Cenk Demiroglu - Analysis of Prosodic Patterns in Conversational Speech in People with Alzheimer’s Disease
 

Presentation at the Workshop on Technology for Healthcare and Healthy Lifestyle 2011

Thursday 1st Dec 2011
Session IV

http://www.tsb.upv.es/wths2011

    Presentation Transcript

    • Analysis of Prosodic Patterns in Conversational Speech in People with Alzheimer’s Disease
      • Dr Cenk Demiroğlu
      • Assistant Professor at Özyeğin University, Istanbul, Turkey
      • Founder and CEO of NeoSes Inc., Istanbul, Turkey
    • Who are we?
      • Özyeğin University
        • Private university in Istanbul, Turkey
        • Has one of the best faculty profiles in the nation
      • OzU Speech Lab
        • Founded and directed by Cenk Demiroğlu
        • Experts in speech recognition, synthesis and verification technologies
        • Tightly connected to industry (including AT&T Research in the USA)
      • NeoSes is a spin-off company from the lab
        • Focused on speech technologies and machine learning theory
    • Why are we Interested in Biomedical Applications?
      • Health is one of the emerging areas in the speech field
      • We are interested in
        • Alzheimer’s diagnosis and monitoring
        • Depression diagnosis and monitoring
      • Increasing interest in the EU framework programmes
      • Increasing interest from the Turkish Ministry of Health
      • We see opportunities
        • For research
        • For commercialization
    • Our Core Strategy
      • Diagnosis can be semi-automated
        • Need to work with medical doctors and hospitals
        • The doctor’s diagnosis is still critical
        • Technology may help doctors
      • There is a large gap between doctor visits (may be months)
        • Not enough MDs
        • Too expensive to monitor very closely
      • Speech technology can help
        • Speech analysis over the phone
        • Cheap monitoring through automated call centers
    • Literature Review
      • Cognitive decline in the auditory part of the central nervous system has been observed in Alzheimer’s disease
      • Lexical problems are one of the earliest signs of the disease
        • Early detection
      • Prosodic parameters have been found to be relevant for identifying the disease
      • Research on analysing spontaneous, conversational speech is relatively rare
        • Semantic and syntactic analysis
    • Focus on Conversational Speech
      • Spontaneous speech is rich in information
        • Prosodic
        • Semantic
        • Syntactic
      • Literature is not multi-disciplinary (at least not enough)
        • A combination of expertise in speech signal processing, pattern recognition, and medical fields is required
        • Some of the most powerful pattern recognition and cluster analysis algorithms have not been investigated enough in the literature (graphical models, probabilistic factor analysis, etc.)
        • Some of the more advanced speech analysis tools have not been used (accurate glottal closure point analysis using STRAIGHT)
        • Results from the speech-recognition-based approach to lexical deterioration analysis are mostly missing
          • The idea is there but results are not. Difficult problem!
        • Leading labs in speech research are just beginning to take an interest in this problem
      • We have a brief summary of our preliminary investigation here
    • Data Collection Method
      • First attempt:
        • Set up an automated call center
        • Patients call every day
        • List of 20 questions
          • What time did you wake up today?
          • What did you eat at breakfast?
          • Did you do any exercise?
          • Etc.
        • Problems
          • Patients forget to call!
          • Patients are not able to understand and respond to the questions over the phone
    • Data Collection Method
      • Second attempt:
        • Send a graduate student to the nursing home
        • The student interviews the patient
        • Data is collected through a digital voice recorder
        • Collected data from 24 patients at phase 3 (late phase of the disease)
        • Need more data from more patients at different phases
        • Need to monitor over time
      • We have data from 400 healthy subjects
        • The age range is 30-50
        • Need a better age range for a fair comparison
    • Subjective Observations
      • We could not understand what some of the patients were saying
        • Slurred speech
        • Semantically and syntactically incorrect sentences
        • Missing and/or mispronounced phonemes
      • In some patients, answers to the questions were relevant, but the patient began to repeat himself/herself after a couple of minutes
      • If the answers are relevant, they are typically short
      • If the answers are irrelevant, they are mostly long
    • Prosodic Parameters and KL Distances
        Feature                                     KLD
        Silence/speech ratio                        7.9123
        Speech (max. distance to median), in sec    6.1105
        Speech (std. dev), in sec                   5.1559
        Number of silences per minute               2.7054
        Silence (std. dev)                          2.3291
        Speech (mean), in sec                       2.2849
        Pitch (max. distance to median)             1.9820
        Pitch (min. distance to median)             1.9359
      • KLD: Kullback-Leibler distance between distributions, a way to measure the discriminatory power of the features
    • Prosodic Parameters and KL Distances (continued)
        Feature                                     KLD
        Silence (mean)                              1.9013
        Silence (median)                            0.8603
        Silence (min. distance to median)           0.8603
        Silence (max. distance to median)           0.6320
        Speech (median), in sec                     0.4450
        Speech (min. distance to median), in sec    0.4450
        Pitch (std. dev)                            0.3260
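The slides do not say how the feature distributions were estimated or whether the distance was symmetrised. As a minimal sketch, assuming each feature is modelled by a univariate Gaussian per group and that the two directed divergences are summed to get a symmetric "distance", the per-feature score could be computed as follows. The function names and the sample values are hypothetical and are not data from the study.

```python
import math
from statistics import mean, pstdev


def gaussian_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Directed KL divergence KL(p || q) between two univariate Gaussians, in nats."""
    return (math.log(sigma_q / sigma_p)
            + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sigma_q ** 2)
            - 0.5)


def symmetric_kl(values_a, values_b):
    """Sum of both directed KL divergences between Gaussian fits to two samples.

    Assumption: a Gaussian fit per group; the slides do not state the estimator.
    """
    mu_a, sd_a = mean(values_a), pstdev(values_a)
    mu_b, sd_b = mean(values_b), pstdev(values_b)
    return (gaussian_kl(mu_a, sd_a, mu_b, sd_b)
            + gaussian_kl(mu_b, sd_b, mu_a, sd_a))


# Made-up silence/speech ratios for illustration only.
patients = [0.9, 1.1, 1.4, 0.8, 1.2]
controls = [0.3, 0.4, 0.35, 0.5, 0.45]
print(symmetric_kl(patients, controls))
```

A larger value means the two groups' fitted distributions overlap less for that feature, which is the sense in which the slide uses KLD as a measure of discriminatory power.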
    • Scatter Plot of the Most Relevant Parameters
    • Scatter Plot of the Most Relevant Parameters (continued)
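The duration-based parameters in the tables above are statistics of speech and silence segment lengths. The sketch below shows one way to derive them, assuming the recording has already been split into labelled segments (for example by a voice activity detector, which the slides do not describe). It also assumes "max. distance to median" means the maximum minus the median, and "min. distance to median" means the median minus the minimum. All names are hypothetical.

```python
from statistics import mean, median, pstdev


def duration_stats(durations):
    """Summary statistics of a list of segment durations in seconds."""
    med = median(durations)
    return {
        "mean": mean(durations),
        "median": med,
        "std": pstdev(durations),
        # Assumed reading of the slide's "distance to median" features.
        "max_to_median": max(durations) - med,
        "min_to_median": med - min(durations),
    }


def prosodic_features(segments):
    """Compute duration-based features from (label, duration_sec) pairs."""
    speech = [d for label, d in segments if label == "speech"]
    silence = [d for label, d in segments if label == "silence"]
    total_minutes = (sum(speech) + sum(silence)) / 60.0
    return {
        "silence_speech_ratio": sum(silence) / sum(speech),
        "silences_per_minute": len(silence) / total_minutes,
        "speech": duration_stats(speech),
        "silence": duration_stats(silence),
    }


# Toy segmentation for illustration only: 8 s of speech, 4 s of silence.
segments = [("speech", 2.0), ("silence", 1.0), ("speech", 4.0),
            ("silence", 3.0), ("speech", 2.0)]
features = prosodic_features(segments)
print(features["silence_speech_ratio"])  # 4.0 / 8.0 = 0.5 for this toy input
```

The pitch features would need a separate pitch tracker and are left out of this sketch.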
    • Linear Discriminant Analysis
        Feature                                     EER (%)
        Pitch (max. distance to median)             12.50
        Pitch (min. distance to median)             12.50
        Speech (std. dev), in sec                   12.50
        Speech (max. distance to median), in sec    16.67
        Silence/speech ratio                        16.67
        Silence (std. dev), in sec                  20.83
        Speech (mean), in sec                       20.83
        Silence (mean), in sec                      25.00
      • EER: Equal Error Rate, the operating point where Prob(False Alarm) = Prob(Miss)
    • Linear Discriminant Analysis (continued)
        Feature                                     EER (%)
        Silence (median), in sec                    33.33
        Silence (max. distance to median), in sec   33.33
        Silence (min. distance to median), in sec   33.33
        Speech (median), in sec                     50.00
        Speech (min. distance to median), in sec    50.00
        Pitch (std. dev)                            54.17
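The EER definition above can be illustrated with a small threshold sweep. This is a sketch under stated assumptions, not the authors' procedure: it takes one score per subject (for a single feature, a linear discriminant is just a scaled and shifted copy of the feature, so the feature itself can serve as the score), assumes higher scores indicate a patient, and reports the error at the threshold where miss and false-alarm rates are closest. The function name and the scores are hypothetical.

```python
def equal_error_rate(patient_scores, control_scores):
    """Approximate EER by sweeping a decision threshold over all observed scores.

    A subject is classified as a patient when score >= threshold.
    Returns the average of miss and false-alarm rates at the threshold
    where the two rates are closest to each other.
    """
    thresholds = sorted(set(patient_scores) | set(control_scores))
    thresholds.append(thresholds[-1] + 1.0)  # one threshold above every score
    best_gap, best_eer = None, None
    for t in thresholds:
        miss = sum(s < t for s in patient_scores) / len(patient_scores)
        false_alarm = sum(s >= t for s in control_scores) / len(control_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + false_alarm) / 2
    return best_eer


# Made-up scores for illustration only.
patient_scores = [1.0, 5.0, 6.0, 7.0]
control_scores = [0.0, 2.0, 3.0, 5.5]
print(equal_error_rate(patient_scores, control_scores))  # 0.25
```

With a threshold of 5.0, one of four patients is missed and one of four controls is falsely accepted, so both rates equal 25%. For a feature where lower values indicate a patient, the scores would be negated first.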
    • LDA vs KL Distance
    • Conclusion and Future Work
      • Preliminary study is promising
      • Even individual features have high discriminatory power
      • Need more data to use more advanced analysis techniques
      • Need data from multiple phases of the disease to get a stronger sense of correlation
      • Need to follow the patients over time to monitor how the parameters change
      • We attempted classifier fusion to improve the performance, but without success so far
        • Will focus more on this in the future