More than Words

Advancing Prosodic Analysis
Andrew Rosenberg
City Tech Colloquium
February 5, 2015
Speech Technology
2
Prosody
Syntax Semantics Pragmatics Paralinguistics
Mary knows; you can do it.

Mary knows you can do it.
Bill doesn’t drink because
he’s unhappy
Going to Boston.
Going to Boston?
Three Hundred Twelve.
Three Thousand Twelve.
3
Prosody in Text
ALSO FROM NORTH STATION I THINK THE ORANGE LINE RUNS BY
THERE TOO SO YOU CAN ALSO CATCH THE ORANGE LINE AND
THEN INSTEAD OF TRANSFERRING UM I YOU KNOW THE MAP IS
REALLY OBVIOUS ABOUT THIS BUT INSTEAD OF TRANSFERRING AT
PARK STREET YOU CAN TRANSFER AT UH WHAT’S THE STATION
NAME DOWNTOWN CROSSING UM AND THAT’LL GET YOU BACK
TO THE RED LINE JUST AS EASILY
4
Also, from the North Station...
(I think the Orange Line runs by there too so you can also catch the
Orange Line... )
And then instead of transferring
(um I- you know, the map is really obvious about this but)
Instead of transferring at Park Street, you can transfer at (uh what’s the
station name) Downtown Crossing and (um) that’ll get you back to the
Red Line just as easily.
Prosody in Text
5
Prosody in Text
I sooo hate you right now :-)
mondays :,(
Conner Thiele @St04hoEs:
Madison people are so funny #sarcasm
Dodie Clark @doddleoddle:
RePlAcEmEnT bus SerVicEs are mY fAvOURITE
#sARcASM.
Michelle Lee @mlee418
finding someone who loves makeup just as much as me
makes me feel warm inside #notkidding
6
Prosody in Spoken Language Processing
• Recognizing Emotions.
  Frustration and Anger in Call Centers
• Inserting punctuation in speech transcripts.
  Notably, not in mobile voice input yet…
• Speaker Recognition
• Speaking Style Recognition
• Recognizing Native Language, Gender, Speaker Roles
• Improving performance of other spoken language processing tasks: Parsing, Discourse Structure, Intent Recognition.
  Today: Identifying (possibly misrecognized) names in speech
7
Dimensions of Prosodic Variation
Pitch in Blue, Intensity in Green
Duration of words/syllables
Presence of Silence
Spectral Qualities
8
ToBI
• High level dimensions of prosodic variation.
• Tones and Break Indices
• High and Low tones describe prosodic events,
pitch accent and phrasing.
• Break indices describe the degree of disjuncture
between words.
• Two hierarchical levels of phrasing: intermediate
and intonational
9
ToBI Example - Praat
10
Dimensions of Prosodic Variation
Prominence (bold word): H* L* L*+H L+H* H+!H*
Phrasing (end of phrase): L-L% L-H% H-H% H-L% !H-L%
Examples:
Mother Theresa
Give me the brown one
is that Mariana’s money?
do you really think it’s that one? (x2)
get on the harvard square T stop
leave the government center T stop
we will go through central
through Boylston
go from Harvard Square
11
How is prosody used?
Symbolic
• Modular
• Linguistically Meaningful
• Reduced Dimensionality
Acoustic Features (D = 100s-1000s) → Symbolic Analysis (D = 10-20) → Task Specific

Direct
• Task-Appropriate
• Lower information loss (general)
• High Dimensionality
Acoustic Features (D = 100s-1000s) → Task Specific

Learned Representations
• Modular
• Task-Appropriate
• Linguistically Meaningful
• Low information loss
• Reduced Dimensionality
Acoustic Features (D = 100s-1000s) → Learned Representation (D = 10-20) → Task Specific
Goal: compact, consistent, universal
12
Direct Modeling
• Topic and Sentence Segmentation [Liu et al. 2008, Rosenberg et al. 2006, Ostendorf et al. 2008, etc.]
  • Lexical: n-grams, POS tags, TextTiling, Lexical Chains and other Coherence measures
  • Prosodic: measures of acoustic “reset” across candidate boundaries
• Question Recognition for Spoken Dialog Systems [Liscombe et al. 2006]
  • Lexical: n-grams, POS tags, filled pauses
  • Prosodic: pitch slope in the last 200 ms, pausing, loudness
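The "pitch slope in the last 200 ms" feature can be sketched as a least-squares line fit over the final frames of an utterance. This is a minimal illustration, not the cited systems' implementation: the function name `final_pitch_slope`, the 10 ms frame step, and the convention that unvoiced frames carry F0 = 0 are all assumptions.

```python
import numpy as np

def final_pitch_slope(times_s, f0_hz, window_s=0.2):
    """Least-squares slope (Hz per second) of F0 over the final window_s seconds."""
    times_s = np.asarray(times_s, dtype=float)
    f0_hz = np.asarray(f0_hz, dtype=float)
    # Keep voiced frames (assumed F0 > 0) that fall inside the final window.
    keep = (times_s >= times_s[-1] - window_s) & (f0_hz > 0)
    if keep.sum() < 2:
        return 0.0  # not enough voiced frames to fit a line
    slope, _intercept = np.polyfit(times_s[keep], f0_hz[keep], deg=1)
    return float(slope)

# Synthetic utterance: flat at 120 Hz, then rising over the last 200 ms.
t = np.arange(0.0, 1.0, 0.01)
f0 = np.where(t < 0.8, 120.0, 120.0 + 100.0 * (t - 0.8))
rising = final_pitch_slope(t, f0)
```

A positive slope is the kind of cue a question detector could use alongside the lexical features; the threshold for "rising" would have to be tuned on data.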
13
Contour Modeling
Pitch in Blue Intensity in Green
14
TILT
• Describes an F0 excursion as a single parameter [Taylor 1998]
• Compact representation of an excursion based on position of the maxima
Contour Modeling
$$\mathrm{tilt}_{amp} = \frac{|A_{rise}| - |A_{fall}|}{|A_{rise}| + |A_{fall}|}$$
$$\mathrm{tilt}_{dur} = \frac{D_{rise} - D_{fall}}{D_{rise} + D_{fall}}$$
$$\mathrm{tilt} = \frac{\mathrm{tilt}_{dur} + \mathrm{tilt}_{amp}}{2}$$
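As a worked example, the tilt parameter is the average of two normalised differences: rise versus fall amplitude, and rise versus fall duration. The sketch below assumes the rise and fall portions of the excursion have already been measured; the function name and the zero-division convention are illustrative choices.

```python
def tilt_parameters(amp_rise, amp_fall, dur_rise, dur_fall):
    """Return (tilt_amp, tilt_dur, tilt) for one F0 excursion."""
    amp_total = abs(amp_rise) + abs(amp_fall)
    dur_total = dur_rise + dur_fall
    # Each ratio lies in [-1, 1]: +1 is a pure rise, -1 a pure fall.
    tilt_amp = (abs(amp_rise) - abs(amp_fall)) / amp_total if amp_total else 0.0
    tilt_dur = (dur_rise - dur_fall) / dur_total if dur_total else 0.0
    return tilt_amp, tilt_dur, (tilt_amp + tilt_dur) / 2.0

pure_rise = tilt_parameters(30.0, 0.0, 0.15, 0.0)    # amplitudes in Hz, durations in s
pure_fall = tilt_parameters(0.0, -30.0, 0.0, 0.15)
symmetric = tilt_parameters(20.0, -20.0, 0.10, 0.10)
```

A pure rise gives tilt = 1, a pure fall gives tilt = -1, and a symmetric rise-fall gives tilt = 0.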
15
Quantized Contour Modeling
• Each syllabic contour is laid onto an N-by-M grid with normalized
time and range. Results in an M element vector with an N-sized
vocabulary.

Rosenberg 2010
• This allows for a simple classification strategy
Contour Modeling
L-L% L-H%
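$$\mathrm{type}^* = \operatorname*{argmax}_{\mathrm{type}} \; p(\mathrm{type}) \prod_{i}^{M} p(C_i \mid \mathrm{type}, i)$$
$$\mathrm{type}^* = \operatorname*{argmax}_{\mathrm{type}} \; p(\mathrm{type}) \prod_{i}^{M} p(C_i \mid C_{i-1}, \mathrm{type}, i)$$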
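A minimal sketch of the first (position-independent) classifier: each contour is resampled to M time bins, scaled to its own range, and quantized to N levels, and the class is chosen by a prior times a product of per-position symbol probabilities. The grid size, the add-one smoothing, and the toy contours are assumptions made for illustration only.

```python
import numpy as np

def quantize_contour(f0, n_levels=4, m_bins=5):
    """Map a contour to M symbols in {0..N-1} after normalising time and range."""
    f0 = np.asarray(f0, dtype=float)
    # Normalise time: resample to M equally spaced points.
    positions = np.linspace(0, len(f0) - 1, m_bins)
    resampled = np.interp(positions, np.arange(len(f0)), f0)
    # Normalise range to [0, 1]; a flat contour maps to all zeros.
    span = resampled.max() - resampled.min()
    scaled = (resampled - resampled.min()) / span if span > 0 else np.zeros(m_bins)
    return np.minimum((scaled * n_levels).astype(int), n_levels - 1)

def train(contours, labels, n_levels=4, m_bins=5):
    """Estimate log p(type) and log p(C_i | type, i) with add-one smoothing."""
    types = sorted(set(labels))
    counts = {t: np.ones((m_bins, n_levels)) for t in types}
    for contour, label in zip(contours, labels):
        q = quantize_contour(contour, n_levels, m_bins)
        counts[label][np.arange(m_bins), q] += 1
    log_prior = {t: np.log(labels.count(t) / len(labels)) for t in types}
    log_cond = {t: np.log(c / c.sum(axis=1, keepdims=True)) for t, c in counts.items()}
    return log_prior, log_cond

def classify(contour, log_prior, log_cond, n_levels=4, m_bins=5):
    """argmax over types of log p(type) + sum_i log p(C_i | type, i)."""
    q = quantize_contour(contour, n_levels, m_bins)
    scores = {t: log_prior[t] + log_cond[t][np.arange(m_bins), q].sum() for t in log_prior}
    return max(scores, key=scores.get)

# Toy data: falling contours labelled L-L%, rising contours labelled L-H%.
falls = [np.linspace(200, 100, 20), np.linspace(180, 90, 15)]
rises = [np.linspace(100, 200, 20), np.linspace(90, 160, 12)]
model = train(falls + rises, ["L-L%"] * 2 + ["L-H%"] * 2)
predicted = classify(np.linspace(210, 120, 30), *model)
```

The second formula would additionally condition each symbol on the previous one, which amounts to replacing the per-position count table with a per-position bigram table.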
16
Approximate Curve Fitting
• Polynomial fitting
• Legendre polynomials

[orthogonal bases]
• Coefficients become the representation
Contour Modeling
from wikipedia
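$$f(\vec{x}) = \vec{a}$$
$$\tilde{x}(t) = \sum_{i=0}^{k} a_i t^i$$
$$\tilde{x}(t) = \sum_{i=0}^{k} a_i L_i(t)$$
$$L_0 = 1; \quad L_1 = x$$
$$L_2 = \frac{1}{2}\left(3x^2 - 1\right)$$
$$L_n = \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k}^2 (x - 1)^{n-k} (x + 1)^k$$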
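Fitting Legendre coefficients to a contour can be sketched with NumPy's `numpy.polynomial.legendre` module. The example assumes time is rescaled to [-1, 1] (where the Legendre polynomials are orthogonal) and uses a synthetic rise-fall contour; the order of 2 is an illustrative choice.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_coefficients(contour, order=2):
    """Least-squares Legendre coefficients a_0..a_order of a contour."""
    contour = np.asarray(contour, dtype=float)
    # Normalise time to [-1, 1], the interval on which the basis is orthogonal.
    t = np.linspace(-1.0, 1.0, len(contour))
    return legendre.legfit(t, contour, deg=order)

t = np.linspace(-1.0, 1.0, 50)
rise_fall = 150.0 - 40.0 * t**2          # synthetic contour with a mid-syllable peak
coeffs = legendre_coefficients(rise_fall)
reconstructed = legendre.legval(t, coeffs)
```

Here `coeffs[0]` tracks the overall level, `coeffs[1]` the slope (zero for this symmetric contour) and `coeffs[2]` the curvature (negative for a peak), which is why a handful of coefficients can stand in for the whole contour.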
17
Interactions
• Most shape representations ignore the interaction
between different information streams.
• Pitch is assumed to be the most relevant dimension of
intonation.
• Combined Pitch and Energy contour.
  Can be viewed as weighting the importance of pitch values by the energy.
• Energy and Duration (Area under Contour)
• Very simple feature.
• Improves pitch accent detection by >3% absolute
18
Symbolic Modeling: AuToBI
• Automatic ToBI labeling toolkit.
• Unified feature extraction and ToBI label prediction
• Released under Apache 2.0
• Extensible Feature Extraction Framework
• Low-level digital signal processing: pitch, spectrum, intensity, FFV
• Unique features: Automatic syllabification; shape modeling; context-sensitive features
• Applied to English, German, Spanish, Portuguese, Mandarin, French
Acoustic Features
D = 100s-1000s
Symbolic Analysis
D=10-20
Task Specific
19
Feature Extraction in AuToBI
Requested Features
mean[context[norm[log[F0]],A]]
mean[context[norm[log[F0]],B]]
mean[context[norm[log[F0]],C]]
[Figure: dependency graph built up step by step: F0 → log F0 → normalized log F0 → context window → Mean, with one context/Mean branch per requested feature]
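The nested feature names suggest how shared intermediate features can be computed once and reused. The sketch below only parses names of the form `op[inner,params]` into a dependency chain and merges the chains; it is a hypothetical illustration in Python, not AuToBI's actual code or API, and the helper names are invented.

```python
def dependency_chain(feature):
    """List the features needed to compute `feature`, innermost first."""
    chain = [feature]
    while "[" in feature:
        # Strip the outermost operator: "op[inner,params]" -> "inner,params"
        inner = feature[feature.index("[") + 1 : feature.rindex("]")]
        # Drop trailing parameters (e.g. ",A") that sit outside any brackets.
        depth, cut = 0, len(inner)
        for pos, ch in enumerate(inner):
            if ch == "[":
                depth += 1
            elif ch == "]":
                depth -= 1
            elif ch == "," and depth == 0:
                cut = pos
                break
        feature = inner[:cut]
        chain.append(feature)
    return chain[::-1]

def extraction_plan(requested):
    """Merge dependency chains so each intermediate feature appears once."""
    plan = []
    for feature in requested:
        for step in dependency_chain(feature):
            if step not in plan:
                plan.append(step)
    return plan

requested = [
    "mean[context[norm[log[F0]],A]]",
    "mean[context[norm[log[F0]],B]]",
    "mean[context[norm[log[F0]],C]]",
]
plan = extraction_plan(requested)
```

For the three requested features, the merged plan has 9 steps instead of 15, because F0, log[F0] and norm[log[F0]] are shared.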
20
Correcting Classifiers for Prominence Detection
• Examine the predictive power of Intensity drawn
from 210 different spectral regions.

[Rosenberg & Hirschberg 2006, 2007]
etc.
[My name is Randy Keller]
21
Correcting Classifiers
• For each ensemble member, train an additional correcting
classifier — using pitch, and duration features.

• Predict if an ensemble member will be correct or incorrect
• Invert the prediction if the correcting classifier predicts
incorrect.
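$$\mathrm{score}(A) = \sum_{i}^{N} \Big[ \theta(A \mid x_i)\,\psi(C \mid y_i) + \big(1 - \theta(\neg A \mid x_i)\big)\big(1 - \psi(\neg C \mid y_i)\big) \Big]$$
where θ is the Energy Classifier and ψ is the Correcting Classifier.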
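The "invert the prediction if the corrector predicts incorrect" rule can be sketched as a soft vote. This sketch takes the second term to be the case where the energy classifier votes against an accent and the corrector expects that vote to be wrong, i.e. (1 − θ(A | x_i))(1 − ψ(C | y_i)); that reading, the function names, and the N/2 decision threshold are assumptions, not a statement of the published method.

```python
def corrected_score(theta_accent, psi_correct):
    """Soft accent score summed over ensemble members.

    theta_accent[i]: energy classifier i's probability of "accented".
    psi_correct[i]:  corrector i's probability that classifier i is correct.
    """
    return sum(
        t * p + (1.0 - t) * (1.0 - p)   # "accent and right" or "no accent and wrong"
        for t, p in zip(theta_accent, psi_correct)
    )

def predict_accent(theta_accent, psi_correct):
    # Under this reading, score(A) + score(not A) = N, so N/2 is the tie point.
    return corrected_score(theta_accent, psi_correct) > len(theta_accent) / 2.0

theta = [0.9, 0.2, 0.3]   # two of three energy classifiers vote "no accent"
psi = [0.8, 0.1, 0.9]     # the second classifier is expected to be wrong
score = corrected_score(theta, psi)
decision = predict_accent(theta, psi)
```

In this toy case plain voting would say "no accent", but the corrector flips the second member's vote, so the corrected decision is "accent".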
22
Correcting Classifier Diagram
[Figure: Filters → Energy Classifiers → Correctors → Aggregator (∑)]
23
Correcting Classifier Performance
Corpus     Unfiltered Energy   Voting   Corrected Voting   Change
BDC-read   79.80               79.87    84.38              +4.51
BDC-spon   79.12               80.67    83.20              +2.53
BURNC      82.90               83.18    85.51              +2.33
Speaker Dependent Performance
24
Learning Representations
• Find redundancy in the data.
• Correlated dimensions — like PCA
• Irrelevant dimensions — L1 or L0 regularization
• Goal here: learn discrete categories, with no
discriminative labels (as in MDS or LDA)
• Clustering or Codebook learning
25
Clustering as a Representation
$x \in \mathbb{R}^2$
$f(x) \in \{A, B, C\}$
$g(x) \in \mathbb{R}^3$
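One way to read the two mappings: f assigns a point to its nearest cluster (a discrete symbol), while g describes the point by its relation to all three clusters. The sketch below uses distances to centroids for g, which is an assumption; the centroid values are hypothetical stand-ins for ones learned by k-means.

```python
import numpy as np

# Hypothetical centroids, e.g. as produced by k-means with k = 3.
centroids = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
names = np.array(["A", "B", "C"])

def g(x):
    """Soft representation R^2 -> R^3: distance from x to each centroid."""
    return np.linalg.norm(centroids - np.asarray(x, dtype=float), axis=1)

def f(x):
    """Hard representation R^2 -> {A, B, C}: name of the nearest centroid."""
    return str(names[np.argmin(g(x))])

label = f([4.0, 1.0])
distances = g([4.0, 1.0])
```

The hard label is what a symbolic sequence model consumes; the soft vector keeps more information at the cost of no longer being a discrete category.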
26
Learning Representations
• Neural Net Representations
• Autoencoder
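$x \in \mathbb{R}^D$
$g(x) \in \mathbb{R}^k$
[Figure: autoencoder with input x, weights W1 and W2, and reconstructed output x]
$g(x) = s(W_1 s(W_2 x))$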
27
Learning Representations
• Neural Net Representations
• Bottleneck layer
$x \in \mathbb{R}^D$
$g(x) \in \mathbb{R}^k$
[Figure: network with input x, weights W1 and W2, and target t]
$g(x) = s(W_1 s(W_2 x))$
28
Applications of Prosodic Representations
• Candidate Representations:
• Manual ToBI Labels
• Automatically hypothesized ToBI Labels
• Codebook/Clusters of acoustic features

(k-means, dpgmm)
• Named Entity Tagging
• Sarcasm
• Prosody Sequence Modeling
• Speaking Style; Nativeness; Speaker
29
Name Tagging
• Names: Persons, Geopolitical Entities (Places),
Organizations.
• These are often misrecognized, and sometimes
completely unknown.
• (Most) speech recognition systems will never recognize a word they have never heard before: the “out-of-vocabulary” problem.
• Goal: Use prosody to help identify which words in a
transcript are actually names — despite this.
work with Denys Katerenchuk
30
Approach
• CRF-based Tagger

from Heng Ji’s (RPI) group
• Lexical Features
• n-grams, POS, brown cluster, syntactic
chunking, known dictionaries (place names,
etc.)
• Prosodic Features
• AuToBI hypotheses: 6 features.
• K-means codebook of the input features used
by AuToBI with k=2-10: 8 features.
Name Tagging
31
Results
• Prosody helps. It is likely approximating punctuation.
• AuToBI features are robust even at worse ASR performance (still higher WER!)
Name Tagging
[Bar chart: F1-score (axis from 20 to 50). Series: Text Features; +Prosodic Clusters & AuToBI Features; +AuToBI Features; +Prosodic Clusters. Bar values as extracted: 39.94, 45.02, 44.34, 39.38]
WER: 49.13%
Ground Truth: marines battling for control of the bridges in
the southern city of Nasiriyah
Hypothesis: marines battling for control the bridges in the
southern city of non <GPE> sir </GPE> re f
32
Recognizing Sarcasm
• Sarcasm: the use of irony to indicate scorn or disdain
• Clips from Daria
• Rated by 165 participants as sarcastic or sincere
• Features:
• Baseline: Mean pitch, range pitch, standard deviation of
pitch, mean intensity, intensity range, speaking rate
• Prosodic Representations: k=3 clustering of order-2
Legendre polynomial coefficients based on pitch and
intensity
• unigram and bigram rates of both pitch and intensity
representations
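The unigram and bigram rate features can be sketched as relative frequencies over the sequence of cluster labels in a clip. Normalising by the number of n-grams in the clip is an assumption (a per-second rate would be another option), and the label names below are illustrative.

```python
from collections import Counter

def ngram_rates(symbols, n):
    """Relative frequency of each n-gram in a sequence of cluster labels."""
    grams = [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    return {gram: count / total for gram, count in Counter(grams).items()}

# One label per syllable or word, from a k=3 clustering of pitch contours.
pitch_labels = ["fast_rise", "fast_fall", "fast_rise", "fast_fall", "slow_rise"]
unigrams = ngram_rates(pitch_labels, 1)
bigrams = ngram_rates(pitch_labels, 2)
```

Each clip then yields a fixed-length vector (3 unigram rates and up to 9 bigram rates per stream) regardless of its duration, which suits a standard classifier.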
work with Rachel Rakov
33
Results
• Learned representations:
• Pitch: Fast Rise, Slow Rise, Fast Fall
• Intensity: Fast Rise, Stable, Moderate Fall
Recognizing Sarcasm
Feature Set                             Accuracy
Chance Baseline                         55.26
Standard Acoustic                       65.78
+Unigram Features                       78.31
+Unigram Features +Intensity Bigrams    81.57
+Unigram Features +Both Bigrams         76.31
Logistic Regression
34
Modeling Prosodic Sequences
• Prosodic Recognition of:
• Speaking Style - Read, Spontaneous, Dialog,
News
• Speaker - 4 speakers all Spontaneous speech
• Nativeness - Native vs. Non-native American
English Speakers, reading the same material.
35
Prosodic Sequence Modeling
• 3-gram model with backoff
• Clusters trained over all material.
• Sequence models trained on training splits.
• automatic syllabification
• only 7 acoustic features: mean pitch and intensity and their deltas, duration, preceding/following silence
$$C^* = \operatorname*{argmax}_{C} \; p(x_0 \mid C)\, p(x_1 \mid x_0, C) \prod_{i=2}^{N} p(x_i \mid x_{i-1}, x_{i-2}, C)$$
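The decision rule can be sketched with one trigram model per class over sequences of cluster labels. To keep the example short it uses add-k smoothing in place of the backoff scheme named on the slide, and start-of-sequence padding to cover the first two terms; the class names and toy sequences are invented.

```python
import math
from collections import Counter

class TrigramModel:
    """Add-k smoothed trigram model over discrete prosodic symbols."""

    def __init__(self, sequences, vocab_size, k=0.5):
        self.k, self.vocab_size = k, vocab_size
        self.trigrams, self.contexts = Counter(), Counter()
        for seq in sequences:
            padded = ["<s>", "<s>"] + list(seq)   # padding handles x0 and x1
            for i in range(2, len(padded)):
                self.trigrams[tuple(padded[i - 2:i + 1])] += 1
                self.contexts[tuple(padded[i - 2:i])] += 1

    def log_prob(self, seq):
        padded = ["<s>", "<s>"] + list(seq)
        total = 0.0
        for i in range(2, len(padded)):
            num = self.trigrams[tuple(padded[i - 2:i + 1])] + self.k
            den = self.contexts[tuple(padded[i - 2:i])] + self.k * self.vocab_size
            total += math.log(num / den)
        return total

def classify(seq, models):
    """Pick the class whose model gives the sequence the highest likelihood."""
    return max(models, key=lambda c: models[c].log_prob(seq))

models = {
    "read": TrigramModel([[0, 1, 0, 1, 0, 1], [0, 1, 0, 1]], vocab_size=3),
    "spontaneous": TrigramModel([[2, 2, 1, 2, 2], [2, 1, 2, 2, 2]], vocab_size=3),
}
style = classify([0, 1, 0, 1, 0], models)
```

As in the formula, no class prior is applied, so the comparison is purely between sequence likelihoods.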
Prosodic Sequences
36
Dirichlet Process GMMs
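$$G \mid \{\alpha, G_0\} \sim \mathrm{DP}(\alpha, G_0)$$
$$\theta_n \mid G \sim G$$
$$X_n \mid \theta_n \sim p(x_n \mid \theta_n)$$
[Figure: plate diagram of the model]
$$p(x) = \sum_{n}^{\infty} \pi_n \, \mathcal{N}(x; \mu_n, \Sigma_n)$$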
• Non-parametric infinite mixture model
• No need to specify the number of
clusters.
• need a prior over π: the Dirichlet process
• and a prior over N: a zero-mean Gaussian
• still need to set hyperparameters α & G0
• Stick-breaking & Chinese Restaurant metaphors
• Blei and Jordan 2005: Variational Inference
• “Rich get Richer”
Plate notation from M. Jordan 2005 NIPS tutorial
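The stick-breaking metaphor can be sketched directly: each mixture weight is a Beta(1, α) fraction of whatever stick length remains. The truncation level is an assumption needed to make the infinite sum finite in code, and the function name is illustrative.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Draw the first `truncation` mixture weights of a Dirichlet process."""
    betas = rng.beta(1.0, alpha, size=truncation)
    # Length of stick left before each break: 1, (1-b1), (1-b1)(1-b2), ...
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)
```

Small α tends to put most of the mass on the first few components, while large α spreads it over many, which is how α controls the effective number of clusters without fixing it in advance.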
Prosodic Sequences
37
Results
Prosodic Sequences
[Figure: results for three representations (ToBI, K-means, DPGMM) on three tasks: Speaking Style (of 4), Nativeness (of 2), Speaker (of 6)]
• K-means is a clear winner on all tasks
• DPGMM here fails to find effective representations
variable-length sequences with repetition
38
Common Representations
• Previous experiments generated representations from a wide range of material (3 corpora: 1) spontaneous/read; 2) dialog; 3) news).
• Here: we repeat these experiments with representations learned from material from a single corpus (only news)
• Also include AuToBI hypotheses, and clusters are based on the full feature set (compared to 7 features before)
Prosodic Sequences
39
Results
Prosodic Sequences
[Figure: K-means results on Speaking Style (of 4) and Speaker (of 12)]
• K-means provides a robust representation of prosody.
• All speaker material is unknown during representation generation
40
Next Problems
• Hunting for Language Universals
• Additional Applications
• Automatically identifying the unit of analysis.
• Too short - low information; Too long - low
generalization
• Unify with representation learning
• Identifying “discriminative” prosodic events.
• In emotion, deception, foreign accent recognition, the relevant signal is rare, but important.
• Discriminative modeling
• Anomaly detection (one class modeling)
41
Thanks
Denys Katerenchuk, Rachel Rakov
Adam Goodkind, Ali Raza Syed, David Guy Brizan, Felix Grezes,
Guozhen An, Michelle Morales, Min Ma, Justin Richards, Syed Reza
andrew@cs.qc.cuny.edu
speech.cs.qc.cuny.edu

eniac.cs.qc.cuny.edu/andrew
Questions?

Test Dependencies and the Future of Build AccelerationTest Dependencies and the Future of Build Acceleration
Test Dependencies and the Future of Build Acceleration
 
Big Data Challenges and Solutions
Big Data Challenges and SolutionsBig Data Challenges and Solutions
Big Data Challenges and Solutions
 
Introduction to new features in java 8
Introduction to new features in java 8Introduction to new features in java 8
Introduction to new features in java 8
 
Android Apps the Right Way
Android Apps the Right WayAndroid Apps the Right Way
Android Apps the Right Way
 

More than Words: Advancing Prosodic Analysis

  • 1. More than Words
 Advancing Prosodic Analysis Andrew Rosenberg City Tech Colloquium February 5, 2015
  • 3. Prosody: Syntax, Semantics, Pragmatics, Paralinguistics
 Mary knows; you can do it. / Mary knows you can do it.
 Bill doesn’t drink because he’s unhappy
 Going to Boston. / Going to Boston?
 Three Hundred Twelve. / Three Thousand Twelve.
  • 4. Prosody in Text
 ALSO FROM NORTH STATION I THINK THE ORANGE LINE RUNS BY THERE TOO SO YOU CAN ALSO CATCH THE ORANGE LINE AND THEN INSTEAD OF TRANSFERRING UM I YOU KNOW THE MAP IS REALLY OBVIOUS ABOUT THIS BUT INSTEAD OF TRANSFERRING AT PARK STREET YOU CAN TRANSFER AT UH WHAT’S THE STATION NAME DOWNTOWN CROSSING UM AND THAT’LL GET YOU BACK TO THE RED LINE JUST AS EASILY
  • 5. Prosody in Text
 Also, from the North Station...
 (I think the Orange Line runs by there too so you can also catch the Orange Line... )
 And then instead of transferring
 (um I- you know, the map is really obvious about this but)
 Instead of transferring at Park Street, you can transfer at (uh what’s the station name) Downtown Crossing and (um) that’ll get you back to the Red Line just as easily.
  • 6. Prosody in Text
 I sooo hate you right now :-)
 mondays :,(
 Conner Thiele @St04hoEs: Madison people are so funny #sarcasm
 Dodie Clark @doddleoddle: RePlAcEmEnT bus SerVicEs are mY fAvOURITE #sARcASM.
 Michelle Lee @mlee418: finding someone who loves makeup just as much as me makes me feel warm inside #notkidding
  • 7. Prosody in Spoken Language Processing
 • Recognizing Emotions. Frustration and Anger in Call Centers
 • Inserting punctuation in speech transcripts. Notably, not in mobile voice input yet…
 • Speaker Recognition
 • Speaking Style Recognition
 • Recognizing Native Language, Gender, Speaker Roles
 • Improving performance of other spoken language processing tasks. Parsing, Discourse Structure, Intent Recognition.
 Today: Identifying (possibly misrecognized) names in speech
  • 8. Dimensions of Prosodic Variation
 [Figure: pitch in blue, intensity in green]
 Duration of words/syllables; Presence of Silence; Spectral Qualities
  • 9. ToBI
 • High level dimensions of prosodic variation.
 • Tones and Break Indices
 • High and Low tones describe prosodic events, pitch accent and phrasing.
 • Break indices describe the degree of disjuncture between words.
 • Two hierarchical levels of phrasing: intermediate and intonational
  • 10. ToBI Example - Praat
  • 11. Dimensions of Prosodic Variation
 Prominence (bold word): H* L* L*+H L+H* H+!H*
 Phrasing (end of phrase): L-L% L-H% H-H% H-L% !H-L%
 Examples: Mother Theresa; Give me the brown one; is that Mariana’s money?; do you really think it’s that one? (x2); get on the harvard square T stop; leave the government center T stop; we will go through central; through Boyleston; go from Harvard Square
  • 12. How is prosody used?
 Symbolic: • Modular • Linguistically Meaningful • Reduced Dimensionality
 Acoustic Features (D = 100s-1000s) → Symbolic Analysis (D = 10-20) → Task Specific
 Direct: • Task-Appropriate • Lower information loss (general) • High Dimensionality
 Acoustic Features (D = 100s-1000s) → Task Specific
 Learned Representations: • Modular • Task-Appropriate • Linguistically Meaningful • Low information loss • Reduced Dimensionality
 Acoustic Features (D = 100s-1000s) → Learned Representation (D = 10-20) → Task Specific
 Goal: compact, consistent, universal
  • 13. Direct Modeling
 • Topic and Sentence Segmentation. [Liu et al. 2008, Rosenberg et al. 2006, Ostendorf et al. 2008 etc.]
   • Lexical: n-grams, POS tags, TextTiling, Lexical Chains and other Coherence measures
   • Prosodic: measures of acoustic “reset” across candidate boundaries.
 • Question Recognition for Spoken Dialog Systems [Liscombe et al. 2006]
   • Lexical: n-grams, POS tags, filled pauses
   • Prosodic: pitch slope in last 200ms, pausing, loudness
  • 14. Contour Modeling
 [Figure: pitch in blue, intensity in green]
  • 15. Contour Modeling: TILT
 • Describes an F0 excursion with a single parameter [Taylor 1998]
 • Compact representation of an excursion based on position of the maxima
 $tilt_{amp} = \frac{|amp_{rise}| - |amp_{fall}|}{|amp_{rise}| + |amp_{fall}|}$
 $tilt_{dur} = \frac{dur_{rise} - dur_{fall}}{dur_{rise} + dur_{fall}}$
 $tilt = \frac{tilt_{dur} + tilt_{amp}}{2}$
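The Tilt parameters can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual reading of the slide (difference over sum for both amplitude and duration, then the average of the two); the function name and argument names are hypothetical, and the rise/fall amplitudes and durations are assumed to have been measured already.

```python
def tilt_parameters(amp_rise, amp_fall, dur_rise, dur_fall):
    """Return (tilt_amp, tilt_dur, tilt) for one rise/fall F0 excursion.

    amp_* are F0 amplitudes of the rise and fall parts (e.g. in Hz),
    dur_* are their durations (e.g. in seconds). Names are illustrative.
    """
    amp_total = abs(amp_rise) + abs(amp_fall)
    dur_total = dur_rise + dur_fall
    if amp_total == 0 or dur_total == 0:
        raise ValueError("excursion has no amplitude or no duration")
    tilt_amp = (abs(amp_rise) - abs(amp_fall)) / amp_total
    tilt_dur = (dur_rise - dur_fall) / dur_total
    # The single tilt value is the mean of the two normalized differences.
    return tilt_amp, tilt_dur, (tilt_amp + tilt_dur) / 2.0
```

Under this reading a pure rise gives tilt = +1, a pure fall gives tilt = -1, and a symmetric rise-fall gives 0.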
  • 16. Contour Modeling: Quantized Contour Modeling
 • Each syllabic contour is laid onto an N-by-M grid with normalized time and range. Results in an M element vector with an N-sized vocabulary. [Rosenberg 2010]
 • This allows for a simple classification strategy
 [Figure labels: L-L%, L-H%]
 $type^{*} = \arg\max_{type} \; p(type) \prod_{i}^{M} p(C_i \mid type, i)$
 $type^{*} = \arg\max_{type} \; p(type) \prod_{i}^{M} p(C_i \mid C_{i-1}, type, i)$
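A rough sketch of the first (position-independent) classifier: each contour is quantized into M time bins with N levels each, and the type is chosen by a prior times a product of per-position symbol probabilities. The binning by bin mean, the add-one smoothing, and all function names are assumptions made for this illustration, not details taken from the slides.

```python
import math
from collections import Counter, defaultdict


def quantize(contour, m_bins, n_levels):
    """Map a contour (list of floats) to m_bins symbols in range(n_levels)."""
    lo, hi = min(contour), max(contour)
    span = (hi - lo) or 1.0  # avoid division by zero on flat contours
    symbols = []
    for i in range(m_bins):
        # Samples that fall into the i-th normalized time bin.
        start = i * len(contour) // m_bins
        end = max(start + 1, (i + 1) * len(contour) // m_bins)
        mean = sum(contour[start:end]) / (end - start)
        # Normalize the range to [0, 1] and pick one of n_levels cells.
        symbols.append(min(n_levels - 1, int((mean - lo) / span * n_levels)))
    return symbols


def train(examples, m_bins, n_levels):
    """examples: list of (type_label, contour). Returns (prior, counts)."""
    prior = Counter()
    counts = defaultdict(Counter)  # (type, position) -> symbol counts
    for label, contour in examples:
        prior[label] += 1
        for i, sym in enumerate(quantize(contour, m_bins, n_levels)):
            counts[(label, i)][sym] += 1
    return prior, counts


def classify(contour, prior, counts, m_bins, n_levels):
    """argmax over types of log p(type) + sum_i log p(C_i | type, i)."""
    symbols = quantize(contour, m_bins, n_levels)
    total = sum(prior.values())
    best, best_score = None, -math.inf
    for label, n in prior.items():
        score = math.log(n / total)
        for i, sym in enumerate(symbols):
            c = counts[(label, i)]
            # Add-one smoothing (an assumption) keeps unseen symbols finite.
            score += math.log((c[sym] + 1) / (sum(c.values()) + n_levels))
        if score > best_score:
            best, best_score = label, score
    return best
```

The second formula on the slide additionally conditions each symbol on the previous one; that would replace the `(type, position)` key with `(type, position, previous symbol)`.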
  • 17. Contour Modeling: Approximate Curve Fitting
 • Polynomial fitting
 • Legendre polynomials [orthogonal bases]
 • Coefficients become the representation
 $f(\vec{x}) = \vec{a}$
 $\tilde{x}(t) = \sum_{i=0}^{k} a_i t^i$
 $\tilde{x}(t) = \sum_{i=0}^{k} a_i L_i(t)$
 $L_0 = 1; \quad L_1 = x; \quad L_2 = \frac{1}{2}(3x^2 - 1)$
 $L_n = \frac{1}{2^n} \sum_{k=0}^{n} \binom{n}{k}^2 (x - 1)^{n-k} (x + 1)^k$
 [Figure from Wikipedia]
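One way to obtain Legendre coefficients for a sampled contour is to project it onto each basis polynomial, using the orthogonality of the basis on [-1, 1]. The sketch below uses only the standard library and a midpoint-rule integral over uniformly spaced samples; this projection is an assumption chosen for illustration (a least-squares fit would be the other common choice), and the function names are hypothetical.

```python
def legendre(n, x):
    """Evaluate the Legendre polynomial L_n(x) with Bonnet's recursion."""
    if n == 0:
        return 1.0
    prev, curr = 1.0, x
    for k in range(1, n):
        # (k + 1) L_{k+1}(x) = (2k + 1) x L_k(x) - k L_{k-1}(x)
        prev, curr = curr, ((2 * k + 1) * x * curr - k * prev) / (k + 1)
    return curr


def legendre_coefficients(contour, order):
    """Project a uniformly sampled contour onto L_0 .. L_order.

    Sample j is treated as the value at the midpoint of the j-th of
    len(contour) equal cells covering the normalized time axis [-1, 1].
    """
    m = len(contour)
    coeffs = []
    for n in range(order + 1):
        acc = 0.0
        for j, value in enumerate(contour):
            t = -1.0 + (2 * j + 1) / m
            acc += value * legendre(n, t)
        # a_n = (2n + 1) / 2 * integral of x(t) L_n(t) dt; cell width is 2 / m.
        coeffs.append((2 * n + 1) / 2 * acc * (2.0 / m))
    return coeffs
```

With `order=2` the three coefficients roughly capture the contour's mean level, slope and curvature, which is the kind of compact representation the slide describes.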
  • 18. Interactions
 • Most shape representations ignore the interaction between different information streams.
 • Pitch is assumed to be the most relevant dimension of intonation.
 • Combined Pitch and Energy contour. Can be viewed as weighting the importance of pitch values by the energy.
 • Energy and Duration (Area under Contour)
 • Very simple feature.
 • Improves pitch accent detection by >3% absolute
  • 19. Symbolic Modeling: AuToBI
 • Automatic ToBI labeling toolkit.
 • Unified feature extraction and ToBI label prediction
 • Released under Apache 2.0
 • Extensible Feature Extraction Framework
 • Low-level digital signal processing: pitch, spectrum, intensity, FFV
 • Unique features: Automatic syllabification; shape modeling; context-sensitive features
 • Applied to English, German, Spanish, Portuguese, Mandarin, French
 Acoustic Features (D = 100s-1000s) → Symbolic Analysis (D = 10-20) → Task Specific
  • 20. Feature Extraction in AuToBI
 Requested Features: mean[context[norm[log[F0]],A]], mean[context[norm[log[F0]],B]], mean[context[norm[log[F0]],C]]
 [Figure: feature dependency chain F0 → log F0 → normalized log F0 → Context A/B → Mean]
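The nested feature names suggest that each requested feature names its own inputs, so an extractor can work out what to compute and in which order. The following is a hypothetical sketch of that idea in Python; it is not AuToBI's actual API (AuToBI is a separate toolkit), and the parsing rule (the first bracketed argument is the input feature, anything after a top-level comma is a parameter) is an assumption based only on the three names shown on the slide.

```python
def dependencies(feature):
    """Return the features needed to compute `feature`, innermost first."""
    start = feature.find("[")
    if start == -1:
        return [feature]  # a raw feature such as "F0"
    inner = feature[start + 1 : -1]  # text between the outer brackets
    first, depth = inner, 0
    for i, ch in enumerate(inner):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        elif ch == "," and depth == 0:
            first = inner[:i]  # input feature; the rest is a parameter
            break
    return dependencies(first) + [feature]


def extraction_plan(requested):
    """Ordered list of features to compute, with shared inputs listed once."""
    plan = []
    for feature in requested:
        for name in dependencies(feature):
            if name not in plan:
                plan.append(name)
    return plan
```

For the three requested features on the slide, such a plan would compute `F0`, `log[F0]` and `norm[log[F0]]` once and then the three context and mean features on top of them.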
  • 21. Correcting Classifiers for Prominence Detection
 • Examine the predictive power of Intensity drawn from 210 different spectral regions. [Rosenberg & Hirschberg 2006, 2007, etc.]
 [Example utterance: My name is Randy Keller]
  • 22. Correcting Classifiers
 • For each ensemble member, train an additional correcting classifier using pitch and duration features.
 • Predict if an ensemble member will be correct or incorrect
 • Invert the prediction if the correcting classifier predicts incorrect.
 $score(A) = \sum_{i}^{N} \left[ \theta(A \mid x_i)\,\psi(C \mid y_i) + (1 - \theta(\neg A \mid x_i))\,(1 - \psi(\neg C \mid y_i)) \right]$
 [Labels: Energy Classifier ($\theta$), Correcting Classifier ($\psi$)]
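The inversion rule in the bullets can be sketched with hard decisions: each ensemble member's accent prediction is flipped when its correcting classifier expects it to be wrong, and the corrected predictions are then combined. The majority vote and the tie-breaking rule here are assumptions for illustration (the slides give a weighted score instead), and the function name is hypothetical.

```python
def corrected_vote(member_predictions, predicted_correct):
    """Combine ensemble predictions after applying correcting classifiers.

    member_predictions[i] is True if energy classifier i predicts an accent.
    predicted_correct[i] is True if its correcting classifier expects that
    prediction to be right. Returns True if the corrected majority says
    "accented"; ties count as "not accented" (an arbitrary choice).
    """
    votes = 0
    for accent, ok in zip(member_predictions, predicted_correct):
        final = accent if ok else not accent  # invert when predicted incorrect
        votes += 1 if final else -1
    return votes > 0
```

For example, `corrected_vote([True, False, False], [True, False, False])` flips the two members that are expected to be wrong, so all three corrected votes agree on "accented".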
  • 24. Correcting Classifier Performance (Speaker Dependent Performance)
 Corpus     Unfiltered   Energy Voting   Corrected Voting   Change
 BDC-read   79.80        79.87           84.38              +4.51
 BDC-spon   79.12        80.67           83.20              +2.53
 BURNC      82.90        83.18           85.51              +2.33
  • 25. Learning Representations
 • Find redundancy in the data.
 • Correlated dimensions, like PCA
 • Irrelevant dimensions: L1 or L0 regularization
 • Goal here: learn discrete categories, with no discriminative labels (as in MDS or LDA)
 • Clustering or Codebook learning
  • 26. Clustering as a Representation
 $x \in \mathbb{R}^2$, $f(x) \in \{A, B, C\}$, $g(x) \in \mathbb{R}^3$
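A small sketch of how a clustering can act as a representation: a point in R^2 is mapped either to a discrete symbol (the nearest of three centroids) or to a 3-dimensional vector. Reading g(x) as the vector of distances to the centroids is an assumption; the slide only gives the output spaces. The k-means loop and its deterministic initialization are likewise illustrative choices.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; starts from the first k points (assumption)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def f(x, centroids, labels="ABC"):
    """Discrete representation: label of the nearest centroid."""
    return labels[nearest(x, centroids)]


def g(x, centroids):
    """Continuous representation: distance from x to every centroid."""
    return [math.dist(x, c) for c in centroids]
```

With three centroids, `f` yields one of the symbols A, B, C and `g` yields a vector in R^3, matching the output spaces on the slide.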
  • 27. Learning Representations
 • Neural Net Representations
 • Autoencoder
 $x \in \mathbb{R}^D$, $g(x) \in \mathbb{R}^k$, $g(x) = s(W_1 s(W_2 x))$
 [Figure: network diagram with input x, weights W1 and W2, output x]
  • 28. Learning Representations
 • Neural Net Representations
 • Bottleneck layer
 $x \in \mathbb{R}^D$, $g(x) \in \mathbb{R}^k$, $g(x) = s(W_1 s(W_2 x))$
 [Figure: network diagram with input x, weights W1 and W2, target t]
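The representation formula shared by the autoencoder and bottleneck slides, g(x) = s(W1 s(W2 x)), is a two-layer forward pass. A minimal sketch follows, assuming s is the logistic sigmoid and that the weights have already been trained; training, bias terms, and the matrix shapes used in the comments are assumptions not stated on the slides.

```python
import math


def sigmoid(z):
    """Logistic squashing function, used here as the nonlinearity s."""
    return 1.0 / (1.0 + math.exp(-z))


def layer(weights, vector):
    """Apply s(W v) for a weight matrix given as a list of rows."""
    return [sigmoid(sum(w * v for w, v in zip(row, vector))) for row in weights]


def g(x, w1, w2):
    """Compute g(x) = s(W1 s(W2 x)).

    w2 has one row per hidden unit and D columns; w1 has k rows and one
    column per hidden unit, so the result has k dimensions.
    """
    return layer(w1, layer(w2, x))
```

In the autoencoder case the network is trained to reproduce x; in the bottleneck case it is trained toward a target t, and the narrow layer's activations are kept as the learned representation.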
  • 29. Applications of Prosodic Representations
 • Candidate Representations:
   • Manual ToBI Labels
   • Automatically hypothesized ToBI Labels
   • Codebook/Clusters of acoustic features (k-means, dpgmm)
 • Named Entity Tagging
 • Sarcasm
 • Prosody Sequence Modeling
   • Speaking Style; Nativeness; Speaker
  • 30. Name Tagging
 • Names: Persons, Geopolitical Entities (Places), Organizations.
 • These are often misrecognized, and sometimes completely unknown.
 • (Most) speech recognition systems will never recognize a word they have never heard before: the “out-of-vocabulary” problem.
 • Goal: Use prosody to help identify which words in a transcript are actually names, despite this.
 work with Denys Katerenchuk
  • 31. Name Tagging: Approach
 • CRF-based Tagger from Heng Ji’s (RPI) group
 • Lexical Features
   • n-grams, POS, Brown cluster, syntactic chunking, known dictionaries (place names, etc.)
 • Prosodic Features
   • AuToBI hypotheses: 6 features.
   • K-means codebook of the input features used by AuToBI with k=2-10: 8 features.
  • 32. Name Tagging: Results
 • Prosody helps. Is likely approximating punctuation.
 • AuToBI features are robust at even worse ASR performance. Still higher WER!
 [Figure: bar chart of F1-score, axis 20 to 50. Values as listed: 39.94, 45.02, 44.34, 39.38. Series as listed: Text Features; +Prosodic Clusters & AuToBI Features; +AuToBI Features; +Prosodic Clusters]
 WER: 49.13%
 Ground Truth: marines battling for control of the bridges in the southern city of Nasiriyah
 Hypothesis: marines battling for control the bridges in the southern city of non <GPE> sir </GPE> re f
  • 33. Recognizing Sarcasm
 • Sarcasm: the use of irony to indicate scorn or disdain
 • Clips from Daria
 • Rated by 165 participants as sarcastic or sincere
 • Features:
   • Baseline: mean pitch, pitch range, standard deviation of pitch, mean intensity, intensity range, speaking rate
   • Prosodic Representations: k=3 clustering of order-2 Legendre polynomial coefficients based on pitch and intensity
   • unigram and bigram rates of both pitch and intensity representations
 work with Rachel Rakov
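The unigram and bigram rate features can be illustrated as relative frequencies over an utterance's sequence of cluster labels. This sketch assumes "rate" means count divided by the number of n-grams in the utterance; the function name and the label strings in the usage note are hypothetical.

```python
from collections import Counter


def ngram_rates(symbols, n):
    """Relative frequency of each n-gram in a sequence of cluster labels."""
    grams = [tuple(symbols[i : i + n]) for i in range(len(symbols) - n + 1)]
    if not grams:
        return {}  # sequence shorter than n
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}
```

For a pitch label sequence such as `["rise", "rise", "fall", "rise"]`, `ngram_rates(seq, 1)` and `ngram_rates(seq, 2)` would give the per-utterance feature values that a classifier such as logistic regression could consume.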
  • 34. Recognizing Sarcasm: Results
 • Learned representations:
   • Pitch: Fast Rise, Slow Rise, Fast Fall
   • Intensity: Fast Rise, Stable, Moderate Fall
 Feature Set (Logistic Regression)        Accuracy
 Chance Baseline                          55.26
 Standard Acoustic                        65.78
 +Unigram Features                        78.31
 +Unigram Features +Intensity Bigrams     81.57
 +Unigram Features +Both Bigrams          76.31
  • 35. Modeling Prosodic Sequences
 • Prosodic Recognition of:
   • Speaking Style - Read, Spontaneous, Dialog, News
   • Speaker - 4 speakers all Spontaneous speech
   • Nativeness - Native vs. Non-native American English Speakers, reading the same material.
  • 36. Prosodic Sequences: Prosodic Sequence Modeling
 • 3-gram model with backoff
 • Clusters trained over all material.
 • Sequence models trained on training splits.
 • automatic syllabification
 • only 7 acoustic features: mean pitch and intensity and delta, duration, pre/fol silence
 $C^{*} = \arg\max_{C} \; p(x_0 \mid C)\, p(x_1 \mid x_0, C) \prod_{i=2}^{N} p(x_i \mid x_{i-1}, x_{i-2}, C)$
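A compact sketch of classifying a sequence of prosodic cluster labels with one trigram model per class. The slides say only "3-gram model with backoff"; the scheme below (back off to shorter histories with a fixed penalty, smoothed unigrams at the bottom) is an assumed stand-in, its scores are not normalized probabilities, and the class and function names are hypothetical.

```python
import math
from collections import Counter


class BackoffTrigram:
    """Trigram scorer over symbol sequences with a fixed backoff penalty."""

    def __init__(self, sequences, vocab_size, alpha=0.4):
        self.alpha = alpha
        self.vocab_size = vocab_size
        self.counts = Counter()  # n-gram tuple (length 1..3) -> count
        for seq in sequences:
            for i in range(len(seq)):
                for n in (1, 2, 3):
                    if i + 1 >= n:
                        self.counts[tuple(seq[i + 1 - n : i + 1])] += 1
        self.total = sum(c for gram, c in self.counts.items() if len(gram) == 1)

    def score(self, history, sym):
        """Score sym given up to two previous symbols (a tuple)."""
        penalty = 1.0
        while history:
            gram = history + (sym,)
            if self.counts[gram] > 0:
                return penalty * self.counts[gram] / self.counts[history]
            history = history[1:]  # drop the oldest symbol and back off
            penalty *= self.alpha
        # Add-one smoothed unigram estimate at the end of the backoff chain.
        return penalty * (self.counts[(sym,)] + 1) / (self.total + self.vocab_size)

    def log_score(self, seq):
        return sum(
            math.log(self.score(tuple(seq[max(0, i - 2) : i]), sym))
            for i, sym in enumerate(seq)
        )


def classify(seq, models):
    """Return the class whose sequence model scores seq highest."""
    return max(models, key=lambda label: models[label].log_score(seq))
```

As in the slide's formula, no class prior is included, so the decision rests on the sequence scores alone.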
  • 37. Prosodic Sequences: Dirichlet Process GMMs
 $G \mid \{\alpha, G_0\} \sim DP(\alpha, G_0)$
 $\theta_n \mid G \sim G$
 $X_n \mid \theta_n \sim p(x_n \mid \theta_n)$
 $p(x) = \sum_{n}^{\infty} \pi_n \mathcal{N}(x; \mu_n, \Sigma_n)$
 • Non-parametric infinite mixture model
 • No need to specify the number of clusters.
 • need a prior of π: the Dirichlet process
 • and a prior over N: a zero mean Gaussian
 • still need to set hyperparameters α & G0
 • Stick-breaking & Chinese Restaurant metaphors
 • Blei and Jordan 2005, Variational Inference
 • “Rich get Richer”
 [Figure: plate notation from M. Jordan 2005 NIPS tutorial]
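The Chinese Restaurant metaphor can be simulated directly, which makes the "rich get richer" behaviour visible: each new customer joins an existing table with probability proportional to its size, or opens a new table with probability proportional to α. This is a sketch of the prior over cluster assignments only, not of the DPGMM inference used in the talk; the function name and the seed parameter are illustrative.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Sample table assignments from a CRP with concentration alpha."""
    rng = random.Random(seed)
    tables = []       # tables[k] = number of customers at table k
    assignments = []  # assignments[n] = table index of customer n
    for n in range(n_customers):
        # n customers are already seated: table k has weight tables[k],
        # a new table has weight alpha, and the weights sum to n + alpha.
        r = rng.uniform(0, n + alpha)
        cumulative = 0.0
        for k, size in enumerate(tables):
            cumulative += size
            if r < cumulative:
                tables[k] += 1
                assignments.append(k)
                break
        else:
            tables.append(1)  # open a new table
            assignments.append(len(tables) - 1)
    return assignments, tables
```

The number of occupied tables is not fixed in advance, which is the sense in which the number of clusters need not be specified; α controls how readily new clusters appear.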
  • 38. Prosodic Sequences: Results
 [Figure: results for Speaking Style (of 4), Nativeness (of 2), Speaker (of 6); series: ToBI, K-means, DPGMM]
 • K-means is a clear winner on all tasks
 • DPGMM here fails to find effective representations
 variable-length sequences with repetition
  • 39. Prosodic Sequences: Common Representations
 • Previous experiments generated representations from a wide range of material (3 corpora: 1) spontaneous/read; 2) dialog; 3) news).
 • Here: we repeat these experiments with representations learned from material from a single corpus (only news)
 • Also include AuToBI hypotheses, and clusters are based on full feature set. (compared to 7 before)
  • 40. Prosodic Sequences: Results
 [Figure: K-means results for Speaking Style (of 4) and Speaker (of 12)]
 • K-means provides a robust representation of prosody.
 • All speaker material is unknown during representation generation
  • 41. Next Problems
 • Hunting for Language Universals
 • Additional Applications
 • Automatically identifying the unit of analysis.
   • Too short - low information; Too long - low generalization
   • Unify with representation learning
 • Identifying “discriminative” prosodic events.
   • In emotion, deception, foreign accent recognition, the important signal is rare, but important.
   • Discriminative modeling
   • Anomaly detection (one class modeling)
  • 42. Thanks
 Denys Katerenchuk, Rachel Rakov
 Adam Goodkind, Ali Raza Syed, David Guy Brizan, Felix Grezes, Guozhen An, Michelle Morales, Min Ma, Justin Richards, Syed Reza
 andrew@cs.qc.cuny.edu
 speech.cs.qc.cuny.edu
 eniac.cs.qc.cuny.edu/andrew
 Questions?