Language Variation in Parliamentary Speeches
First Steps Towards Robust Phoneme Recognition
Why?
● General speech science goals: how do people speak?
● Path towards better training materials, iterative training processes
● SweTerror: multidisciplinary investigation into parliamentary discourse around the topic of terrorism
Fun personal fact:
● I come from Ireland, home of what used to be the world’s best-known terrorist organisation.
Image credit: KonradWyszyskiPlanespotting, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>
How much data?
2019-04-09
September 2012 to January 2022:
5925 hours of raw video
Where does “terror” occur?
Source               Files   Words
Transcripts            826    6741
Video files (ASR)     1018    7227
Top 10: transcripts
terrorism 1605
terrorister 573
terrorismen 294
terror 273
terrordåd 211
terroristbrott 156
terrorhot 138
terrorattentat 134
terrorbrott 131
terrororganisationer 127
Top 10: ASR
terrorism 1608
terrorister 524
terrorismen 353
terror 291
terrordåd 198
terroristbrott 166
terrorbrott 144
terrorhot 135
terrorattentat 133
terrororganisationer 120
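Counts like the two top-10 lists above can be produced by tokenising the text and counting every word form that contains the stem. The following is a minimal sketch, not the pipeline used for the talk; the function name `count_stem_words` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter


def count_stem_words(text: str, stem: str = "terror") -> Counter:
    """Count lower-cased word forms that contain `stem`."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if stem in t)


# Invented sample text, only to exercise the function.
sample = "Terrorism och terrorhot. Terrorister begår terrordåd; terrorism är ett hot."
counts = count_stem_words(sample)

for word, n in counts.most_common(10):
    print(word, n)
```

Running the same function over the official transcripts and over the ASR output gives two counters that can be compared word by word.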
Mundane issues
Files cut off: [Prövning] av förslag till
Normalisation: enligt 11 kap. 3 §
ASR error: IX debatt
ASR error: tack för talman
Normalisation: solved (mostly)
Normalisation
Bakhturina, E., Zhang, Y., Ginsburg, B. (2022). Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization. Proc. Interspeech 2022, 491-495. doi: 10.21437/Interspeech.2022-11074
https://github.com/NVIDIA/NeMo-text-processing
Can we trust the transcripts?
ASR model:
Malmsten, M., Haffenden, C., & Börjeson, L. (2022). Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language. http://arxiv.org/abs/2205.03026
https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish
Some things were never actually said
2442203180006309721 1 230.46 0.06 Jag 1.0 Jag cor
2442203180006309721 1 230.6 0.08 har 1.0 har cor
2442203180006309721 1 230.76 0.2 flera 1.0 flera cor
2442203180006309721 1 231.04 0.3 kollegor 1.0 kollegor ins
2442203180006309721 1 231.4 0.159 här 1.0 <eps> sub
2442203180006309721 1 231.76 0.02 i 1.0 i cor
2442203180006309721 1 232.08 0.399 kammaren 1.0 kammaren cor
2442203180006309721 1 232.479 0.0 <eps> 1.0 som del
2442203180006309721 1 232.479 0.0 <eps> 1.0 inte del
2442203180006309721 1 232.479 0.0 <eps> 1.0 kommer del
2442203180006309721 1 232.479 0.0 <eps> 1.0 från del
2442203180006309721 1 232.479 0.0 <eps> 1.0 Stockholm. del
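The rows above appear to follow a ctm-edits style layout with eight whitespace-separated fields. A minimal parsing sketch follows; the field names (`file_id`, `channel`, `start`, `duration`, `hyp`, `conf`, `ref`, `edit`) are assumptions read off the examples, with `hyp` taken to be the ASR output and `ref` the official transcript.

```python
from typing import NamedTuple


class AlignedWord(NamedTuple):
    file_id: str
    channel: str
    start: float      # seconds
    duration: float   # seconds
    hyp: str          # ASR word, or <eps> if nothing was recognised
    conf: float
    ref: str          # transcript word, or <eps> if absent from the transcript
    edit: str         # cor / ins / del / sub


def parse_row(line: str) -> AlignedWord:
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}: {line!r}")
    return AlignedWord(fields[0], fields[1], float(fields[2]), float(fields[3]),
                       fields[4], float(fields[5]), fields[6], fields[7])


# Rows copied from the slide.
ROWS = """\
2442203180006309721 1 232.08 0.399 kammaren 1.0 kammaren cor
2442203180006309721 1 232.479 0.0 <eps> 1.0 som del
2442203180006309721 1 232.479 0.0 <eps> 1.0 inte del
2442203180006309721 1 232.479 0.0 <eps> 1.0 kommer del
2442203180006309721 1 232.479 0.0 <eps> 1.0 från del
2442203180006309721 1 232.479 0.0 <eps> 1.0 Stockholm. del
"""

words = [parse_row(line) for line in ROWS.splitlines() if line.strip()]

# Words in the transcript that the ASR never heard.
unspoken = [w.ref for w in words if w.edit == "del"]
print(" ".join(unspoken))
```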
Phrases move
2442203180006309721 1 1596.22 0.099 ett 1.0 ett cor
2442203180006309721 1 1596.38 0.199 lopp 1.0 lopp cor
2442203180006309721 1 1596.579 0.0 <eps> 1.0 på del
2442203180006309721 1 1596.579 0.0 <eps> 1.0 60 del
2442203180006309721 1 1596.579 0.0 <eps> 1.0 mil del
2442203180006309721 1 1596.72 0.16 över 1.0 över cor
2442203180006309721 1 1597.0 0.119 tre 1.0 tre cor
2442203180006309721 1 1597.32 0.279 dagar 1.0 <eps> ins
2442203180006309721 1 1597.7 0.059 på 1.0 <eps> ins
2442203180006309721 1 1597.9 0.379 sextio 1.0 <eps> ins
2442203180006309721 1 1598.36 0.32 mil 1.0 dagar. sub
Things are added in the moment
2442203180006309721 1 2324.96 0.039 vi 1.0 vi cor
2442203180006309721 1 2325.1 0.32 måste 1.0 <eps> ins
2442203180006309721 1 2325.52 0.159 höra 1.0 <eps> ins
2442203180006309721 1 2325.76 0.319 talas 1.0 <eps> ins
2442203180006309721 1 2326.12 0.039 om 1.0 <eps> ins
2442203180006309721 1 2326.22 0.08 den 1.0 <eps> ins
2442203180006309721 1 2326.34 0.099 här 1.0 <eps> ins
2442203180006309721 1 2326.48 0.5 historien 1.0 <eps> ins
2442203180006309721 1 2327.14 0.34 gång 1.0 gång cor
2442203180006309721 1 2327.58 0.039 på 1.0 på cor
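Passages added in the moment show up as runs of consecutive `ins` rows. One way to pull such runs out of the alignment is sketched below; the tuple layout `(start, duration, hyp, ref, edit)` and the `min_len` threshold are illustrative assumptions, not the talk's actual tooling.

```python
def insertion_runs(rows, min_len=3):
    """Return (start, end, phrase) for each run of at least `min_len` insertions."""
    runs, current = [], []
    for row in list(rows) + [None]:  # the None sentinel flushes the final run
        if row is not None and row[4] == "ins":
            current.append(row)
            continue
        if len(current) >= min_len:
            start = current[0][0]
            end = current[-1][0] + current[-1][1]
            runs.append((start, end, " ".join(r[2] for r in current)))
        current = []
    return runs


# (start, duration, hyp, ref, edit), copied from the slide.
rows = [
    (2324.96, 0.039, "vi", "vi", "cor"),
    (2325.1, 0.32, "måste", "<eps>", "ins"),
    (2325.52, 0.159, "höra", "<eps>", "ins"),
    (2325.76, 0.319, "talas", "<eps>", "ins"),
    (2326.12, 0.039, "om", "<eps>", "ins"),
    (2326.22, 0.08, "den", "<eps>", "ins"),
    (2326.34, 0.099, "här", "<eps>", "ins"),
    (2326.48, 0.5, "historien", "<eps>", "ins"),
    (2327.14, 0.34, "gång", "gång", "cor"),
    (2327.58, 0.039, "på", "på", "cor"),
]

runs = insertion_runs(rows)
for start, end, phrase in runs:
    print(f"{start:.2f}-{end:.2f}: {phrase}")
```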
What can we find?
False starts:
2442207180019978121 1 1619.5 1.579 oppooppositionens 1.0 oppositionens sub
2442207150019764521 1 213.3 0.839 bberor 1.0 beror sub
2442207160019915621 1 492.7 2.0 globalisglobaliseringen 1.0 globaliseringen sub
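In the three false starts listed above, the ASR output is the transcript word preceded by a repeated piece of its own beginning. A simple heuristic for finding such substitutions is sketched here; `false_start_prefix` is a hypothetical helper and will miss false starts that do not follow this exact pattern.

```python
from typing import Optional


def false_start_prefix(hyp: str, ref: str) -> Optional[str]:
    """Return the restarted fragment if `hyp` is `ref` preceded by a prefix of `ref`."""
    hyp, ref = hyp.lower(), ref.lower()
    if len(hyp) > len(ref) and hyp.endswith(ref):
        fragment = hyp[: len(hyp) - len(ref)]
        if ref.startswith(fragment):
            return fragment
    return None


# (hyp, ref) pairs taken from the slides.
pairs = [
    ("oppooppositionens", "oppositionens"),
    ("bberor", "beror"),
    ("globalisglobaliseringen", "globaliseringen"),
    ("resvasion", "reservation"),  # alternative pronunciation, not a false start
]

for hyp, ref in pairs:
    print(hyp, ref, false_start_prefix(hyp, ref))
```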
Alternative pronunciations:
2442207160019915621 1 398.3 0.459 resvasion 1.0 reservation sub
2442207160019915621 1 432.52 0.48 resovationen 1.0 reservationen sub
Filled pauses:
2442207150019764521 1 622.44 0.159 ifrån 1.0 från sub
2442207180020109821 1 326.1 0.099 nhär 1.0 här ins
Wait, didn’t OpenAI Whisper solve ASR?
Untrustworthy for our purposes
Radford, A., Kim, J.W., Xu, T., Brockman, G., McLeavey, C. & Sutskever, I. (2023). Robust Speech Recognition via Large-Scale Weak Supervision. Proceedings of Machine Learning Research 202:28492-28518. Available from https://proceedings.mlr.press/v202/radford23a.html
(We can safely assume that this includes the Riksdag’s data)
Disappearing “tack”
2442207060018256921 1 13.96 0.199 tack 1.0 <eps> ins
2442207060018256921 1 14.22 0.059 Herr 1.0 Herr cor
2442207060018256921 1 14.36 0.38 talman! 1.0 talman! cor
(The official transcripts only start with “Talman!”, “Herr talman!” or “Fru talman!”)
Curious insertions
00:00.000 --> 00:30.000
Tack till mina supporters via www.patreon.com
00:30.000 --> 00:34.000
Tack till mina supporters via www.patreon.com
01:00.000 --> 01:04.000
Tack till mina supporters via www.patreon.com
01:30.000 --> 01:34.000
Tack till mina supporters via www.patreon.com
02:00.000 --> 02:04.000
Tack till mina supporters via www.patreon.com
02:30.000 --> 02:34.000
Tack till mina supporters via www.patreon.com
03:00.000 --> 03:04.000
Tack till mina supporters via www.patreon.com
03:30.000 --> 03:34.000
Tack till mina supporters via www.patreon.com
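Insertions like the one above repeat the same text across many cues, which makes them easy to flag automatically. The sketch below assumes subtitle-style output where each timing line is followed by one line of text; `repeated_cues` and the `min_repeats` threshold are illustrative choices, and the final cue in the sample is invented.

```python
import re
from collections import Counter

# Matches timing lines such as "00:30.000 --> 00:34.000" (hours optional).
CUE_TIME = re.compile(r"^(\d{2}:)?\d{2}:\d{2}\.\d{3} --> (\d{2}:)?\d{2}:\d{2}\.\d{3}$")


def repeated_cues(subtitle_text: str, min_repeats: int = 3) -> dict:
    """Return cue texts that occur at least `min_repeats` times."""
    lines = [line.strip() for line in subtitle_text.splitlines()]
    texts = []
    for i, line in enumerate(lines[:-1]):
        # Only the first text line after each timing line is considered.
        if CUE_TIME.match(line) and lines[i + 1]:
            texts.append(lines[i + 1])
    counts = Counter(texts)
    return {text: n for text, n in counts.items() if n >= min_repeats}


sample = """\
00:00.000 --> 00:30.000
Tack till mina supporters via www.patreon.com
00:30.000 --> 00:34.000
Tack till mina supporters via www.patreon.com
01:00.000 --> 01:04.000
Tack till mina supporters via www.patreon.com
01:30.000 --> 01:34.000
Herr talman!
"""

suspicious = repeated_cues(sample)
print(suspicious)
```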
Alternative? Phonemic recognition
Vaxholm
Waxholm
A dialogue system that gave information on shipping in the Stockholm archipelago, incorporating text-to-speech, ASR, face synthesis, and dialogue management.
However, in the earliest versions ASR was unavailable, so a Wizard of Oz setup was used. The data from these sessions was transcribed at the word and phoneme level, including non-speech events.
Original data
CORRECTED: OK jesper Jesper Hogberg Thu Jun 22 13:46:18 EET 1995
AUTOLABEL: jesper Jesper H|gberg Fri Nov 26 09:25:05 MET 1993
Waxholm dialog. /u/wax/data/scenes/fp2008/fp2008.4.04.smp
WIZARD: joakim_g Joakim Gustafson Wed Nov 24 10:47:29 MET 1993
TEXT:
jag vill }ka fr}n str|mkajen .
PHONEME: J'A:G+ V'IL+ "]:K'A FR']:N+ STR"MhyK'AJEN.
CT 1
Labels: J'A: V'IL "]:KkA F']: STtR"M#Kk`AJE0N .
FR 2500 #J >pm #J >w jag 0.156 sec
FR 3916 $'A: >pm $'A: 0.245 sec
FR 5276 $G >pm $G 0.330 sec
FR 5276 >pm $g+ 0.330 sec
FR 5276 #V >pm #V >w vill 0.330 sec
FR 5919 $'I >pm $'I 0.370 sec
FR 6752 $L >pm $L+ 0.422 sec
FR 7218 #"]: >pm #"]: >w }ka 0.451 sec
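The orthographic text in these files uses the old 7-bit Swedish character convention, where `}`, `{` and `|` stand for å, ä and ö. The sketch below decodes the text and reads the `FR` lines; the regular expression and the helper names are assumptions based only on the excerpt above. The frame numbers look like sample indices at 16 kHz (2500 / 16000 ≈ 0.156 s), which is an inference from the numbers rather than something the file states.

```python
import re

# 7-bit Swedish national character replacements (ISO 646-SE style).
SWEDISH_7BIT = str.maketrans({"}": "å", "{": "ä", "|": "ö",
                              "]": "Å", "[": "Ä", "\\": "Ö"})

FR_LINE = re.compile(r"^FR\s+(\d+)\s+(.*?)\s+([\d.]+)\s+sec$")


def decode_text(text: str) -> str:
    """Decode orthographic text only; phoneme labels use their own symbols."""
    return text.translate(SWEDISH_7BIT)


def parse_fr(line: str):
    """Return (frame, seconds, word or None) for an FR line, else None."""
    m = FR_LINE.match(line.strip())
    if m is None:
        return None
    word = re.search(r">w\s+(\S+)", m.group(2))
    return (int(m.group(1)), float(m.group(3)),
            decode_text(word.group(1)) if word else None)


print(decode_text("jag vill }ka fr}n str|mkajen ."))

for line in ['FR 2500 #J >pm #J >w jag 0.156 sec',
             'FR 5276 >pm $g+ 0.330 sec',
             'FR 7218 #"]: >pm #"]: >w }ka 0.451 sec']:
    frame, seconds, word = parse_fr(line)
    # Assumed 16 kHz sampling: frame / 16000 should be close to the stated time.
    print(frame, seconds, word, round(frame / 16000, 3))
```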
Problems
● Frames inconsistently labelled
● “Empty” (zero duration) frames used to mark unrealised segments
● (At least) two schools of thought regarding (generated) phoneme sequences
● Extensive copy-and-edit approach to annotation files (metadata often wrong)
Results
“terrorstämplade” /tærɔ<pa>stɛempladə/
“Turkiets antiterr- så kallade antiterrorlagstiftning”
/tɵrkiːts<v> antɪt<hes> <pa> soː kalad antɪtærʊrlɑːɡstɪfnɪ/
ASR error: IX debatt: /iː seks debat/
ASR error: tack för talman: /tak fœ̞ː tɑːlman/
It’s not perfect
Transcript: Herr talman! EU-samarbetet gör Sverige starkare och säkrare. Hot som klimatkrisen, pandemier, terrorism och organiserad brottslighet kan inte lösas av ett enskilt land.
KB: är talman eusamarbetet gör sverige starkare och säkrare hotsom klimatkrisen pandemier terrorism och organiserad brottslighet kan inte lösas av ett enskilt land
Phone: hæː tɑː man eːʉːsamabeːtət jœ̞ːr sværjə starkarə oː sɛːkrarə huːtsɔm klɪmɑːtkriːsəm pandemiːər <pa> tærʊrɪsm oː ɔrɡanɪseːrad brɔtslɪheːt kan ɪntə løːsas ɑːv et eːnʂɪlt land
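The gap between the official transcript and the KB model output above can be put into a number by normalising both sides and counting word-level edits. This is a minimal sketch using a plain Levenshtein distance over words; the normalisation (lower-casing, dropping hyphens and punctuation) is an assumption chosen to match how the KB output is written.

```python
import re


def normalise(text: str) -> list:
    """Lower-case, drop hyphens and punctuation, and split into words."""
    return re.findall(r"\w+", text.lower().replace("-", ""))


def word_errors(ref: list, hyp: list) -> int:
    """Minimum number of word substitutions, insertions and deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1]


transcript = ("Herr talman! EU-samarbetet gör Sverige starkare och säkrare. "
              "Hot som klimatkrisen, pandemier, terrorism och organiserad "
              "brottslighet kan inte lösas av ett enskilt land.")
kb = ("är talman eusamarbetet gör sverige starkare och säkrare hotsom "
      "klimatkrisen pandemier terrorism och organiserad brottslighet kan "
      "inte lösas av ett enskilt land")

ref, hyp = normalise(transcript), normalise(kb)
errors = word_errors(ref, hyp)
print(f"{errors} errors over {len(ref)} reference words")
```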
Pauses and hesitations
Transcript: Jag kan bara konstatera att ungefär 200 personer i veckan nekas inträde i Sverige för att de inte har rätt att komma hit, och det upptäcks tack vare de inre gränskontrollerna. Jag kan också konstatera att Säpo gör bedömningen att terrorhotnivån mot Sverige ligger kvar på en trea, vilket är en ganska hög nivå som motiverar ökad säkerhet och inre gränskontroll.
KB: vi gör kan bara konstatera att ungefär tvåhundra personer i veckan som nekas inträde i sverige tack vare och det upptäcks via de inre gränskontrollerna för att de inte har rätt att komma till sverige kan också konstatera att säpo gör bedömningen att terrorhotsnivån mot sverige ligger kvar på en trea vilket är en ganska hög nivå vilket också motiverar ökad säkerhet och även inre gränskontroll
Phone: vɪiːjœ̞ːr <pa> <hes> kam bɑːa kɔnstateːra at ɵŋefæː ʈvoː hɵndra pæʂuːnər <pa> iː vekan sɔm neːkas ɪntrɛːdə iː sværjə <pa> <hes> tak vɑːrə oː deː ɵptɛeks viːa dɔm ɪnrə ɡreɛnskɔntrɔləɳa <pa> <hes> fœ̞ːra tɔm ɪnt ɑː ret at kɔma tɪ sværjə <pa> <hes> kan ɔksɔ kɔnstateːra at sɛːpuː jœ̞ː bedœmnɪŋn at tærɔrhʊtsnɪvoːn mʊt sværjə lɪɡə kvɑːr poː poː en treːa <pa> vɪkət æːr eŋ ɡanska høːɡ<v> nɪvoː <pa> vɪkət ɔksɔ mʊtɪveːrar <hes> øːkad sɛːkərheːt oː ɛːvən ɪndrəe ɡrɛnskɔntrɔl
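The phone-level output marks events with angle-bracket tags such as `<pa>` and `<hes>`, so pauses and hesitations can be counted, or stripped to leave only the phones. A small sketch follows, using an excerpt of the phone line above; the helper names are illustrative.

```python
import re
from collections import Counter

TAG = re.compile(r"<(\w+)>")


def count_tags(phones: str) -> Counter:
    """Count each angle-bracket tag in a phone string."""
    return Counter(TAG.findall(phones))


def strip_tags(phones: str) -> str:
    """Remove the tags and collapse the remaining whitespace."""
    return " ".join(TAG.sub(" ", phones).split())


excerpt = ("treːa <pa> vɪkət æːr eŋ ɡanska høːɡ<v> nɪvoː <pa> vɪkət ɔksɔ "
           "mʊtɪveːrar <hes> øːkad")

tags = count_tags(excerpt)
plain = strip_tags(excerpt)
print(tags)
print(plain)
```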
Alternate pronunciation
Transcript: På den andra sidan har israeliska ungdomars liv präglats av rädsla och oro för terrorattentat. I båda länderna ökar uppgivenheten och radikaliseringen.
KB: på den andra sidan har israeliska ungdomars liv präglats av rädsla och oro för terrorattentat i båda länderna ökar uppgivenheten och radikaliseringen
Phone: poː den andra siːdan oː ɪsraeːlɪska ɵŋdʊmaʂ liːv prɛːɡlas ɑːv rɛːdsla oː uːrʊ fœ̞ː tærɔr atəntɑːt iː boːda lendæɳa øːkar ɵpjiːvənheːtən oː radɪkalɪseːrɪŋən
Ongoing work
● Forced alignment
○ Older, HMM-style models are better at forced alignment
○ Shorter stride (10 ms vs 20 ms)
○ Dictionary-based
● Acoustically validated pronunciation dictionary
○ Intersection of dictionary-derived pronunciations and phonemic transcription
○ Adding rule-based alternatives: “rs” can be /ʂ/ or /rs/
○ Dialect-specific lexica (Riksdag speakers are mostly well known)
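The dictionary validation idea above can be sketched as: expand each dictionary pronunciation with rule-based alternatives, then keep only the variants that were actually observed in the phonemic transcription. The code below is an illustration under assumptions: the single rule table, the function names, and the dictionary form `ɵŋdʊmars` are hypothetical, while `ɵŋdʊmaʂ` is the form seen in the phone output earlier.

```python
from itertools import product
from typing import Iterable, Set

# One illustrative rule: the sequence "rs" may be realised as /rs/ or /ʂ/.
RS_ALTERNATIVES = ("rs", "ʂ")


def variants(pron: str) -> Set[str]:
    """Expand every 'rs' in `pron` to each of its alternatives."""
    parts = pron.split("rs")
    out = set()
    for choice in product(RS_ALTERNATIVES, repeat=len(parts) - 1):
        candidate = parts[0]
        for alternative, rest in zip(choice, parts[1:]):
            candidate += alternative + rest
        out.add(candidate)
    return out


def validated(dictionary_pron: str, observed: Iterable[str]) -> Set[str]:
    """Variants of the dictionary pronunciation that were actually observed."""
    return variants(dictionary_pron) & set(observed)


observed_phones = "ɵŋdʊmaʂ liːv prɛːɡlas ɑːv rɛːdsla".split()
confirmed = validated("ɵŋdʊmars", observed_phones)
print(variants("ɵŋdʊmars"))
print(confirmed)
```

Keeping the rules in a table makes it straightforward to add further alternations or dialect-specific rule sets later.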
Wiktionary validations (top 10)
Instances  Word  Pronunciation  Narrow/broad
1161018    att   at             broad
746256     i     iː             broad
582874     det   deː            broad
537306     som   sɔm            broad
512887     på    poː            broad
507377     vi    viː            broad
373091     så    soː            broad
305332     av    ɑːv            broad
291260     om    ɔm             broad
211505     man   man            broad
Questions?

Language Variation in Parliamentary Speeches: First Steps Towards Robust Phoneme Recognition

  • 1. Language Variation in Parliamentary Speeches First Steps Towards Robust Phoneme Recognition
  • 2. Why? 2 ● General speech science goals: how do people speak? ● Path towards better training materials, iterative training processes ● SweTerror: multidisciplinary investigation into parliamentary discourse around the topic of terrorism
  • 3. Fun personal fact: 3 ● I come from Ireland, home of what used to be the world’s best known terrorist organisation.
  • 4. KonradWyszyskiPlanespotting, CC BY 2.0 <https://creativecommons.org/licenses/by/2.0>, via 4
  • 5. How much data? 2019-04-09 September 2012 to January 2022: 5925 hours of raw video
  • 6. Where does “terror” occur? 2019-04-09 Transcript files 826 Video files (ASR) 1018 Words (transcripts) 6741 Words (ASR) 7227
  • 7. Top 10: transcripts 2019-04-09 terrorism 1605 terrorister 573 terrorismen 294 terror 273 terrordåd 211 terroristbrott 156 terrorhot 138 terrorattentat 134 terrorbrott 131 terrororganisationer 127
  • 8. Top 10: ASR 2019-04-09 terrorism 1608 terrorister 524 terrorismen 353 terror 291 terrordåd 198 terroristbrott 166 terrorbrott 144 terrorhot 135 terrorattentat 133 terrororganisationer 120
  • 10. Files cut off: [Prövning] av förslag till 2019-04-09
  • 11. Normalisation: enligt 11 kap. 3 § 2019-04-09
  • 12. ASR error: IX debatt 2019-04-09
  • 13. ASR error: tack för talman 2019-04-09
  • 15. Normalisation 2019-04-09 Bakhturina, E., Zhang, Y., Ginsburg, B. (2022) Shallow Fusion of Weighted Finite-State Transducer and Language Model for Text Normalization. Proc. Interspeech 2022, 491-495, doi: 10.21437/Interspeech.2022-11074 https://github.com/NVIDIA/NeMo-text-processing
  • 17. Can we trust the transcripts? 2019-04-09
  • 18. ASR model: 2019-04-09 Malmsten, M., Haffenden, C., & Börjeson, L. (2022). Hearing voices at the National Library -- a speech corpus and acoustic model for the Swedish language. http://arxiv.org/abs/2205.03026 https://huggingface.co/KBLab/wav2vec2-large-voxrex-swedish
  • 19. Some things were never actually said 2019-04-09 2442203180006309721 1 230.46 0.06 Jag 1.0 Jag cor 2442203180006309721 1 230.6 0.08 har 1.0 har cor 2442203180006309721 1 230.76 0.2 flera 1.0 flera cor 2442203180006309721 1 231.04 0.3 kollegor 1.0 kollegor ins 2442203180006309721 1 231.4 0.159 här 1.0 <eps> sub 2442203180006309721 1 231.76 0.02 i 1.0 i cor 2442203180006309721 1 232.08 0.399 kammaren 1.0 kammaren cor 2442203180006309721 1 232.479 0.0 <eps> 1.0 som del 2442203180006309721 1 232.479 0.0 <eps> 1.0 inte del 2442203180006309721 1 232.479 0.0 <eps> 1.0 kommer del 2442203180006309721 1 232.479 0.0 <eps> 1.0 från del 2442203180006309721 1 232.479 0.0 <eps> 1.0 Stockholm. del
  • 20. Phrases move 2019-04-09 2442203180006309721 1 1596.22 0.099 ett 1.0 ett cor 2442203180006309721 1 1596.38 0.199 lopp 1.0 lopp cor 2442203180006309721 1 1596.579 0.0 <eps> 1.0 på del 2442203180006309721 1 1596.579 0.0 <eps> 1.0 60 del 2442203180006309721 1 1596.579 0.0 <eps> 1.0 mil del 2442203180006309721 1 1596.72 0.16 över 1.0 över cor 2442203180006309721 1 1597.0 0.119 tre 1.0 tre cor 2442203180006309721 1 1597.32 0.279 dagar 1.0 <eps> ins 2442203180006309721 1 1597.7 0.059 på 1.0 <eps> ins 2442203180006309721 1 1597.9 0.379 sextio 1.0 <eps> ins 2442203180006309721 1 1598.36 0.32 mil 1.0 dagar. sub
  • 21. Things are added in the moment 2019-04-09 2442203180006309721 1 2324.96 0.039 vi 1.0 vi cor 2442203180006309721 1 2325.1 0.32 måste 1.0 <eps> ins 2442203180006309721 1 2325.52 0.159 höra 1.0 <eps> ins 2442203180006309721 1 2325.76 0.319 talas 1.0 <eps> ins 2442203180006309721 1 2326.12 0.039 om 1.0 <eps> ins 2442203180006309721 1 2326.22 0.08 den 1.0 <eps> ins 2442203180006309721 1 2326.34 0.099 här 1.0 <eps> ins 2442203180006309721 1 2326.48 0.5 historien 1.0 <eps> ins 2442203180006309721 1 2327.14 0.34 gång 1.0 gång cor 2442203180006309721 1 2327.58 0.039 på 1.0 på cor
  • 22. What can we find?
  • 23. False starts:
      2442207180019978121 1 1619.5 1.579 oppooppositionens 1.0 oppositionens sub
      2442207150019764521 1 213.3 0.839 bberor 1.0 beror sub
      2442207160019915621 1 492.7 2.0 globalisglobaliseringen 1.0 globaliseringen sub
  • 24. Alternative pronunciations:
      2442207160019915621 1 398.3 0.459 resvasion 1.0 reservation sub
      2442207160019915621 1 432.52 0.48 resovationen 1.0 reservationen sub
  • 25. Filled pauses:
      2442207150019764521 1 622.44 0.159 ifrån 1.0 från sub
      2442207180020109821 1 326.1 0.099 nhär 1.0 här ins
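In all three false-start examples, the ASR word is the transcript word preceded by a repeated fragment of its own beginning ("oppo" + "oppositionens"). A minimal sketch of a heuristic that flags this pattern follows. The function `false_start_prefix` is hypothetical and is not the method used in the talk.

```python
from typing import Optional


def false_start_prefix(hyp: str, ref: str) -> Optional[str]:
    """Return the restarted fragment if hyp looks like '<fragment><ref>'.

    The fragment must itself be a prefix of the reference word, which is
    what distinguishes a false start from an unrelated substitution.
    """
    hyp, ref = hyp.lower(), ref.lower()
    if len(hyp) <= len(ref) or not hyp.endswith(ref):
        return None
    fragment = hyp[: len(hyp) - len(ref)]
    return fragment if ref.startswith(fragment) else None


# (ASR word, transcript word) pairs; the last one is from the
# "Alternative pronunciations" slide and should not be flagged.
pairs = [
    ("oppooppositionens", "oppositionens"),
    ("bberor", "beror"),
    ("globalisglobaliseringen", "globaliseringen"),
    ("resvasion", "reservation"),
]
fragments = [false_start_prefix(hyp, ref) for hyp, ref in pairs]
```

A heuristic like this only catches restarts that the acoustic model happened to fuse into one token. A restart recognised as a separate word would show up as an insertion instead.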
  • 26. Wait? Didn’t OpenAI Whisper solve ASR?
  • 27. Untrustworthy for our purposes: Radford, A., Kim, J.W., Xu, T., Brockman, G., McLeavey, C. & Sutskever, I. (2023). Robust Speech Recognition via Large-Scale Weak Supervision. Proceedings of Machine Learning Research 202:28492-28518. Available from https://proceedings.mlr.press/v202/radford23a.html
  • 28. Untrustworthy for our purposes (We can safely assume that this includes the Riksdag’s data)
  • 29. Disappearing “tack”
      2442207060018256921 1 13.96 0.199 tack 1.0 <eps> ins
      2442207060018256921 1 14.22 0.059 Herr 1.0 Herr cor
      2442207060018256921 1 14.36 0.38 talman! 1.0 talman! cor
      (The official transcripts only start with “Talman!”, “Herr talman!” or “Fru talman!”)
  • 30. Curious insertions
      00:00.000 --> 00:30.000 Tack till mina supporters via www.patreon.com
      00:30.000 --> 00:34.000 Tack till mina supporters via www.patreon.com
      01:00.000 --> 01:04.000 Tack till mina supporters via www.patreon.com
      01:30.000 --> 01:34.000 Tack till mina supporters via www.patreon.com
      02:00.000 --> 02:04.000 Tack till mina supporters via www.patreon.com
      02:30.000 --> 02:34.000 Tack till mina supporters via www.patreon.com
      03:00.000 --> 03:04.000 Tack till mina supporters via www.patreon.com
      03:30.000 --> 03:34.000 Tack till mina supporters via www.patreon.com
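Insertions like the repeated Patreon line can be flagged mechanically, since the same text recurs across many cues. A minimal sketch follows, assuming each cue is a single line of the form `start --> end text`. The `repeated_cues` helper and its `min_repeats` threshold are hypothetical.

```python
import re
from collections import Counter
from typing import Dict, Iterable

# One cue per line: "<start> --> <end> <text>" (assumed layout).
CUE = re.compile(r"^(\S+) --> (\S+) (.+)$")


def repeated_cues(lines: Iterable[str], min_repeats: int = 3) -> Dict[str, int]:
    """Count identical cue texts and keep those seen at least min_repeats times."""
    counts: Counter = Counter()
    for line in lines:
        match = CUE.match(line.strip())
        if match:
            counts[match.group(3).strip()] += 1
    return {text: n for text, n in counts.items() if n >= min_repeats}


# The first three cues from slide 30, plus one made-up ordinary cue.
cues = [
    "00:00.000 --> 00:30.000 Tack till mina supporters via www.patreon.com",
    "00:30.000 --> 00:34.000 Tack till mina supporters via www.patreon.com",
    "01:00.000 --> 01:04.000 Tack till mina supporters via www.patreon.com",
    "01:04.000 --> 01:06.000 Herr talman!",  # hypothetical cue for contrast
]
suspicious = repeated_cues(cues)
```

Parliamentary speech has legitimately frequent phrases ("Herr talman!"), so any real threshold would need tuning against the data.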
  • 36. Waxholm: A dialogue system that gave information on shipping in the Stockholm archipelago, incorporating text-to-speech, ASR, face synthesis, and dialogue management. However, in the earliest versions ASR was unavailable, so a Wizard of Oz setup was used. The data from these sessions was transcribed at the word and phoneme level, including non-speech events.
  • 39. Original data
      CORRECTED: OK jesper Jesper Hogberg Thu Jun 22 13:46:18 EET 1995
      AUTOLABEL: jesper Jesper H|gberg Fri Nov 26 09:25:05 MET 1993
      Waxholm dialog. /u/wax/data/scenes/fp2008/fp2008.4.04.smp
      WIZARD: joakim_g Joakim Gustafson Wed Nov 24 10:47:29 MET 1993
      TEXT: jag vill }ka fr}n str|mkajen .
      PHONEME: J'A:G+ V'IL+ "]:K'A FR']:N+ STR"MhyK'AJEN.
      CT 1
      Labels: J'A: V'IL "]:KkA F']: STtR"M#Kk`AJE0N .
      FR 2500 #J >pm #J >w jag 0.156 sec
      FR 3916 $'A: >pm $'A: 0.245 sec
      FR 5276 $G >pm $G 0.330 sec
      FR 5276 >pm $g+ 0.330 sec
      FR 5276 #V >pm #V >w vill 0.330 sec
      FR 5919 $'I >pm $'I 0.370 sec
      FR 6752 $L >pm $L+ 0.422 sec
      FR 7218 #"]: >pm #"]: >w }ka 0.451 sec
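The TEXT field writes Swedish letters with 7-bit substitutes: "}ka fr}n str|mkajen" reads as "åka från strömkajen". A minimal sketch of decoding it follows. Only `}` and `|` are attested on the slide; the remaining mappings follow the usual Swedish 7-bit convention and are an assumption here, and `decode_text` is a hypothetical helper.

```python
# Swedish 7-bit substitutions. Only "}" -> å and "|" -> ö appear on the slide;
# the other four entries are assumed from the common Swedish 7-bit convention.
SWEDISH_7BIT = str.maketrans({
    "}": "å", "{": "ä", "|": "ö",
    "]": "Å", "[": "Ä", "\\": "Ö",
})


def decode_text(text: str) -> str:
    """Decode orthographic fields (TEXT, names) only.

    Do not apply this to PHONEME or Labels lines: there the same
    characters are phoneme symbols, not letters.
    """
    return text.translate(SWEDISH_7BIT)


decoded = decode_text("jag vill }ka fr}n str|mkajen .")
annotator = decode_text("Jesper H|gberg")
```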
  • 40. Problems
      ● Frames inconsistently labeled
      ● “Empty” (zero duration) frames used to mark unrealised segments
      ● (At least) two schools of thought regarding (generated) phoneme sequences
      ● Extensive copy-and-edit approach to annotation files (metadata often wrong)
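The zero-duration frames can be seen in the sample on the previous slide, where three consecutive entries share frame 5276. A minimal sketch that finds them follows, assuming each `FR` value is the start sample of its label and that the audio is sampled at 16 kHz (2500 samples at 0.156 s is consistent with that, but it is an inference). The `(frame, label)` pairs are a simplification of the original lines, and both helper names are hypothetical.

```python
from typing import List, Tuple

SAMPLE_RATE = 16_000  # assumption: inferred from "FR 2500 ... 0.156 sec"

# (start sample, label) pairs, simplified from the FR lines on slide 39.
frames: List[Tuple[int, str]] = [
    (2500, "#J"),
    (3916, "$'A:"),
    (5276, "$G"),
    (5276, "$g+"),
    (5276, "#V"),
    (5919, "$'I"),
    (6752, "$L"),
    (7218, '#"]:'),
]


def durations_sec(frames: List[Tuple[int, str]]) -> List[float]:
    """Duration of each segment except the last, from consecutive start samples."""
    return [(nxt - start) / SAMPLE_RATE
            for (start, _), (nxt, _) in zip(frames, frames[1:])]


def zero_duration_labels(frames: List[Tuple[int, str]]) -> List[str]:
    """Labels whose start coincides with the next segment's start."""
    return [label
            for (start, label), (nxt, _) in zip(frames, frames[1:])
            if nxt == start]


unrealised = zero_duration_labels(frames)
durs = durations_sec(frames)
```

Under these assumptions the flagged labels are the final /g/ of "jag", which is present in the PHONEME line but absent from the Labels line.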
  • 43. “Turkiets antiterr- så kallade antiterrorlagstiftning” /tɵrkiːts<v> antɪt<hes> <pa> soː kalad antɪtærʊrlɑːɡstɪfnɪ/
  • 44. ASR error: IX debatt: /iː seks debat/
  • 45. ASR error: tack för talman: /tak fœ̞ː tɑːlman/
  • 46. It’s not perfect
      Transcript: Herr talman! EU-samarbetet gör Sverige starkare och säkrare. Hot som klimatkrisen, pandemier, terrorism och organiserad brottslighet kan inte lösas av ett enskilt land.
      KB: är talman eusamarbetet gör sverige starkare och säkrare hotsom klimatkrisen pandemier terrorism och organiserad brottslighet kan inte lösas av ett enskilt land
      Phone: hæː tɑː man eːʉːsamabeːtət jœ̞ːr sværjə starkarə oː sɛːkrarə huːtsɔm klɪmɑːtkriːsəm pandemiːər <pa> tærʊrɪsm oː ɔrɡanɪseːrad brɔtslɪheːt kan ɪntə løːsas ɑːv et eːnʂɪlt land
  • 47. Pauses and hesitations
      Transcript: Jag kan bara konstatera att ungefär 200 personer i veckan nekas inträde i Sverige för att de inte har rätt att komma hit, och det upptäcks tack vare de inre gränskontrollerna. Jag kan också konstatera att Säpo gör bedömningen att terrorhotnivån mot Sverige ligger kvar på en trea, vilket är en ganska hög nivå som motiverar ökad säkerhet och inre gränskontroll.
      KB: vi gör kan bara konstatera att ungefär tvåhundra personer i veckan som nekas inträde i sverige tack vare och det upptäcks via de inre gränskontrollerna för att de inte har rätt att komma till sverige kan också konstatera att säpo gör bedömningen att terrorhotsnivån mot sverige ligger kvar på en trea vilket är en ganska hög nivå vilket också motiverar ökad säkerhet och även inre gränskontroll
      Phone: vɪiːjœ̞ːr <pa> <hes> kam bɑːa kɔnstateːra at ɵŋefæː ʈvoː hɵndra pæʂuːnər <pa> iː vekan sɔm neːkas ɪntrɛːdə iː sværjə <pa> <hes> tak vɑːrə oː deː ɵptɛeks viːa dɔm ɪnrə ɡreɛnskɔntrɔləɳa <pa> <hes> fœ̞ːra tɔm ɪnt ɑː ret at kɔma tɪ sværjə <pa> <hes> kan ɔksɔ kɔnstateːra at sɛːpuː jœ̞ː bedœmnɪŋn at tærɔrhʊtsnɪvoːn mʊt sværjə lɪɡə kvɑːr poː poː en treːa <pa> vɪkət æːr eŋ ɡanska høːɡ<v> nɪvoː <pa> vɪkət ɔksɔ mʊtɪveːrar <hes> øːkad sɛːkərheːt oː ɛːvən ɪndrəe ɡrɛnskɔntrɔl
  • 48. Alternate pronunciation
      Transcript: På den andra sidan har israeliska ungdomars liv präglats av rädsla och oro för terrorattentat. I båda länderna ökar uppgivenheten och radikaliseringen.
      KB: på den andra sidan har israeliska ungdomars liv präglats av rädsla och oro för terrorattentat i båda länderna ökar uppgivenheten och radikaliseringen
      Phone: poː den andra siːdan oː ɪsraeːlɪska ɵŋdʊmaʂ liːv prɛːɡlas ɑːv rɛːdsla oː uːrʊ fœ̞ː tærɔr atəntɑːt iː boːda lendæɳa øːkar ɵpjiːvənheːtən oː radɪkalɪseːrɪŋən
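To see where the KB output departs from the official transcript, the two can be compared word by word once casing and punctuation are stripped. A minimal sketch using Python's standard `difflib` follows. The normalisation here is deliberately crude and is not the NeMo-based normalisation mentioned earlier in the talk; `normalise` and `word_diffs` are hypothetical helpers.

```python
import re
from difflib import SequenceMatcher
from typing import List, Tuple


def normalise(text: str) -> List[str]:
    """Lowercase, drop punctuation (including hyphens), split on whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def word_diffs(transcript: str, asr: str) -> List[Tuple[List[str], List[str]]]:
    """Return (transcript words, ASR words) for every non-matching span."""
    ref, hyp = normalise(transcript), normalise(asr)
    matcher = SequenceMatcher(None, ref, hyp, autojunk=False)
    return [(ref[i1:i2], hyp[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


# Opening of slide 46, transcript vs. KB output.
transcript = "Herr talman! EU-samarbetet gör Sverige starkare och säkrare. Hot som klimatkrisen"
kb_output = "är talman eusamarbetet gör sverige starkare och säkrare hotsom klimatkrisen"
diffs = word_diffs(transcript, kb_output)
```

On this excerpt the differences are "herr" heard as "är" and "hot som" fused into "hotsom", both of which the phone-level output on the same slide (/hæː/, /huːtsɔm/) helps explain.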
  • 49. Ongoing work
      ● Forced alignment
          ○ Older, HMM-style models are better at forced alignment
          ○ Shorter stride (10ms vs 20ms)
          ○ Dictionary-based
      ● Acoustically-validated pronunciation dictionary
          ○ Intersection of dictionary-derived pronunciations and phonemic transcription
          ○ Adding rule-based alternatives: “rs” can be /ʂ/ or /rs/
          ○ Dialect-specific lexica (Riksdag speakers are mostly well known)
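The dictionary validation idea can be sketched as: expand each dictionary pronunciation with the rule-based alternatives, then keep only the variants that the phoneme recogniser actually produced. A minimal sketch under stated assumptions follows. The dictionary form /pærsuːnər/ is a made-up entry for illustration, the observed form /pæʂuːnər/ is taken from slide 47, and `rs_variants` and `validated` are hypothetical helpers.

```python
from itertools import product
from typing import Iterable, Set


def rs_variants(pron: str) -> Set[str]:
    """All realisations of a pronunciation where each 'rs' is /rs/ or /ʂ/."""
    parts = pron.split("rs")
    variants = set()
    # One independent choice per "rs" occurrence.
    for choice in product(("rs", "ʂ"), repeat=len(parts) - 1):
        out = parts[0]
        for sep, part in zip(choice, parts[1:]):
            out += sep + part
        variants.add(out)
    return variants


def validated(dictionary_prons: Iterable[str], observed: Iterable[str]) -> Set[str]:
    """Intersect rule-expanded dictionary pronunciations with observed ones."""
    candidates: Set[str] = set()
    for pron in dictionary_prons:
        candidates |= rs_variants(pron)
    return candidates & set(observed)


# Hypothetical dictionary entry for "personer"; observed form from slide 47.
variants = rs_variants("pærsuːnər")
confirmed = validated(["pærsuːnər"], ["pæʂuːnər"])
```

Counting how often each confirmed variant occurs per speaker would be one way towards the dialect-specific lexica mentioned above.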
  • 50. Wiktionary validations (top 10)
      Instances  Word  Pronunciation  Narrow/broad
      1161018    att   at             broad
      746256     i     iː             broad
      582874     det   deː            broad
      537306     som   sɔm            broad
      512887     på    poː            broad
      507377     vi    viː            broad
      373091     så    soː            broad
      305332     av    ɑːv            broad
      291260     om    ɔm             broad
      211505     man   man            broad