Workshop @ NWAV46
2nd November 2017
Forensics and Sociolinguistics
Vincent Hughes, Jessica Wormald, Erica Gold

Hughes, V., Wormald, J. and Gold, E. (2017) Forensics and sociolinguistics. Paper presented at the 'Sociolinguistics and Forensic Speech Science' Workshop at NWAV46, University of Wisconsin at Madison, WI. 2 November 2017.

  1. Workshop @ NWAV46
     2nd November 2017
     Forensics and Sociolinguistics
     Vincent Hughes, Jessica Wormald, Erica Gold
  2. Overview
     • Methods of analysis in forensic voice comparison (FVC)
     • Typicality
     • Paradigm shift across forensic science
     • What’s stopping us?
     • Proposed solutions
  3. 1. Forensic voice comparison
     • Similarity: How similar are the offender and suspect voices to each other (wrt the features analysed)?
     • Typicality: How unusual are those features relative to the wider population?
  4. 1. Forensic voice comparison
     • Role of the expert = establish whether the evidence supports prosecution or defence and the strength/weight of support
     (Diagram labels: Prosecution, Defence)
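The similarity and typicality questions above are often combined into a single ratio expressing strength of support. The slides do not give a formula, so the following is only a minimal sketch under stated assumptions: a single continuous feature, normal distributions for both the suspect and the population, and invented numbers. The function names and all values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(offender_value, suspect_mean, suspect_sd, pop_mean, pop_sd):
    """Ratio of similarity to typicality for one continuous feature.

    Numerator: how well the offender value fits the suspect's distribution.
    Denominator: how well it fits the wider population's distribution.
    Values above 1 support the same-speaker (prosecution) proposition,
    values below 1 support the different-speaker (defence) proposition.
    """
    similarity = gaussian_pdf(offender_value, suspect_mean, suspect_sd)
    typicality = gaussian_pdf(offender_value, pop_mean, pop_sd)
    return similarity / typicality


# Hypothetical mean F0 values in Hz, for illustration only.
lr = likelihood_ratio(
    offender_value=95.0,
    suspect_mean=97.0,
    suspect_sd=5.0,
    pop_mean=120.0,
    pop_sd=15.0,
)
print(f"LR = {lr:.2f}")
```

In this sketch the offender value is close to the suspect and unusual in the population, so the ratio comes out above 1. The same closeness on a feature that is common in the population would give a ratio nearer 1, which is why typicality matters as much as similarity.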
  5. 2. Methods of analysis
     Linguistic-phonetic approach
     • Range of variables analysed at various linguistic levels (segmental, suprasegmental, grammatical, lexical…)
     • Combination of auditory and acoustic analysis
     • Application of traditional linguistic methods
  6. 2. Methods of analysis
     Automatic approach
     • Developed within speech technology
     • Features of short-term power spectrum (cepstral coefficients, e.g. MFCCs)
     • Holistic – non-segmental
  7. 2. Methods of analysis
     • 2011: >20% using ASR (automatic speaker recognition)
     • 2017: 41% using ASR
       – Those using ASR always involve some human analysis
     • Only about half are phoneticians/speech scientists
     • US is predominantly using ASR now
     • UK is using linguistic-phonetic approach
  8. 3. Assessing typicality
     • How ‘unusual’ is feature x?
       – Fronting of /θ/ and /ð/
         • Well researched
         • Available reference literature
         • Wide coverage – regionally, socially, phonologically
       – Multiple bursts on /t/ and /d/ releases
         • Not well researched
         • Accent feature?
         • Individual feature?
       – Currently – determined by experience
  9. 3. Assessing typicality
     • Population statistics are used to assess typicality in comparisons
     • ~70% of experts consider population statistics (Gold & French 2011; in prep)
       – These are derived from a source or personally collected
     • Most experts have commented that they would use them more frequently if they were readily available
     • Those not using population statistics for a given parameter rely on experience and ‘mentally’ assess typicality
     • Ross (2015) looked at consistency between experts’ assessments of typicality: large variation
     • Ross (in progress) is working on eliciting expert opinions for typicality judgments
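One simple kind of population statistic is the proportion of speakers in a reference sample who show a given feature. The sketch below is an illustration only, not a method taken from the slides: it assumes a yes/no feature, uses the Wilson score interval as one common way to express uncertainty in a proportion, and the counts are invented.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z = 1.96)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical counts: speakers in a reference sample who show the feature.
for users, n in [(3, 20), (30, 200)]:
    low, high = wilson_interval(users, n)
    print(f"{users}/{n}: proportion {users / n:.2f}, interval {low:.2f} to {high:.2f}")
```

Both hypothetical samples give the same observed proportion, but the smaller one leaves a much wider interval, so the size of the reference sample directly limits how firmly typicality can be stated.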
  10. 4. Paradigm shift
      • Saks and Koehler (2005): shift across all forensic sciences towards…
        – More scientific methods
        – Replicability
        – Objectivity
        – Data-driven approaches
  11. 4. Paradigm shift
      Two essential elements of the paradigm shift:
      • Robust estimation of typicality
        – Defensible statistics based on population data
        – Use of appropriate conclusion framework
      • Validation (i.e. error rates)
        – Demonstrate to the court that your methods work
        – Large scale testing under forensic conditions
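Validation of the kind described above usually means running a system on many comparisons with known ground truth and summarising the errors. The slides do not name a metric, so this is a hedged sketch under assumptions: the system outputs likelihood ratios, errors are counted at a threshold of 1, and the log-likelihood-ratio cost (Cllr) is added as one widely used summary. All scores are invented.

```python
import math


def error_rates(same_speaker_lrs, different_speaker_lrs, threshold=1.0):
    """Return (miss rate, false alarm rate) at the given LR threshold."""
    misses = sum(lr < threshold for lr in same_speaker_lrs)
    false_alarms = sum(lr > threshold for lr in different_speaker_lrs)
    return misses / len(same_speaker_lrs), false_alarms / len(different_speaker_lrs)


def cllr(same_speaker_lrs, different_speaker_lrs):
    """Log-likelihood-ratio cost: 0 is ideal, 1 matches an uninformative system."""
    ss_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in same_speaker_lrs)
    ds_cost = sum(math.log2(1.0 + lr) for lr in different_speaker_lrs)
    return 0.5 * (ss_cost / len(same_speaker_lrs) + ds_cost / len(different_speaker_lrs))


# Hypothetical likelihood ratios from comparisons with known ground truth.
same_speaker = [20.0, 8.0, 0.5, 150.0]
different_speaker = [0.01, 0.2, 3.0, 0.05]

miss_rate, false_alarm_rate = error_rates(same_speaker, different_speaker)
print(f"miss rate = {miss_rate:.2f}, false alarm rate = {false_alarm_rate:.2f}")
print(f"Cllr = {cllr(same_speaker, different_speaker):.3f}")
```

Unlike the raw error rates, Cllr also penalises the size of a misleading ratio, so a confident wrong answer costs more than a marginal one.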
  12. 4. Paradigm shift
      Ideal situation for FVC
      • Typicality assessed using empirical data
      • Comprehensive descriptions of varieties
      • Theoretical and methodological developments in collaboration
  13. 5. What’s stopping us?
      (a) Availability of corpora
      • Few forensically realistic corpora
        – Some exceptions (e.g. DyViS)
        – But even these are limited
      • Rarely have complete coverage of varieties
        – Case could involve speakers of any variety
        – Not much focus on L2/non-native varieties
  14. 5. What’s stopping us?
      (b) Limitations of available corpora
      • Sociolinguistic corpora:
        – Good coverage of regional/social variation, but…
        – Generally small
      • Speech technology corpora:
        – Large, but…
        – Insufficient coverage of regional/social variation
  15. 5. What’s stopping us?
      (c) Not enough population statistics
      • Existing data based on linguistic analyses
      • Huge range of potential features:
        – Segmental realisations (auditory and acoustic)
        – F0 distributions
        – (Long term) formant distributions
        – Voice quality
      • Within-/between-speaker variation
      • Technical variations
  16. 5. What’s stopping us?
      (d) Lack of descriptions of language varieties
      • Not fashionable (at least for English)
        – Some exceptions; e.g. illustrations of the IPA
      • Understandable focus on theory and individual variables
      • Forensics = reliance on experience or out-of-date descriptions
        – e.g. Survey of English Dialects, Wells (1982)
  17. 6. Proposed solutions
      • Data sharing
      • Use of platforms for uploading recordings and transcriptions
        – SLAAP, ONZE, FAVE suite, SPADE
        – Forced alignment
        – Searching for internal and external sources of variation
        – Easy extraction of large amounts of data
        – Continually updated (longitudinal resource)
  18. 6. Proposed solutions
      • Data collection methods
        – Capture more real world variation
        – Forensically realistic conditions
        – e.g. multiple recordings per speaker, technical factors…
  19. 7. Benefits for Sociolinguistics
      Expands the scope/scale of sociolinguistic research
      • Availability of data facilitates large-scale sociolinguistic projects
        – see Tyler’s talk
      • More robust statistical testing
      • Exploration of more subtle effects
      • Real-time change
  20. 7. Benefits for Sociolinguistics
      Better understanding of wider range of within-speaker variability
      • Forensic recordings = real world
      • Within-speaker variability
      • Distinguishing between patterns due to situational context and those due to identity construction
  21. 7. Benefits for Sociolinguistics
      Value of descriptive material
      • Baseline data of expected patterns for a variety in order to understand variation and change (deviations from expected patterns)
      • Useful for other disciplines too, e.g. SLT, language teaching
  22. 8. Conclusions
      • Importance of closer collaboration between forensics and sociolinguistics
        – Paradigm shift in forensics
        – Theoretical and practical benefits for socio
      • Talks will explore these themes in more detail
        – Building platforms for data sharing (Yvan)
        – Possibilities of big data in sociolinguistics (Tyler)
        – Ethical issues/considerations (Natalie)