
(Socio)-linguistic considerations in the default definition of the relevant population when computing numerical likelihood ratios

Hughes, V. and Foulkes, P. (2013) (Socio)-linguistic considerations in the default definition of the relevant population when computing numerical likelihood ratios. Paper presented at International Association for Forensic Phonetics and Acoustics (IAFPA) Conference, University of South Florida, Tampa. 21-24 July 2013.

  1. (Socio)linguistic considerations for defining the relevant population
     Vincent Hughes, Paul Foulkes
  2. 1.0 Introduction (Hughes & Foulkes IAFPA 2013)
     • the likelihood ratio (LR) involves assessment of both similarity and typicality
       – it matters “whether the values found matching … are vanishingly rare, or sporadic, or near universal” (Nolan 2001:16)
     • typicality = defined by patterns in the relevant population (Aitken and Taroni 2004)
       – quantified relative to a sample of the population
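
A toy illustration of why typicality matters, not the method used in this study: with an identical suspect/offender match, the LR changes when the reference population changes. The univariate normal model, the function names and every number below are assumptions made for the example.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def toy_log10_lr(offender_value, suspect_mean, suspect_sd, pop_mean, pop_sd):
    """log10 of p(E | same speaker) / p(E | another speaker from the population)."""
    similarity = normal_pdf(offender_value, suspect_mean, suspect_sd)
    typicality = normal_pdf(offender_value, pop_mean, pop_sd)
    return math.log10(similarity / typicality)


# Hypothetical F2 values in Hz; the suspect model is identical in both calls,
# only the assumed reference population differs.
common = toy_log10_lr(1700.0, 1700.0, 50.0, pop_mean=1700.0, pop_sd=150.0)
rare = toy_log10_lr(1700.0, 1700.0, 50.0, pop_mean=2000.0, pop_sd=150.0)
print(f"value typical of the population: log10 LR = {common:.2f}")
print(f"value rare in the population:    log10 LR = {rare:.2f}")
```

The same degree of similarity supports the same-speaker hypothesis more strongly when the matching value is rare in the population it is compared against.
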
  3. 1.0 Introduction
     • common approach = ‘logical relevance’
       – between-S variation: language (region)/sex
       – within-S variation: non-contemporaneity (Rose 2004)
     But:
     • many other sources of systematic between-S and within-S variation have been identified in (socio)linguistics
  4. 1.0 Introduction
     What to control and how narrowly?

     Between-speaker variation    Within-speaker variation
     Regional background          Phonological environment
     Sex (and gender)             Topic
     Ethnicity                    Interlocutor
     Social networks              Health
     Age                          Emotion
     Social class                 Non-contemporaneity
     …                            …
  5. 1.0 Introduction
     Research questions
     (i) to what extent are LRs affected by different definitions of the relevant population, with respect to social class and age?
     (ii) how do these sources of between-S variation interact with each other to affect LR output?
  6. 2.1 Data: /eɪ/ (FACE)
     • ONZE corpus (Gordon et al 2007)
       – NZ English, from Christchurch NZ
       – all males (for the purposes of this study)
       – spontaneous speech, sociolinguistic interviews
     • auto-generated F1, F2, F3 trajectories
       – time-normalised measurements at +10% steps (McDougall 2004, 2006)
       – procedures implemented for formant correction (incl. auditory analysis)
  7. 2.1 Data: /eɪ/ (FACE)
     • 8 tokens per speaker
       – phonological conditioning
       – excluded pre-/l/, adjacent /r/
     • formant trajectories fitted with corrected cubic polynomial curves
       – coefficients used as input data for LRs

     N=101              Professional    Non-professional
     Younger (18-35)    29              24
     Older (35-70)      25              23
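
The curve-fitting step can be sketched as an ordinary least-squares cubic fit whose four coefficients stand in for the whole trajectory. The slides give no implementation, so this is only an illustration: the nine sample points (10% to 90%), the Hz values and the name `fit_cubic` are assumptions, and the correction procedure mentioned on the slide is not modelled.

```python
def fit_cubic(times, values):
    """Least-squares cubic fit.

    Returns [c0, c1, c2, c3] for the curve c0 + c1*t + c2*t**2 + c3*t**3.
    """
    n = 4
    # Normal equations (X^T X) c = X^T y for the basis 1, t, t^2, t^3.
    xtx = [[sum(t ** (i + j) for t in times) for j in range(n)] for i in range(n)]
    xty = [sum((t ** i) * y for t, y in zip(times, values)) for i in range(n)]
    aug = [row[:] + [rhs] for row, rhs in zip(xtx, xty)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


# Hypothetical F2 trajectory for one FACE token, sampled at 10%..90% of its duration.
times = [k / 10 for k in range(1, 10)]
f2_hz = [1650.0, 1700.0, 1760.0, 1820.0, 1880.0, 1930.0, 1970.0, 1990.0, 2000.0]
print([round(c, 1) for c in fit_cubic(times, f2_hz)])
```

Each token is then represented by four numbers per formant rather than nine raw measurements, which keeps the dimensionality of the LR input small.
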
  8. 2.1 Data: /eɪ/ (FACE)
     • variation as a function of class and age
     [Figure: Cultivated NZE and Broad NZE; from Hay et al (2008:97)]
  9. 2.2 Raw data: class
     [Figure: frequency (Hz) against +10% step, Non-prof vs. Prof; a second panel is labelled “F2 Trajectory”]
  10. 2.2 Raw data: age
      [Figure: frequency (Hz) against +10% step, Old vs. Young]
  11. 2.2 Raw data: age
      [Figure: frequency (Hz) against +10% step, Old vs. Young]
      Aging (cf. Rhodes 2013)? Or apparent time generational difference?
  12. 2.2 Raw data: class and age
      [Figure: F2 (Hz) against F1 (Hz) for Old Prof, Young Prof, Old Non-prof and Young Non-prof]
  13. 2.3 Experiment 1: class-only
      • development set = 22 professionals
      • test set = 22 professionals
      • reference sets:

      Condition   Speakers              Hd: “it wasn’t the suspect, it was…
      Tailored    40 prof               …another professional…”               ✓
      Mixed       20 prof/20 non-prof   …another man…”                        (-)
      Mismatch    40 non-prof           …another non-professional…”           ✗
  14. 2.4 Experiment 2: age-only
      • development set = 22 younger speakers
      • test set = 22 younger speakers
      • reference sets:

      Condition   Speakers            Hd: “it wasn’t the suspect, it was…
      Tailored    40 young            …another young man…”                  ✓
      Mixed       20 young/20 older   …another man…”                        (-)
      Mismatch    40 older            …another older man…”                  ✗
  15. 2.4 Experiment 3: class and age
      • test set = 29 young professionals (no calibration)
      • reference sets:

      Condition               Speakers                                                  Hd
      Tailored                20 young prof                                             …another young professional…”       ✓
      Mixed                   5 young prof/5 old prof/5 young non-prof/5 old non-prof   …another man…”                      (-)
      Mismatch (both)         20 older non-prof                                         …another older non-professional…”   ✗
      Mismatch (class-only)   20 young non-prof                                         …another young non-professional…”   ✗
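
One way to picture the three reference conditions is as different draws from the same speaker pool. The sketch below rebuilds the Experiment 1 sets (40 speakers: all professional, half and half, or all non-professional). The slides do not say how reference speakers were chosen, so the random sampling, the record layout and the name `reference_set` are assumptions, and the reuse of dev/test speakers as reference data under cross-validation is not modelled.

```python
import random


def reference_set(speakers, key, target, condition, size, seed=0):
    """Pick `size` reference speakers for one condition.

    tailored: every speaker shares the test speakers' value of `key`
    mixed:    half share it, half do not
    mismatch: no speaker shares it
    """
    rng = random.Random(seed)
    match = [s for s in speakers if s[key] == target]
    other = [s for s in speakers if s[key] != target]
    if condition == "tailored":
        return rng.sample(match, size)
    if condition == "mixed":
        half = size // 2
        return rng.sample(match, half) + rng.sample(other, size - half)
    if condition == "mismatch":
        return rng.sample(other, size)
    raise ValueError(f"unknown condition: {condition!r}")


# Placeholder records using the class totals from the data slide
# (54 professional, 47 non-professional); the IDs are invented.
speakers = (
    [{"id": f"P{i:02d}", "social_class": "prof"} for i in range(54)]
    + [{"id": f"N{i:02d}", "social_class": "non-prof"} for i in range(47)]
)

for condition in ("tailored", "mixed", "mismatch"):
    chosen = reference_set(speakers, "social_class", "prof", condition, size=40)
    n_prof = sum(s["social_class"] == "prof" for s in chosen)
    print(condition, n_prof, "prof /", len(chosen) - n_prof, "non-prof")
```
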
  16. 2.5 Method
      • LRs computed using the Multivariate Kernel Density (MVKD) formula (Aitken and Lucy 2004)
      • cross-validation used to combat small N
        – i.e. dev/test speakers function simultaneously as reference data (with leave-one-out approach)
      • LR scores for test data calibrated using logistic regression based on weights for dev data
      • system validity assessed using the log-LR cost (Cllr)
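
The last two steps, calibration and validity assessment, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: MVKD scoring is taken as given (the scores are invented log10 LRs), the calibration is a single scale and shift fitted by plain gradient descent in place of whatever solver was actually used, and the function names are hypothetical. The fixed step size assumes scores of modest magnitude.

```python
import math

LN10 = math.log(10.0)


def cllr(ss_log10_lrs, ds_log10_lrs):
    """Log-LR cost from same-speaker (SS) and different-speaker (DS) log10 LRs.

    0 is ideal, 1 is what a system that always outputs LR = 1 would score,
    and values above 1 mean the LRs are misleading on balance.
    """
    ss_cost = sum(math.log2(1.0 + 10.0 ** -x) for x in ss_log10_lrs) / len(ss_log10_lrs)
    ds_cost = sum(math.log2(1.0 + 10.0 ** x) for x in ds_log10_lrs) / len(ds_log10_lrs)
    return 0.5 * (ss_cost + ds_cost)


def fit_calibration(ss_scores, ds_scores, steps=20000, rate=0.01):
    """Fit (scale, shift) on development scores.

    Minimises the class-balanced logistic loss by gradient descent, starting
    from the uncalibrated scores (scale = 1, shift = 0).
    """
    scale, shift = 1.0, 0.0
    for _ in range(steps):
        grad_scale = grad_shift = 0.0
        for scores, label in ((ss_scores, 1.0), (ds_scores, 0.0)):
            for s in scores:
                z = LN10 * (scale * s + shift)           # natural-log odds
                p = 1.0 / (1.0 + math.exp(-z))           # P(same speaker), equal priors
                err = (p - label) / (2.0 * len(scores))  # SS and DS weighted equally
                grad_scale += err * LN10 * s
                grad_shift += err * LN10
        scale -= rate * grad_scale
        shift -= rate * grad_shift
    return scale, shift


def apply_calibration(scores, scale, shift):
    """Map uncalibrated log10 scores to calibrated log10 LRs."""
    return [scale * s + shift for s in scores]


# Invented development and test scores (log10 scale).
dev_ss, dev_ds = [1.5, 0.8, 2.0, 0.2, -0.3], [-0.5, 0.4, -1.0, 0.9, -0.2]
test_ss, test_ds = [1.2, 0.5, 1.8], [-0.8, 0.3, -0.1]

scale, shift = fit_calibration(dev_ss, dev_ds)
print("Cllr before:", round(cllr(test_ss, test_ds), 3))
print("Cllr after: ", round(cllr(apply_calibration(test_ss, scale, shift),
                                 apply_calibration(test_ds, scale, shift)), 3))
```

The weights are fitted on the development scores only and then applied unchanged to the test scores, mirroring the dev/test split on the slide.
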
  17. 3.1 Results: class
      [Figure: Log10 LR by reference data (Mismatch, Mixed, Tailored); panels: Same-Speaker (SS) Pairs and Different-Speaker (DS) Pairs]
  18. 3.2 Results: age
      [Figure: Log10 LR by reference data (Mismatch, Mixed, Tailored); panels: Same-Speaker (SS) Pairs and Different-Speaker (DS) Pairs]
  19. 3.3 Results: class and age
      [Figure: Log10 LR by reference data (Mismatch (Both), Mismatch (Class only), Mixed, Tailored); Same-Speaker (SS) Pairs]
  20. 3.3 Results: class and age
      [Figure: Log10 LR by reference data (Mismatch (Both), Mismatch (Class only), Mixed, Tailored); Different-Speaker (DS) Pairs]
  21. 3.4 Results: performance
  22. 4.0 Discussion
      • magnitude of LRs and system validity are affected by different delimitations of class and age
        – even relatively subtle levels of variability (e.g. class in this data) are reflected in strength of evidence
      • effects on the log LRs (LLRs) differ according to the grouping variable (and between SS and DS pairs)
        – class: differences in variability/range
        – age: overoptimistic LLRs with mismatch
  23. 4.0 Discussion
      • effects on validity appear systematic (and predictable):
        – mixed = underoptimistic Cllr relative to tailored
        – mismatch = overoptimistic Cllr relative to tailored
        – age a more significant factor than class
      • ‘getting it wrong’ much more problematic than ‘keeping it general’
        – paradox: without knowing who the offender is, we can’t know the population of which he is a member
  24. 4.0 Discussion
      • ‘getting it wrong’ is more of an issue for some grouping variables than for others (e.g. age vs. class)
      • importantly, these results highlight that social variables do not function in isolation
        – rather, they interact with each other
        – their combined weight affects LR output differently from each source of variation in isolation
  25. 5.0 Conclusion
      • essential to consider the relevant population with respect to (socio)linguistic dimensions
        – between-S variation = complex and multi-dimensional
        – important to be aware of how sources of variation interact with each other (e.g. age and class)
      • understanding the sources of variability ensures estimates of strength of evidence are more meaningful
  26. Thanks! Questions?
      Acknowledgements: Erica Gold, Peter French, Dom Watt, FSS Research Group (York), Jen Hay, Robert Fromont, Pat LaShell, NZILBB, Ashley Brereton
      vh503@york.ac.uk
