
Defining the relevant population according to class and age when computing numerical likelihood ratios for forensic voice comparison


Hughes, V. and Foulkes, P. (2014) Defining the relevant population according to class and age when computing numerical likelihood ratios for forensic voice comparison. Paper presented at BAAP 2014 Colloquium, University of Oxford. 7-9 April 2014.


  1. Defining the relevant population according to class and age for forensic voice comparison
     Vincent Hughes, Paul Foulkes
     Department of Language and Linguistic Science
     BAAP Colloquium, University of Oxford, 8th April 2014
  2. 1.0 Introduction
     • forensic voice comparison (FVC) = voice of offender (unknown) vs. voice of suspect (known)
       – is the person on the criminal recording the same as the person on the suspect recording?
     • auditory-acoustic linguistic-phonetic analysis
       – analysis of a range of segmental (vowels, consonants), suprasegmental (f0, intonation, AR), higher-order linguistic (lexical choice, syntax) and VQ/vocal setting features
     Hughes & Foulkes BAAP 2014
  4. 1.0 Introduction
     • the expert cannot say how likely it is that the suspect and offender are the same speaker (SS): p(Hss|E) ✗
       – this is the job of the trier of fact (judge/jury)
       – requires access to all of the evidence in the case
     • the expert can assess the strength (or weight) of a piece of the evidence under the competing hypotheses of the prosecution (same-speaker, SS) and defence (different-speaker, DS): LR = p(E|Hss) / p(E|Hds) ✓
  5. 1.0 Introduction
     • likelihood ratio (LR) involves assessment of both similarity and typicality
       – it matters “whether the values found matching … are vanishingly rare, or sporadic, or near universal” (Nolan 2001:16)
     • typicality = defined by patterns in the relevant population (Aitken & Taroni 2004)
       – quantified relative to a sample of the population
       – outcome = LR value centred on 1 (LR > 1 supports same-speaker, LR < 1 supports different-speaker)
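
To make the interplay of similarity and typicality concrete, here is a minimal sketch, assuming a single acoustic feature and simple Gaussian models for both the suspect and the relevant population. This is a deliberate simplification and not the method used in the study; the function names and all Hz values are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(offender_value, suspect_mean, suspect_sd, pop_mean, pop_sd):
    """LR = p(E|Hss) / p(E|Hds) under the Gaussian assumptions stated above."""
    similarity = normal_pdf(offender_value, suspect_mean, suspect_sd)  # p(E|Hss)
    typicality = normal_pdf(offender_value, pop_mean, pop_sd)          # p(E|Hds)
    return similarity / typicality


# Hypothetical values: in both cases the offender value matches the suspect
# mean exactly, so similarity is identical. Only typicality differs.
lr_typical = likelihood_ratio(1900.0, 1900.0, 50.0, pop_mean=1900.0, pop_sd=150.0)
lr_rare = likelihood_ratio(2300.0, 2300.0, 50.0, pop_mean=1900.0, pop_sd=150.0)

print(lr_typical, lr_rare)
```

The same degree of match yields a much larger LR when the matching value is rare in the population, which is why the choice of reference population matters.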
  6. 1.0 Introduction
     • defence hypothesis (Hd): “it wasn’t the suspect, (it was … someone else)”
     • common approach = ‘logical relevance’
       – between-S variation: language (region)/sex
       – within-S variation: non-contemporaneity (Rose 2004)
     But: many other sources of systematic between-S and within-S variation identified in (socio)linguistics
  7. 1.0 Introduction
     – paradox: without knowing who the offender is, we can’t know (for certain) the population of which he is a member

     Between-speaker variation   Within-speaker variation
     Regional background         Phonological environment
     Sex (and gender)            Topic
     Ethnicity                   Interlocutor
     Social networks             Health
     Age                         Emotion
     Social class                Non-contemporaneity
  8. 1.0 Introduction
     Research questions
     (i) to what extent are LRs affected by different definitions of the relevant population, in respect of social class and age?
     (ii) how do these sources of between-S variation interact with each other to affect LR output?
  9. 2.1 Data: /eɪ/ (FACE)
     • ONZE corpus (Gordon et al 2007)
       – NZ English, from Christchurch NZ
       – all males (for this study)
       – spontaneous speech, sociolinguistic interviews
     • auto-generated F1, F2, F3 trajectories
       – time-normalised measurements at +10% steps (McDougall 2004, 2006)
       – procedures implemented for formant correction (incl. auditory analysis)
  10. 2.1 Data: /eɪ/ (FACE)
      • 8 tokens per speaker
        – excluded pre-/l/, adjacent /r/
      • formant trajectories fitted with corrected cubic polynomial curves
        – coefficients used as input data for LRs

      N=101             Professional   Non-professional
      Younger (18-35)   29             24
      Older (35-70)     25             23
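
As a rough illustration of how a formant trajectory can be reduced to four polynomial coefficients, the sketch below fits a cubic by ordinary least squares using only the standard library. It assumes 9 time-normalised points at 10% to 90% of the vowel (the slides only say "+10% steps"), uses a synthetic trajectory with hypothetical coefficients, and does not model whatever the "corrected" step refers to.

```python
def fit_cubic(times, values):
    """Least-squares cubic fit.

    Returns [c0, c1, c2, c3] for the curve c0 + c1*t + c2*t**2 + c3*t**3.
    """
    n = 4
    # Normal equations (A^T A) c = A^T y for the basis 1, t, t^2, t^3.
    ata = [[sum(t ** (i + j) for t in times) for j in range(n)] for i in range(n)]
    aty = [sum((t ** i) * y for t, y in zip(times, values)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the augmented matrix.
    m = [row + [rhs] for row, rhs in zip(ata, aty)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - tail) / m[i][i]
    return coeffs


# Assumed sampling: 9 points at 10%, 20%, ..., 90% of the vowel's duration.
times = [step / 10 for step in range(1, 10)]
# Synthetic F1-like trajectory in Hz, built from hypothetical coefficients.
true_coeffs = [500.0, -300.0, 200.0, -100.0]
f1_track = [sum(c * t ** k for k, c in enumerate(true_coeffs)) for t in times]

coeffs = fit_cubic(times, f1_track)
print([round(c, 3) for c in coeffs])
```

With three formants per token, this kind of fit would give 12 coefficients per token. In practice a numerical library routine such as `numpy.polyfit` would normally replace the hand-written solver.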
  11. 2.1 Data: /eɪ/ (FACE)
      • variation as a function of class and age from Hay et al (2008:97)
  12. 2.2 Method
      • structure:
        – 1 set of test data (multiple same-speaker (SS) and different-speaker (DS) pairs of samples)
        – multiple sets of reference data
          • matched with test data for social factor of interest
          • mismatched with test data for social factor of interest
          • mixed: no control over social factor of interest
      • LRs computed using MVKD (multivariate kernel density; Aitken & Lucy 2004)
        – transformed using base-10 logarithm (0 = neutral evidence)
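
The base-10 log transform makes the scale symmetric around 0, which a few lines of arithmetic can show. This is only an illustrative sketch with hypothetical LR values; the helper name is not from the study.

```python
import math


def to_log10_lr(raw_lr):
    """Convert a raw likelihood ratio (must be > 0) to a log10 LR."""
    return math.log10(raw_lr)


# Hypothetical raw LRs: support for SS, neutral, and support for DS.
log_support_ss = to_log10_lr(100.0)   # evidence 100 times more likely under SS
log_neutral = to_log10_lr(1.0)        # equally likely under both hypotheses
log_support_ds = to_log10_lr(0.01)    # evidence 100 times more likely under DS

print(log_support_ss, log_neutral, log_support_ds)
```

Equal strengths of support in opposite directions end up at equal distances from 0, and each unit on the log10 scale corresponds to a factor of 10 in the raw LR.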
  13. 2.3 Experiment 1: class-only
      • test set = 22 professionals
      • reference sets:

      Condition    Speakers              Hd: “it wasn’t the suspect, it was…
      Matched      40 prof               another professional…”                  ✓
      Mixed        20 prof/20 non-prof   another man…”                           (-)
      Mismatched   40 non-prof           another non-professional…”              ✗
  14. 2.3 Experiment 2: age-only
      • test set = 22 younger speakers
      • reference sets:

      Condition    Speakers            Hd: “it wasn’t the suspect, it was…
      Matched      40 young            another young man…”                     ✓
      Mixed        20 young/20 older   another man…”                           (-)
      Mismatched   40 older            another older man…”                     ✗
  15. 2.3 Experiment 3: class & age
      Hughes & Foulkes IAFPA 2013
      • test set = 29 young professionals
      • reference sets:

      Condition                 Speakers                                                  Hd
      Matched                   20 young prof                                             …another young professional…”       ✓
      Mixed                     5 young prof/5 old prof/5 young non-prof/5 old non-prof   …another man…”                      (-)
      Mismatched (both)         20 older non-prof                                         …another older non-professional…”   ✗
      Mismatched (class-only)   20 young non-prof                                         …another young non-professional…”   ✗
  16. Champod and Evett (2000)

      Raw LR         Log10 LR   Verbal expression            Support for
      >10000         4→5        Very strong evidence         Hp
      1000→10000     3→4        Strong evidence              Hp
      100→1000       2→3        Moderately strong evidence   Hp
      10→100         1→2        Moderate evidence            Hp
      1→10           0→1        Limited evidence             Hp
      1→0.1          0→-1       Limited evidence             Hd
      0.1→0.01       -1→-2      Moderate evidence            Hd
      0.01→0.001     -2→-3      Moderately strong evidence   Hd
      0.001→0.0001   -3→-4      Strong evidence              Hd
      <0.0001        -4→-5      Very strong evidence         Hd
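
The verbal scale on this slide amounts to a lookup on the magnitude and sign of the log10 LR, which might be sketched as follows. The function name is hypothetical, and two details are assumptions because the slide does not specify them: a value exactly on a band boundary is placed in the stronger band, and a log10 LR of exactly 0 is reported as neutral.

```python
def verbal_expression(log10_lr):
    """Map a log10 LR to a verbal label following the scale on this slide."""
    if log10_lr == 0:
        # Assumption: exactly 0 is treated as neutral rather than "limited".
        return "Neutral evidence"
    magnitude = abs(log10_lr)
    if magnitude < 1:
        strength = "Limited evidence"
    elif magnitude < 2:
        strength = "Moderate evidence"
    elif magnitude < 3:
        strength = "Moderately strong evidence"
    elif magnitude < 4:
        strength = "Strong evidence"
    else:
        strength = "Very strong evidence"
    # Positive values support the prosecution (Hp), negative the defence (Hd).
    hypothesis = "Hp" if log10_lr > 0 else "Hd"
    return f"{strength} in support of {hypothesis}"


print(verbal_expression(1.5))
print(verbal_expression(-3.2))
```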
  17. 3.1 Results: class-only
      [Figure: Log10 LR for DS and SS pairs in the Matched, Mismatched and Mixed conditions]
  18. 3.1 Results: class-only
      [Figure: Log10 LR for DS and SS pairs in the Matched, Mismatched and Mixed conditions]
      But quite substantial differences for certain high magnitude DS LRs
  19. 3.2 Results: age-only
      [Figure: Log10 LR for DS and SS pairs in the Matched, Mismatched and Mixed conditions]
  20. 3.2 Results: age-only
      [Figure: Log10 LR for DS and SS pairs in the Matched, Mismatched and Mixed conditions]
      Again big differences in individual SS & DS LRs according to reference set
  21. 3.3 Results: performance
      • Log LR cost (Cllr): penalises the system for high-magnitude contrary-to-fact LRs
        – errors: false misses (SS as DS) and false hits (DS as SS)
        – but an incorrect log LR close to 0 is penalised much less than an incorrect LR with very high magnitude
        – Cllr ideally close to 0 when no errors are made
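
The behaviour described on this slide follows from how Cllr is computed. Below is a minimal sketch, assuming the usual definition of Cllr over raw (not log) LRs: half the sum of the mean of log2(1 + 1/LR) over SS pairs and the mean of log2(1 + LR) over DS pairs. The function name and the example LR values are hypothetical.

```python
import math


def cllr(ss_lrs, ds_lrs):
    """Log LR cost from raw LRs of same-speaker and different-speaker pairs."""
    # SS pairs are penalised when the LR is small (evidence points to DS).
    ss_cost = sum(math.log2(1.0 + 1.0 / lr) for lr in ss_lrs) / len(ss_lrs)
    # DS pairs are penalised when the LR is large (evidence points to SS).
    ds_cost = sum(math.log2(1.0 + lr) for lr in ds_lrs) / len(ds_lrs)
    return 0.5 * (ss_cost + ds_cost)


# Hypothetical systems, one SS pair and one DS pair each.
uninformative = cllr([1.0], [1.0])        # every LR is 1
strong_correct = cllr([1000.0], [0.001])  # strong support in the right direction
mild_error = cllr([0.5], [2.0])           # wrong direction, but close to LR = 1
strong_error = cllr([0.001], [1000.0])    # wrong direction, high magnitude

print(uninformative, strong_correct, mild_error, strong_error)
```

A system that always returns LR = 1 scores exactly 1, so values well below 1 indicate useful, well-behaved LRs, while high-magnitude contrary-to-fact LRs push the cost far above 1.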
  22. 3.3 Results: performance
  23. 4.0 Discussion
      • magnitude of LRs and system validity affected by different delimitations of class and age
        – even relatively subtle variability can be reflected in strength of evidence
      • effects on the LRs most marked for outlying comparisons
        – overestimation of strength of evidence by up to 3 orders of magnitude (3 log10 units, i.e. a factor of 1000)
        – DS pairs more sensitive than SS pairs
  24. 4.0 Discussion
      • effects on Cllr appear systematic:
        – mixed = under-optimistic Cllr relative to matched
        – mismatched = over-optimistic Cllr relative to matched
        – age a more significant factor than class
      • ‘getting it wrong’ more problematic than ‘keeping it general’
        – distributions of LRs and Cllr closer to baseline in mixed condition than in mismatched
  25. 5.0 Conclusion
      • essential to consider the relevant population with respect to (socio)linguistic dimensions
        – between-S variation = complex and multi-dimensional
        – important to be aware of how sources of variation interact with each other (e.g. age and class)
      • understanding the sources of variability ensures estimates of strength of evidence are more meaningful
  26. Thanks! Questions?
      Acknowledgements: Erica Gold, Peter French, Dom Watt, FSS Research Group (York), Jen Hay, Robert Fromont, Pat LaShell, NZILBB, Ashley Brereton
      vh503@york.ac.uk
