Defining the relevant population: can the DNA approach work for speech?

Hughes, V. (2014) Defining the relevant population: can the DNA approach work for speech? Glasgow University Laboratory of Phonetics (GULP) Colloquium, University of Glasgow, Glasgow, UK. 30 January 2014. (INVITED TALK)


  1. Defining the relevant population: can the DNA approach work for speech?
     Vincent Hughes
     Department of Language and Linguistic Science
     Glasgow University Laboratory of Phonetics, University of Glasgow
     30th January 2014
  2. 0. Outline
     focus of this talk:
     • introduction to forensic voice comparison (FVC) and the likelihood ratio (LR)
     • what is meant by the ‘relevant population’?
       – application in forensic DNA analysis
       – complexities of language variation
     • current approaches in FVC
     • three potential alternatives
     GULP 30th January 2014
  3. 1. Introduction
  4. 1.1 Forensic voice comparison
     • forensic voice comparison (FVC) = voice of offender (unknown) vs. voice of suspect (known)
       is the person on the criminal recording the same as the person on the suspect recording?
     • auditory-acoustic linguistic-phonetic analysis
       – analysis of a range of segmental (vowels, consonants), suprasegmental (f0, intonation, articulation rate), higher-order linguistic (lexical choice, syntax) and voice quality/vocal setting features
  5. 1.2 Expressing conclusions
     what can’t the expert say? ✗
     POSITIVE IDENTIFICATION (… that they are the same person):
       sure beyond reasonable doubt / there can be very little doubt / highly likely / likely / very probable / probable / quite possible / possible
     NEGATIVE IDENTIFICATION (… that they are different people):
       probable / quite probable / likely / highly likely
     from French & Baldwin (1990)
  6. 1.2 Expressing conclusions
     why?
     • how likely is it that the suspect is the offender given the evidence?
       – it’s an assessment of guilt
       – this is the job of the judge/jury (trier of fact)
     • requires access to all of the evidence
     • possibility doesn’t tell you about probability
       – continuum isn’t equal on both sides (bias towards positive identification?)
  7. 1.3 Likelihood ratio (LR)
     what can the expert say? ✓
     • provides a gradient assessment of the strength/weight of evidence
     • ratio = value centered on 1 where:
       – support for prosecution = > 1
       – support for defence = < 1
     LR = p(E|Hp) / p(E|Hd)
     p = probability; E = evidence; | = ‘given’; Hp = prosecution hypothesis; Hd = defence hypothesis
  8. 1.3 Likelihood ratio (LR)
     why?
     • evaluation of the evidence, rather than the hypotheses (innocence vs. guilt)
       – separates the role of the expert and trier of fact
     • explicit consideration of both prosecution and defence hypotheses (objective)
     • clear (??) probabilistic statement presented to the Court
  9. 1.4 Computing a LR
     • LR = similarity and typicality
       – it matters “whether the values found matching (…) are vanishingly rare (…) or near universal” (Nolan 2001:16)
       – typicality of values within- and between-speakers
     • typicality = dependent on patterns in the “relevant population” (Aitken & Taroni 2004)
       – reference data = sample of that population
       – distributions modelled statistically to generate numerical output
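The interplay of similarity and typicality can be illustrated with a deliberately simplified sketch. It assumes a single measurement (say, mean f0 in Hz) and normal distributions for both the suspect and the population; all names and numbers are hypothetical, and this is not the multivariate model used later in the talk.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def gaussian_lr(offender_value, suspect_mean, suspect_sd, pop_mean, pop_sd):
    """LR = p(E|Hp) / p(E|Hd) for one measurement under two normal models."""
    similarity = normal_pdf(offender_value, suspect_mean, suspect_sd)  # numerator
    typicality = normal_pdf(offender_value, pop_mean, pop_sd)          # denominator
    return similarity / typicality

# Hypothetical values: same offender value and suspect model, two populations.
lr_typical = gaussian_lr(120.0, 118.0, 5.0, pop_mean=120.0, pop_sd=15.0)
lr_rare = gaussian_lr(120.0, 118.0, 5.0, pop_mean=160.0, pop_sd=15.0)
print(f"value typical of population: LR = {lr_typical:.2f}")
print(f"value rare in population:    LR = {lr_rare:.2f}")
```

With the similarity held constant, the LR is larger when the offender's value is rare in the reference population, which is why the choice of that population matters.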
  10. 1.4 Computing a LR
      LR = p(E|Hp) / p(E|Hd) = 0.047 / 0.0115 = 4.08
  11. Raw LR            Log10 LR   Strength of evidence   Support
      > 10,000          4 → 5      very strong            Prosecution
      1000 → 10,000     3 → 4      strong                 Prosecution
      100 → 1000        2 → 3      moderately strong      Prosecution
      10 → 100          1 → 2      moderate               Prosecution
      1 → 10            0 → 1      limited                Prosecution
      1                 0          neutral                Neither
      0.1 ← 1           -1 ← 0     limited                Defence
      0.01 ← 0.1        -2 ← -1    moderate               Defence
      0.001 ← 0.01      -3 ← -2    moderately strong      Defence
      0.0001 ← 0.001    -4 ← -3    strong                 Defence
      < 0.0001          -5 ← -4    very strong            Defence
      4.08 = limited support for the prosecution
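As a rough sketch, the verbal scale can be expressed as a lookup on the magnitude of the log10 LR. The function name and the handling of values that fall exactly on a band boundary (assigned to the stronger band) are assumptions made for illustration; the slide does not specify them.

```python
import math

# Lower bounds on |log10 LR| and their labels, read off the slide's scale.
BANDS = [(4, "very strong"), (3, "strong"), (2, "moderately strong"),
         (1, "moderate"), (0, "limited")]

def verbal_label(lr):
    """Map a raw LR to the slide's verbal expression of support."""
    log_lr = math.log10(lr)
    if log_lr == 0:
        return "neutral"
    side = "prosecution" if log_lr > 0 else "defence"
    for lower, label in BANDS:
        if abs(log_lr) >= lower:
            return f"{label} support for the {side}"

lr = 0.047 / 0.0115  # the worked example from the slides
print(verbal_label(lr))
```

For the worked example this gives the same verbal expression as the slide, limited support for the prosecution.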
  12. 2. The relevant population
  13. 2.1 The defence hypothesis (Hd)
      • relevant population = depends on the question
        – defined by the defence hypothesis
      BUT impossible to assess the probability of the evidence with a vague (or no) Hd
      p(E|Hd): “it wasn’t our client (the suspect), it was… …someone else”
  14. 2.2 DNA approach
      • “logical relevance” (Kaye 2004, 2008)
        – factors which affect the distribution of a variable in the population (determine sub-populations)
      • ethnicity = logically relevant for DNA
        – allele frequencies differ between racial groups
        – 3 databases used for forensic DNA analysis in UK (Fraser & Williams 2009)
      • multiple LRs based on different Hd assumptions
        – 3 LRs using the 3 different databases
  15. 2.3 Issues for speech
      • multidimensionality of between-speaker variation in spontaneous speech
        – social stratification = much more complex than DNA
        – mass of evidence from socio-linguistics/-phonetics
      • linguistic-phonetic variables affected by different social factors to different extents within- and between-dialects
      • logically relevant information is accessible
        – we can make numerous inferences about the offender from a sample of his/her speech
  16. 3. Definitions of the relevant population in FVC: logical relevance
  17. 3.1 Logical relevance
      • Rose (2004:4): “quite often (Hd) will simply be that the voice of the unknown speaker does not belong to the accused, but to another same-sex speaker of the language”
      • reflected in the majority of LR-based studies:
        – Kinoshita (2002), Alderman (2004), Kinoshita (2005), Rose (2006), Rose et al. (2006), Rose (2007), Morrison & Kinoshita (2008), Morrison (2009), Morrison et al. (2011), Morrison (2011), Zhang et al. (2011)…
  18. 3.2 Issues
      • why just sex and language?
        (i) sex/language are easily accessible in the speech signal (but see French et al. (2010:145), Foulkes & French (2012:569))
        (ii) sex/language are the most significant sources of social variation defining sub-populations
          – lack of understanding of complexity of socially stratified variation in speech
      • paradox = without knowing who the offender is we can’t know (for sure) the population of which (s)he is a member
  19. 3.3 Empirical testing
      • general structure:
        – 1 set of test data, multiple sets of reference data
          • matched with test data for social factor of interest
          • mismatched with test data for social factor of interest
          • mixed: no control over social factor of interest
      • system error evaluated using log LR cost (Cllr):
        – penalises the system for high-magnitude contrary-to-fact LRs
          • theory = an incorrect LR close to unity is much less important than an incorrect LR with very high magnitude
      • MVKD formula used to compute LRs (Aitken & Lucy 2004)
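The penalty structure of Cllr can be sketched as follows, using the standard log LR cost formula: the mean of log2(1 + 1/LR) over same-speaker comparisons and of log2(1 + LR) over different-speaker comparisons, averaged across the two sets. The function name and the example LRs are hypothetical.

```python
import math

def cllr(ss_lrs, ds_lrs):
    """Log LR cost from same-speaker (SS) and different-speaker (DS) LRs."""
    # SS comparisons are penalised when the LR is small (contrary to fact).
    ss_penalty = sum(math.log2(1 + 1 / lr) for lr in ss_lrs) / len(ss_lrs)
    # DS comparisons are penalised when the LR is large (contrary to fact).
    ds_penalty = sum(math.log2(1 + lr) for lr in ds_lrs) / len(ds_lrs)
    return 0.5 * (ss_penalty + ds_penalty)

# Hypothetical systems: both make one error on a DS pair, of different size.
mild_error = cllr([100, 1000], [0.01, 2])       # wrong, but close to unity
severe_error = cllr([100, 1000], [0.01, 1000])  # wrong, with high magnitude
print(f"mild: {mild_error:.3f}  severe: {severe_error:.3f}")
```

A system that always returns an LR of 1 scores exactly 1, so values above 1 indicate a system that is doing worse than giving no information.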
  20. 3.3.1 Empirical testing: PRICE
      • formant dynamics (F1, F2 + F3) for /aɪ/
        – measured at +10% steps
      • fitted with cubic polynomial curves
        – coefficients used as input for LRs
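A minimal sketch of the curve-fitting step, assuming nine measurement points from 10% to 90% of the vowel and an ordinary least-squares fit; the slides do not state the number of points or the fitting routine, and the F2 values below are invented. The helper `fit_cubic` is hypothetical and uses only the standard library.

```python
def fit_cubic(times, values):
    """Least-squares cubic fit; returns [c0, c1, c2, c3] for c0 + c1*t + c2*t^2 + c3*t^3."""
    # Build the normal equations (X^T X) c = X^T y for the basis 1, t, t^2, t^3.
    rows = [[t ** k for k in range(4)] for t in times]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    b = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(4)]
    # Forward elimination with partial pivoting.
    for col in range(4):
        pivot = max(range(col, 4), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, 4):
            factor = a[row][col] / a[col][col]
            for k in range(col, 4):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * 4
    for i in range(3, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, 4))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

# Hypothetical F2 track (Hz) for one /aɪ/ token, sampled at 10% steps.
times = [i / 10 for i in range(1, 10)]  # 0.1 ... 0.9 of vowel duration
f2 = [1200, 1260, 1340, 1450, 1580, 1720, 1850, 1950, 2010]
f2_coeffs = fit_cubic(times, f2)
print([round(c, 1) for c in f2_coeffs])
```

Each formant of each token is reduced to four coefficients in this way, so a token described by F1, F2 and F3 would contribute twelve values as input to the LR computation.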
  21. 3.3.1 Empirical testing: PRICE
      • test data = 20 speakers of Standard Southern British English (SSBE)
        – DyViS database (Nolan et al. 2009): Task 1
        – young (18-25 yrs), male
        – 10 tokens per speaker
      • reference data
        – matched: 32 DyViS (SSBE)
        – mixed (BrEng): 8 DyViS/8 Manchester/8 York/8 Newcastle
  22. 3.3.1 Empirical testing: PRICE
  23. 3.3.1 Empirical testing: PRICE
  24. 3.3.3 Empirical testing: FACE
      • investigating (i) socio-economic class, (ii) age and (iii) class*age (interaction)
        – class: professional vs. non-professional
        – age: younger (born after 1960) vs. older (born before 1960)
      • data for 101 male speakers of NZE (ONZE)
        – 8 tokens per speaker
        – cubic polynomial coefficients of F1, F2 and F3
      • for each experiment, reference data: (a) matched, (b) mismatched, (c) mixed
  25. 3.3.3 Empirical testing: FACE
      why NZE?
      [Figure adapted from Hay et al. (2008: 97)]
  26. Experiment 1: Class
      [Figure: Log10 LR for different-speaker (DS) and same-speaker (SS) comparisons, with panels for matched, mismatched and mixed reference data]
  27. Experiment 2: Age
      [Figure: Log10 LR for different-speaker (DS) and same-speaker (SS) comparisons, with panels for matched, mismatched and mixed reference data]
  28. General patterns: Cllr
  29. 3.4 Issues
      • so many sources of between-speaker variation: which to control?
        – inevitable mismatch between evidential recordings and any set of reference data
      • don’t want to make the relevant population too narrow:
        – reductio ad absurdum
        – issue with prior odds (Rose 2013)
      • paradox remains: still don’t know if we’re right
  30. 4. Definitions of the relevant population in FVC: speaker similarity
  31. 4.1 Speaker similarity
      “it wasn’t our client (the suspect), it was… …a member of a population of speakers who sound sufficiently similar that an investigator or prosecutor would submit recordings of these speakers for forensic analysis”
      from Morrison et al. (2012)
  32. 4.1 Speaker similarity
      • similar sounding speakers to the offender as judged by lay listeners
        – lay listener (police officer) who made the decision to submit the samples for analysis
        – ∴ it can include males + females, different accents… as long as they ‘sound similar’ (Morrison et al. 2012)
      • listeners match characteristics of the person who made the original decision:
        – e.g. young, male police officer from X…
  33. 4.1 Speaker similarity: problems ✗
      • limited view of variation in production and perception
      • what factors do we control in our listeners?
      • what do the listeners hear?
        – how to replicate the conditions of the original decision
        – some controls over what is played to the listeners (usually sex and language again)
      • lack of replicability
      • lay listeners are linguistically erratic when it comes to assessing speaker-similarity (McDougall 2011)
  34. 5. Discussion
  35. 5. Discussion
      • direct application of logical relevance (DNA) clearly inappropriate
      • but speaker similarity is as problematic, if not more so, on linguistic grounds
      • need new ways of defining the relevant populations
        – logically/legally/linguistically appropriate
      • there might be elements of the DNA approach which can be applied to speech
  36. 5.1 Multiple Hd (from DNA)
      • offer multiple LRs based on different definitions of the relevant population
        – “if the relevant population is x, then the LR is y”
      BUT:
        – still have to control some factors and ignore others
        – need multiple sets of reference data = impractical
          • easier in DNA with one grouping factor (ethnicity) and available databases
        – outcome isn’t particularly clear for the Court
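The “if the relevant population is x, then the LR is y” style of reporting can be sketched as one LR per candidate reference population. This assumes a single measurement modelled with normal distributions; the population names, parameters and the function `lrs_by_population` are all hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def lrs_by_population(offender_value, suspect_model, populations):
    """Return one LR per reference population, keeping the numerator fixed."""
    s_mean, s_sd = suspect_model
    numerator = normal_pdf(offender_value, s_mean, s_sd)  # p(E|Hp)
    return {name: numerator / normal_pdf(offender_value, mean, sd)  # p(E|Hd)
            for name, (mean, sd) in populations.items()}

# Hypothetical (mean, sd) summaries of three sets of reference data.
populations = {"population A": (120.0, 15.0),
               "population B": (135.0, 12.0),
               "mixed": (128.0, 18.0)}
report = lrs_by_population(122.0, (121.0, 5.0), populations)
for name, lr in report.items():
    print(f"if the relevant population is {name}, then the LR is {lr:.2f}")
```

Only the denominator changes between the reported values, which makes the dependence of the LR on the definition of the relevant population explicit.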
  37. 5.2 Normalisation (from DNA)
      • control big sources of variation (e.g. regional background, sex) in the database and use a correction factor to normalise for lower-level variation (see Balding & Nichols 1994)
      BUT:
        – requires a priori knowledge of the type of variation in the dataset
        – not clear mathematically how this should be done for such multidimensional data
  38. 5.3 Speaker similarity
      • best to develop the speaker similarity approach in Morrison et al. (2012)
      • probably the underlying assumption should be that “it wasn’t our client, it was someone else who sounds like the offender”
      but decisions relating to linguistic evidence are best made by linguists!
  39. 5.3 Speaker similarity
      • speaker similarity based on objective similarity (rather than lay listener judgments)
        – using distance scores (Euclidean distances ??) based on auditory judgment/acoustic measurements
        – *should* capture speakers of the same sociolinguistic background too!
      • approach used by ASR systems
        – BatVox identifies the 30 ‘closest’ speakers to the suspect (but should be based on offender)
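One way the distance-based selection might look, as a sketch only: each speaker is represented by a feature vector, and the reference population is the n speakers closest to the offender by Euclidean distance. The feature vectors, speaker labels and the function `closest_speakers` are hypothetical, and the features are assumed to be on comparable scales already. This is not a description of how BatVox works.

```python
import math

def closest_speakers(offender, candidates, n):
    """Return the names of the n candidates nearest to the offender's features."""
    distances = {name: math.dist(offender, feats)
                 for name, feats in candidates.items()}
    return sorted(distances, key=distances.get)[:n]

# Hypothetical normalised feature vectors for the offender and a speaker pool.
offender = (0.0, 0.0, 0.0)
candidates = {
    "spk1": (0.1, -0.2, 0.0),
    "spk2": (1.5, 1.0, -0.8),
    "spk3": (0.0, 0.1, 0.1),
    "spk4": (-2.0, 0.5, 1.2),
}
reference_population = closest_speakers(offender, candidates, n=2)
print(reference_population)
```

Anchoring the distances on the offender rather than the suspect follows the slide's remark about which speaker the selection should be based on.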
  40. 6. Conclusions
      • speech is complex and multivariate:
        – both in terms of the things we analyse and the degree of within- and between-speaker variation
      • defining the relevant population for speech is a difficult issue
        – current approaches are inadequate
        – reflect a lack of awareness of the complexity of speech
        – preference for logical correctness over linguistic correctness
  41. 6. Conclusions
      • DNA model is problematic for speech:
        – but elements of it may be adaptable to improve the situation
      • probably best to look at speaker similarity
        – expert doesn’t have to commit to saying that the offender is a white, middle class male from X
        – simpler, while remaining logically/legally/linguistically appropriate
  42. Thanks! Questions?
      Acknowledgements: Paul Foulkes, Erica Gold, Peter French, Dom Watt, Ashley Brereton, FSS Research Group (York)
