Issues and opportunities for the application of the numerical likelihood ratio framework to forensic voice comparison

Gold, E. and Hughes, V. (2013) Issues and opportunities for the application of the numerical likelihood ratio framework to forensic voice comparison. Paper presented at International Association for Forensic Phonetics and Acoustics (IAFPA) Conference, University of South Florida, Tampa. 21-24 July 2013.

  1. Erica Gold & Vincent Hughes, July 24, 2013. Issues and opportunities for the application of the numerical likelihood ratio framework to forensic speaker comparison
  2. outline • introduction • background • difficulties: (i) relevant population (ii) modelling (iii) correlations • discussion
  3. introduction • likelihood ratio (LR) increasingly accepted as the “logically and legally correct” (Rose and Morrison 2009:143) framework for the expression of expert conclusions • relatively strong representation of the LR framework in forensic speaker comparison (FSC) • roughly 20% of experts using either a numerical or verbal LR (Gold and French 2011) • the development of the LR framework in the field of forensic phonetics is largely thanks to a small community of researchers
  4. LR in FSC research • LR-based research in FSC focused on two main areas: (i) speaker-discriminatory power of individual parameters - exclusively focused on continuous data (and almost exclusively on vowels) (ii) methodological advances - procedures for assessing validity and reliability - calibration (for improving system validity) - fusion (for combining correlated parameters)
  5. LR in practice • calculation of LRs: ignore parameters which can’t be handled by current models - our view: we have a duty to analyse all of the (relevant) evidence - Morrison: “this is a fallacy. Consider what proportion of the genome is analysed in forensic DNA analysis” • combination of LRs: fusion of parameters based on correlations in resulting LRs (rather than linguistic correlations in the data)
  6. complexity of speech evidence • inherent variability of speech: - within-speaker variation: no two utterances are ever the same • interlocutor • topic • emotion • time of day etc… ∴ p(E|Hp) ≠ 1
  7. complexity of speech evidence • inherent variability of speech: - between-speaker variation: • biological/ physiological factors (e.g. size/ shape of the vocal tract) • social factors (regional varieties/ class/ ethnicity…) • attitude (towards speech community) • habitual/ idiosyncratic features etc…
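The interplay of within- and between-speaker variation on the two slides above can be illustrated with a minimal sketch. It assumes a single continuous parameter (for instance mean F0 in Hz) and Gaussian distributions for both the suspect and the relevant population. The function names and all numbers are hypothetical, and the Gaussian model is a deliberate simplification for illustration, not a recommended forensic model.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def likelihood_ratio(evidence, suspect_mean, within_sd, pop_mean, pop_sd):
    """LR = p(E|Hp) / p(E|Hd) for one continuous parameter.

    Numerator: how well the evidence fits the suspect, given that the
    suspect's own speech varies (within_sd > 0, so the fit is never perfect).
    Denominator: how typical the evidence is in the relevant population
    (pop_sd is assumed to describe the overall spread of that population).
    """
    p_e_hp = gaussian_pdf(evidence, suspect_mean, within_sd)
    p_e_hd = gaussian_pdf(evidence, pop_mean, pop_sd)
    return p_e_hp / p_e_hd

# Hypothetical values: offender sample at 100 Hz, suspect centred on 100 Hz,
# population centred on 120 Hz.
lr_similar_and_atypical = likelihood_ratio(100.0, 100.0, 5.0, 120.0, 20.0)
# Same suspect and population, but the offender sample is at 140 Hz.
lr_dissimilar = likelihood_ratio(140.0, 100.0, 5.0, 120.0, 20.0)
print(lr_similar_and_atypical, lr_dissimilar)
```

In this sketch the LR depends on both similarity (numerator) and typicality (denominator), which is why the choice of relevant population matters for the resulting value.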
  8. i. Relevant Population
  9. relevant population: theory: “logical relevance” (Kaye 2004, 2008) • ‘language/ sex/ non-contemporaneity’ (Rose 2004) BUT: • can’t know what is logically relevant (paradox) • reflects a limited view of variation within and between individuals (why just sex and region?) • parameters are affected by sources of variation in different ways/ to different degrees/ interact with each other
  10. relevant population: theory: speaker-similarity (Morrison et al 2012) • similar-sounding speakers to the offender as judged by lay listeners BUT: • limited view of variation in perception and production • what to control in our listeners? • what do listeners hear? (pre controls over speakers = sex and region) • lay listeners = linguistically erratic
  11. relevant population: practice i. case-by-case basis (Rose 2007) • need reference data for every feature we would want to analyse • time consuming/ still inevitable mismatch with facts of the case at trial ii. existing database (e.g. Nolan et al 2009) • may be forensic/sociolinguistic • appropriate? enough data?
  12. ii. Modelling
  13. modelling • speech data is generally very complex • currently only a small handful of models available to calculate LRs for continuous data only • i.e. univariate LRs (Lindley 1977), MVKD (Aitken and Lucy 2004), UBM-GMM (Reynolds et al. 2000) • both continuous and discrete data present • combination of both = multi-levelled variation e.g. 1st level discrete & 2nd level continuous
  14. modelling • Word-initial /t/ • Discrete phonetic level: [th] [ts] [s] [d]… • Continuous level: - energy in the release burst (centre of gravity, dispersion etc.) - duration of hold/ release phases - overall duration …
  15. modelling • not all data is continuous • some of the most helpful parameters are discrete - i.e. voice quality using the VPA (Gold and French 2011) • many speech parameters are not considered under a numerical LR framework due to this lack of models - effectively only providing a partial analysis of an individual’s speech characteristics
  16. iii. Correlations
  17. correlations • naïve Bayes: LRs from multiple parameters may be combined using the independent product rule - but ling-phon parameters are often highly correlated (between- and/or within-individuals) (as a result of anatomical/ social factors) • early LR-based research used naïve Bayes to combine parameters - often with disregard for linguistic correlations - but some with empirical testing
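The product rule on the slide above, and the risk it carries when parameters are correlated, can be sketched as follows. The function name and the LR values are hypothetical; the example only shows the arithmetic of naïve Bayes combination, not any published system.

```python
import math

def combine_naive_bayes(lrs):
    """Combine per-parameter LRs under the independence assumption.

    Multiplying LRs is the same as adding their logarithms; the sum of
    log10 values is used here because strength of evidence is often
    reported on a log10 scale.
    """
    log10_sum = sum(math.log10(lr) for lr in lrs)
    return 10 ** log10_sum

# Two hypothetical parameters, each giving an LR of 10.
# If they are truly independent, an overall LR of about 100 is justified.
independent = combine_naive_bayes([10.0, 10.0])

# If the second parameter is almost a copy of the first (highly correlated),
# it adds little new information, yet the product rule still returns about
# 100: the same evidence has effectively been counted twice.
print(independent)
```

Under this assumption the overstatement grows with every correlated parameter that is added, which is the motivation for modelling correlations explicitly.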
  18. correlations • MVKD as a means of modelling multivariate data - accounting for correlations within parameters - but still issues with correlations between parameters • logistic-regression fusion as a potential solution - currently the only alternative - procedure developed for automatic speaker recognition systems
  19. correlations • Fusion = “back-end processing” - considering the correlations in the LRs between parameters - generates overall LR • “it is…possible…that two segments which are not correlated by virtue of their internal structure and which therefore should be naively combined, nevertheless have LRs which do correlate” (Rose 2010:32)
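Back-end fusion as described on the slides above can be sketched as a weighted sum of per-parameter log LRs, with the weights learned by logistic regression from comparisons of known origin. The sketch below is a minimal illustration under several assumptions: the training data are invented, the inputs are natural-log LRs, the training set is balanced (equal numbers of same-speaker and different-speaker pairs, so the fitted log odds can be read as a log LR), and plain gradient descent stands in for whatever optimiser a real fusion toolkit would use. The names `train_fusion` and `fuse` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_fusion(scores, labels, step=0.1, epochs=2000):
    """Fit weights [intercept, w1, w2, ...] by logistic regression.

    scores: one list of log LRs (one per parameter) for each comparison.
    labels: 1 for same-speaker comparisons, 0 for different-speaker ones.
    """
    n_params = len(scores[0])
    weights = [0.0] * (n_params + 1)
    for _ in range(epochs):
        grad = [0.0] * (n_params + 1)
        for x, y in zip(scores, labels):
            err = sigmoid(fuse(weights, x)) - y
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w - step * g / len(scores) for w, g in zip(weights, grad)]
    return weights

def fuse(weights, x):
    """Fused log LR: intercept plus weighted sum of the input log LRs."""
    return weights[0] + sum(w * xj for w, xj in zip(weights[1:], x))

# Invented training data: two correlated parameters per comparison.
same_speaker = [[1.0, 1.0], [-0.3, -0.3], [2.0, 1.5], [0.8, 1.4]]
diff_speaker = [[0.4, 0.4], [-1.0, -1.0], [-2.0, -1.6], [-0.9, -1.5]]
scores = same_speaker + diff_speaker
labels = [1] * len(same_speaker) + [0] * len(diff_speaker)

weights = train_fusion(scores, labels)
print(weights, fuse(weights, [1.5, 1.2]))
```

Because the weights are fitted to the LRs rather than to the underlying speech data, this kind of fusion reflects correlations in the LRs only, which is the distinction drawn in the Rose (2010) quotation.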
  20. Discussion
  21. discussion: relevant population • linguistically informed assessment of the relevant population - linguistically-objective speaker-similarity? - logical relevance? - are these the same thing? • more overt/conscious awareness of the range of factors which affect variation within and between individuals • more testing of the effects of mismatch on LRs
  22. discussion: modelling • creating models to fit the data - rather than applying models from other fields • fairer representation of speaker characteristics by incorporating all (relevant) parameters - linguistics/phonetics informing the choice of parameter rather than statistical models
  23. discussion: correlations • thinking about linguistic correlations in the data rather than in the LRs - front-end processing: through theory and structural learning - Bayesian networking or graphical models • empirical testing of how well fusion captures linguistic correlations • avoids unnecessary doubling of evidence/ weakening strength of evidence
  24. conclusion • linguistics/ phonetics should inform how we apply and calculate LRs - rather than the framework dictating what we can/ can’t incorporate • the complexity of linguistic-phonetic evidence shouldn’t be ignored - offers opportunity for speech to be at the forefront of forensic science - solutions for other fields…
  25. references • Aitken, C. G. G. and Lucy, D. (2004) Evaluation of trace evidence in the form of multivariate data. Applied Statistics 53: 109-122. • Bayes, T. (1763) An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society of London 53: 370-418. • Brümmer, N., Burget, L., Cernocký, J. H., Glembek, O., Grézl, F., Karafiát, M., van Leeuwen, D. A., Matejka, P., Schwarz, P. and Strasheim, A. (2007) Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST SRE 2006. IEEE Transactions on Audio Speech and Language Processing 15: 2072-2084. • Gold, E. and French, P. (2011) International practices in forensic speaker comparison. International Journal of Speech, Language and the Law 18: 293-307. • Lindley, D. V. (1977) A problem in forensic science. Biometrika 64: 207-213. • Morrison, G. S. (2009a) The place of forensic voice comparison in the ongoing paradigm shift. Written version of an invited presentation at the 2nd International Conference on Evidence Law and Forensic Science. 25-26 July 2009. Beijing, China. 1-16. • Rose, P. (2010) The effect of correlation on strength of evidence estimates in forensic voice comparison: uni- and multivariate likelihood ratio-based discrimination with Australian English vowel acoustics. International Journal of Biometrics 2(4): 316-329.
  26. acknowledgements • This research has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement number 238803 and the Economic and Social Research Council. • Thanks also go to Colin Aitken, Paul Foulkes, Peter French, Michael Jessen, Tereza Neocleous.