Balancing validity and reliability as a function of sampling variability in forensic voice comparison.

Bibliographic Details
Title: Balancing validity and reliability as a function of sampling variability in forensic voice comparison.
Authors: Wang BX; Department of English and Communication, The Hong Kong Polytechnic University, Hong Kong, China. Electronic address: brucex.wang@polyu.edu.hk., Hughes V; Department of Language and Linguistic Science, University of York, UK. Electronic address: vincent.hughes@york.ac.uk.
Source: Science & justice : journal of the Forensic Science Society [Sci Justice] 2024 Nov; Vol. 64 (6), pp. 649-659. Date of Electronic Publication: 2024 Oct 10.
Publication Type: Journal Article
Language: English
Journal Info: Publisher: Elsevier Country of Publication: England NLM ID: 9508563 Publication Model: Print-Electronic Cited Medium: Internet ISSN: 1876-4452 (Electronic) Linking ISSN: 13550306 NLM ISO Abbreviation: Sci Justice Subsets: MEDLINE
Imprint Name(s): Publication: London : Elsevier
Original Publication: Harrogate, North Yorkshire, UK ; Middlesex, NJ : The Society, c1995-
MeSH Terms: Forensic Sciences*/methods , Voice*, Humans ; Reproducibility of Results ; Speech Recognition Software ; Computer Simulation
Abstract: In forensic comparison sciences, experts are required to compare samples of known and unknown origin to evaluate the strength of the evidence assuming they came from the same source or from different sources. The application of valid (if the method measures what it is intended to) and reliable (if that method produces consistent results) forensic methods is required across many jurisdictions, such as by the England & Wales Criminal Practice Directions 19A and the UK Crown Prosecution Service, and is highlighted in the 2009 National Academy of Sciences report and by the President's Council of Advisors on Science and Technology in 2016. The current study uses simulation to examine the effect of number of speakers and sampling variability on the evaluation of validity and reliability using different generations of automatic speaker recognition (ASR) systems in forensic voice comparison (FVC). The results show that the state-of-the-art system had better overall validity compared with less advanced systems. However, better validity does not necessarily lead to high reliability, and very often the opposite is true. Better system validity and higher discriminability have the potential of leading to a higher degree of uncertainty and inconsistency in the output (i.e. poorer reliability). This is particularly the case when dealing with a small number of speakers, where the observed data do not adequately support density estimation, resulting in extrapolation, as is commonly expected in FVC casework.
(Copyright © 2024 The Chartered Society of Forensic Sciences. Published by Elsevier B.V. All rights reserved.)
Competing Interests: Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Entry Date(s): Date Created: 20241205 Date Completed: 20241205 Latest Revision: 20241205
Update Code: 20250114
DOI: 10.1016/j.scijus.2024.10.002
PMID: 39638484
Database: MEDLINE