DuReS: An R Package for Denoising Experimental Tandem Mass Spectra and Metabolite Annotation

Bibliographic details
Title: DuReS: An R Package for Denoising Experimental Tandem Mass Spectra and Metabolite Annotation
Authors: Shayantan Banerjee, Prajval Nakrani, Aviral Singh, Pramod P. Wangikar
Publication year: 2025
Collection: The University of Auckland: Figshare
Keywords: Biochemistry, Genetics, Cancer, Space Science, Biological Sciences not elsewhere classified, Chemical Sciences not elsewhere classified, profiling small molecules, optional tuning module, dependent acquisition mode, denoised mzml files, consuming manual verification, accepts mzml files, unlike random noise, maximizing noise reduction, intrinsic noise characteristics, main denoising module, fewer false positives, based untargeted metabolomics, random noise peaks, tandem mass spectra, conventional intensity thresholding, mass spectra, removing noise, intensity thresholding, intensity peaks, metabolomics repositories, false annotations
Description: Mass spectrometry-based untargeted metabolomics is a powerful technique for profiling small molecules in biological samples, yet accurate metabolite identification remains challenging. Random noise peaks in tandem mass spectra can lead to false annotations and necessitate time-consuming manual verification. A common method for removing noise from mass spectra is intensity thresholding, in which peaks below a user-defined intensity cutoff are discarded. However, the optimal threshold is often data set-specific, and thresholding may still retain many noisy peaks. We hypothesize that, unlike random noise, true signal peaks consistently recur across replicate tandem spectra generated from the same precursor ion. Here, we present a freely available R package, Denoising Using Replicate Spectra (DuReS) (https://github.com/BiosystemEngineeringLab-IITB/dures), which accepts mzML files and feature lists and returns high-quality annotations and denoised mzML files, enabling users to integrate the denoising pipeline into their workflow seamlessly. The package is designed for data acquired in data-dependent acquisition (DDA) mode. It has (i) a main denoising module and (ii) an optional tuning module that determines each data set’s optimal recurrence frequency cutoff (F_threshold), accounting for variations in intrinsic noise characteristics. We tested the tool on eight representative data sets selected from those available in metabolomics repositories. Our approach minimizes signal loss while maximizing noise reduction, effectively preserving diagnostically significant low-intensity fragments that would otherwise be lost through conventional intensity thresholding. This improves spectral matching metrics, leading to more accurate annotations and fewer false positives.
Publication type: article in journal/newspaper
Language: unknown
Relation: https://figshare.com/articles/journal_contribution/DuReS_An_R_Package_for_Denoising_Experimental_Tandem_Mass_Spectra_and_Metabolite_Annotation/29255777
DOI: 10.1021/acs.analchem.5c01726.s001
Availability: https://doi.org/10.1021/acs.analchem.5c01726.s001
https://figshare.com/articles/journal_contribution/DuReS_An_R_Package_for_Denoising_Experimental_Tandem_Mass_Spectra_and_Metabolite_Annotation/29255777
Rights: CC BY-NC 4.0
Document code: edsbas.C68FF941
Database: BASE
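The recurrence idea stated in the description (true fragments recur across replicate tandem spectra of the same precursor, random noise does not) can be illustrated with a minimal Python sketch. This is not the DuReS R API or the authors' implementation: the function names `cluster_fragments` and `denoise`, the greedy m/z clustering, the 0.01 Da tolerance, and the example cutoff of 0.5 are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def cluster_fragments(spectra: List[List[Peak]], mz_tol: float = 0.01) -> List[Dict]:
    """Group fragment peaks from all replicate spectra into m/z clusters.

    Hypothetical greedy scheme: peaks are sorted by m/z, and a peak joins the
    current cluster when it lies within mz_tol of the cluster's last m/z.
    """
    tagged = sorted(
        (mz, intensity, idx)
        for idx, spectrum in enumerate(spectra)
        for mz, intensity in spectrum
    )
    clusters: List[Dict] = []
    for mz, _intensity, idx in tagged:
        if clusters and mz - clusters[-1]["last_mz"] <= mz_tol:
            cluster = clusters[-1]
        else:
            cluster = {"mzs": [], "spectra": set(), "last_mz": mz}
            clusters.append(cluster)
        cluster["mzs"].append(mz)
        cluster["spectra"].add(idx)  # which replicates contain this fragment
        cluster["last_mz"] = mz
    return clusters


def denoise(
    spectra: List[List[Peak]], f_threshold: float = 0.5, mz_tol: float = 0.01
) -> List[List[Peak]]:
    """Keep only peaks whose cluster recurs in >= f_threshold of the replicates."""
    n = len(spectra)
    if n == 0:
        return []
    kept_ranges = [
        (min(c["mzs"]), max(c["mzs"]))
        for c in cluster_fragments(spectra, mz_tol)
        if len(c["spectra"]) / n >= f_threshold
    ]
    return [
        [(mz, inten) for mz, inten in spectrum
         if any(lo <= mz <= hi for lo, hi in kept_ranges)]
        for spectrum in spectra
    ]


# Four made-up replicate MS/MS spectra of one precursor (illustrative values).
replicates = [
    [(85.028, 1200.0), (127.039, 30.0), (93.512, 40.0)],
    [(85.029, 1100.0), (127.040, 28.0), (140.220, 55.0)],
    [(85.028, 1250.0), (127.039, 31.0), (71.903, 35.0)],
    [(85.027, 900.0), (101.456, 60.0)],
]
denoised = denoise(replicates, f_threshold=0.5)
for spectrum in denoised:
    print(spectrum)
```

In this toy input, the fragment near m/z 127.04 is weaker than every one-off peak, so an intensity cutoff of about 32 would discard it while retaining the non-recurring peaks. The recurrence filter does the opposite, because the fragment appears in three of the four replicates. The greedy clustering can chain together peaks in dense regions, so a real implementation would likely need a more careful grouping step.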