ROASMI: accelerating small molecule identification by repurposing retention data

Bibliographic Details
Published in: Journal of Cheminformatics Vol. 17; no. 1; pp. 20 - 15
Main Authors: Sun, Fang-Yuan, Yin, Ying-Hao, Liu, Hui-Jun, Shen, Lu-Na, Kang, Xiu-Lin, Xin, Gui-Zhong, Liu, Li-Fang, Zheng, Jia-Yi
Format: Journal Article
Language: English
Published: Cham Springer International Publishing 14.02.2025
BioMed Central Ltd
Springer Nature B.V
BMC
ISSN: 1758-2946
Description
Summary: The limited replicability of retention data hinders its application to small-molecule identification in untargeted metabolomics. Retention order models hold promise in addressing this issue, but their predictive reliability is limited by uncertain generalizability. Here, we present the ROASMI model, which enables reliable prediction of retention order within a well-defined application domain by coupling data-driven molecular representation with mechanistic insights. The generalizability of ROASMI is demonstrated on 71 independent reversed-phase liquid chromatography (RPLC) datasets. Applying ROASMI to four real-world datasets shows its advantages in distinguishing coexisting isomers with similar fragmentation patterns and in annotating detected peaks that lack informative spectra. ROASMI can be retrained with user-defined reference sets and is compatible with other MS/MS scorers, enabling further improvements in small-molecule identification.
Scientific Contribution: Our work reveals that the replicability of retention sequences in RPLC systems depends on buffer pH. Building upon this mechanistic insight, we have constructed a generalizability-oriented retention order prediction model called ROASMI, which is capable of providing reliable predictions across heterogeneous datasets with diverse chromatographic and chemical spaces.
DOI: 10.1186/s13321-025-00968-8