To Fly, or Not to Fly, That Is the Question: A Deep Learning Model for Peptide Detectability Prediction in Mass Spectrometry
| Title: | To Fly, or Not to Fly, That Is the Question: A Deep Learning Model for Peptide Detectability Prediction in Mass Spectrometry |
|---|---|
| Authors: | Naim Abdul-Khalek, Mario Picciani, Omar Shouman, Reinhard Wimmer, Michael Toft Overgaard, Mathias Wilhelm, Simon Gregersen Echers |
| Publication Year: | 2025 |
| Subject Terms: | Biochemistry, Sociology, Science Policy, Biological Sciences not elsewhere classified, Information Systems not elsewhere classified, specific experimental conditions, resulting physicochemical properties, ms data challenges, adaptability allows researchers, synthetic peptide library, offering high performance, improving predictive capacity, deep learning model, peptide detectability prediction, peptide detectability, detectability prediction, peptide sequences, peptide sequence, improving accuracy, high variability, strongly related, species datasets, protein abundance, negative impact, mass spectrometry |
| Description: | Identifying detectable peptides, known as flyers, is key in mass spectrometry-based proteomics. Peptide detectability is strongly related to peptide sequences and their resulting physicochemical properties. Moreover, the high variability in MS data challenges the development of a generic model for detectability prediction, underlining the need for customizable tools. We present Pfly, a deep learning model developed to predict peptide detectability based solely on peptide sequence. Pfly is a versatile and reliable state-of-the-art tool, offering high performance, accessibility, and easy customizability for end-users. This adaptability allows researchers to tailor Pfly to specific experimental conditions, improving accuracy and expanding applicability across various research fields. Pfly is an encoder-decoder with an attention mechanism, classifying peptides as flyers or non-flyers, and providing both binary and categorical probabilities for four distinct classes defined in this study. The model was initially trained on a synthetic peptide library and subsequently fine-tuned with a biological dataset to mitigate bias toward synthesizability, improving predictive capacity and outperforming state-of-the-art predictors in benchmark comparisons across different human and cross-species datasets. The study further investigates the influence of protein abundance and rescoring, illustrating the negative impact on peptide identification due to misclassification. Pfly has been integrated into the DLOmix framework and is accessible on GitHub at https://github.com/wilhelm-lab/dlomix. |
| Document Type: | article in journal/newspaper |
| Language: | unknown |
| Relation: | https://figshare.com/articles/journal_contribution/To_Fly_or_Not_to_Fly_That_Is_the_Question_A_Deep_Learning_Model_for_Peptide_Detectability_Prediction_in_Mass_Spectrometry/28996342 |
| DOI: | 10.1021/acs.jproteome.4c00973.s001 |
| Availability: | https://doi.org/10.1021/acs.jproteome.4c00973.s001 https://figshare.com/articles/journal_contribution/To_Fly_or_Not_to_Fly_That_Is_the_Question_A_Deep_Learning_Model_for_Peptide_Detectability_Prediction_in_Mass_Spectrometry/28996342 |
| Rights: | CC BY-NC 4.0 |
| Accession Number: | edsbas.77636B3C |
| Database: | BASE |