Oktoberfest: Open‐source spectral library generation and rescoring pipeline based on Prosit

Machine learning (ML) and deep learning (DL) models for peptide property prediction such as Prosit have enabled the creation of high quality in silico reference libraries. These libraries are used in various applications, ranging from data‐independent acquisition (DIA) data analysis to data‐driven r...

Bibliographic details
Published in: Proteomics (Weinheim), Vol. 24, No. 8, p. e2300112
Main authors: Picciani, Mario; Gabriel, Wassim; Giurcoiu, Victor‐George; Shouman, Omar; Hamood, Firas; Lautenbacher, Ludwig; Jensen, Cecilia Bang; Müller, Julian; Kalhor, Mostafa; Soleymaniniya, Armin; Kuster, Bernhard; The, Matthew; Wilhelm, Mathias
Format: Journal Article
Language: English
Published: Germany: Wiley Subscription Services, Inc, 1 April 2024
ISSN: 1615-9853, 1615-9861
Online access: Full text
Abstract: Machine learning (ML) and deep learning (DL) models for peptide property prediction, such as Prosit, have enabled the creation of high-quality in silico reference libraries. These libraries are used in various applications, ranging from data-independent acquisition (DIA) data analysis to data-driven rescoring of search engine results. Here, we present Oktoberfest, an open-source Python package of our spectral library generation and rescoring pipeline, which was originally available only online via ProteomicsDB. Oktoberfest is largely search engine agnostic and provides access to online peptide property predictions, promoting the adoption of state-of-the-art ML/DL models in proteomics analysis pipelines. We demonstrate its ability to reproduce and even improve our results from previously published rescoring analyses on two distinct use cases. Oktoberfest is freely available on GitHub (https://github.com/wilhelm-lab/oktoberfest) and can easily be installed locally through the cross-platform PyPI Python package.
Author_xml – sequence: 1
  givenname: Mario
  orcidid: 0000-0003-0428-1703
  surname: Picciani
  fullname: Picciani, Mario
  organization: Technical University of Munich
– sequence: 2
  givenname: Wassim
  orcidid: 0000-0001-6440-9794
  surname: Gabriel
  fullname: Gabriel, Wassim
  organization: Technical University of Munich
– sequence: 3
  givenname: Victor‐George
  orcidid: 0000-0002-1190-6954
  surname: Giurcoiu
  fullname: Giurcoiu, Victor‐George
  organization: Technical University of Munich
– sequence: 4
  givenname: Omar
  orcidid: 0000-0002-9077-3036
  surname: Shouman
  fullname: Shouman, Omar
  organization: Technical University of Munich
– sequence: 5
  givenname: Firas
  orcidid: 0000-0002-4141-7051
  surname: Hamood
  fullname: Hamood, Firas
  organization: Technical University of Munich
– sequence: 6
  givenname: Ludwig
  orcidid: 0000-0002-1540-5911
  surname: Lautenbacher
  fullname: Lautenbacher, Ludwig
  organization: Technical University of Munich
– sequence: 7
  givenname: Cecilia Bang
  orcidid: 0009-0007-7227-3840
  surname: Jensen
  fullname: Jensen, Cecilia Bang
  organization: Technical University of Munich
– sequence: 8
  givenname: Julian
  orcidid: 0000-0003-4108-7926
  surname: Müller
  fullname: Müller, Julian
  organization: Technical University of Munich
– sequence: 9
  givenname: Mostafa
  orcidid: 0009-0006-2548-4154
  surname: Kalhor
  fullname: Kalhor, Mostafa
  organization: Technical University of Munich
– sequence: 10
  givenname: Armin
  orcidid: 0000-0002-7799-6091
  surname: Soleymaniniya
  fullname: Soleymaniniya, Armin
  organization: Technical University of Munich
– sequence: 11
  givenname: Bernhard
  orcidid: 0000-0002-9094-1677
  surname: Kuster
  fullname: Kuster, Bernhard
  organization: Technical University of Munich
– sequence: 12
  givenname: Matthew
  orcidid: 0000-0002-5401-5553
  surname: The
  fullname: The, Matthew
  organization: Technical University of Munich
– sequence: 13
  givenname: Mathias
  orcidid: 0000-0002-9224-3258
  surname: Wilhelm
  fullname: Wilhelm, Mathias
  email: mathias.wilhelm@tum.de
  organization: Technical University of Munich
BackLink https://www.ncbi.nlm.nih.gov/pubmed/37672792 (view this record in MEDLINE/PubMed)
ContentType Journal Article
Copyright 2023 The Authors. published by Wiley‐VCH GmbH.
2023 The Authors. PROTEOMICS published by Wiley‐VCH GmbH.
2023. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Copyright_xml – notice: 2023 The Authors. published by Wiley‐VCH GmbH.
– notice: 2023 The Authors. PROTEOMICS published by Wiley‐VCH GmbH.
– notice: 2023. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1002/pmic.202300112
Discipline Engineering
Anatomy & Physiology
Chemistry
EISSN 1615-9861
EndPage n/a
ExternalDocumentID 37672792
10_1002_pmic_202300112
PMIC13736
Genre shortCommunication
Journal Article
GrantInformation_xml – fundername: European Research Council
  funderid: 101077037; 833710
– fundername: Bundesministerium für Bildung und Forschung
  funderid: 031L0168; 031L0305A
– fundername: H2020 Marie Skłodowska‐Curie Actions
  funderid: 956148
– fundername: European Proteomics Infrastructure Consortium providing access
  funderid: 823839
– fundername: Elitenetzwerk Bayern
  funderid: F‐6‐M5613.6.K‐NW‐2021‐411/1/1
– fundername: Munich Data Science Institute (MDSI) Seed‐Funds
– fundername: European Research Council
  grantid: 101077037
– fundername: European Research Council
  grantid: 833710
ISICitedReferencesCount 33
ISSN 1615-9853
1615-9861
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 8
Keywords bottom‐up proteomics
technology
data processing and analysis
bioinformatics
mass spectrometry LC‐MS/MS
Language English
License Attribution
2023 The Authors. PROTEOMICS published by Wiley‐VCH GmbH.
LinkModel DirectLink
OpenAccessLink https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fpmic.202300112
PMID 37672792
PageCount 11
PublicationCentury 2000
PublicationDate April 2024
2024-04-00
2024-Apr
20240401
PublicationDateYYYYMMDD 2024-04-01
PublicationDate_xml – month: 04
  year: 2024
  text: April 2024
PublicationDecade 2020
PublicationPlace Germany
PublicationPlace_xml – name: Germany
– name: Weinheim
PublicationTitle Proteomics (Weinheim)
PublicationTitleAlternate Proteomics
PublicationYear 2024
Publisher Wiley Subscription Services, Inc
Publisher_xml – name: Wiley Subscription Services, Inc
StartPage e2300112
SubjectTerms bioinformatics
bottom‐up proteomics
Data analysis
data processing and analysis
Deep learning
Libraries
Machine learning
mass spectrometry LC‐MS/MS
Proteomics
Python
Search engines
technology
Title Oktoberfest: Open‐source spectral library generation and rescoring pipeline based on Prosit
URI https://onlinelibrary.wiley.com/doi/abs/10.1002%2Fpmic.202300112
https://www.ncbi.nlm.nih.gov/pubmed/37672792
https://www.proquest.com/docview/3039821609
https://www.proquest.com/docview/2862202130
Volume 24