Proteoform identification based on top-down tandem mass spectra with peak error corrections

Bibliographic details
Published in: Briefings in bioinformatics, Volume 23, Issue 2
Main authors: Zhan, Zhaohui; Wang, Lusheng
Format: Journal Article
Language: English
Published: England, Oxford University Press, 10.03.2022
Oxford Publishing Limited (England)
ISSN: 1467-5463, 1477-4054
Abstract In this paper, we study the problem of finding complex proteoforms in protein databases from top-down tandem mass spectrum data. The main difficulty is handling the combinatorial explosion of possible alterations on a protein. To overcome it, the problem has been formulated as the alignment of a proteoform mass graph (PMG) and a spectrum mass graph (SMG). The other important issue is handling mass errors of peaks in the input spectrum. Previous methods apply an error tolerance to the mass differences between consecutive matched nodes/peaks in the PMG and SMG. However, this cannot guarantee that the mass difference between an arbitrary pair of matched nodes in the alignment is approximately the same in both the PMG and the SMG. Large errors may accumulate if positive (or negative) errors occur consecutively over a long run of matched node pairs. The problem is severe enough that some existing software packages include a step to further refine the alignments. We propose a new model for handling the mass errors of peaks, based on the PMG and SMG formulation. Note that the masses of sub-paths on the PMG are theoretical and are assumed to be accurate. Our method allows each peak in the input spectrum a predefined error range. In aligning the PMG and SMG, the mass of each matched peak is corrected within its predefined error range. After the correction, we require that the mass between any two (not necessarily consecutive) matched nodes in the PMG be identical to that between the corresponding two matched peaks in the SMG. Intuitively, this kind of alignment is more accurate. We design an algorithm to find the maximum number of matched node and peak pairs in the two mass graphs under the new constraint. The resulting alignment shows the matched node and peak pairs as well as the corrected positions of the peaks. The algorithm works well for moderate-size inputs but takes a very long time and a large amount of memory for large inputs. We therefore also propose a diagonal alignment algorithm, which can solve large instances in reasonable time. Experiments show that our new algorithms report alignments with a much larger number of matched node pairs. The software package and test data sets are available at https://github.com/Zeirdo/TopMGRefine.
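The error-correction constraint described in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: it assumes the matched node and peak pairs (and the PMG path) are already fixed, and only tests whether corrections within the error ranges exist. Under the stated constraint, the corrected peak masses must equal the PMG node masses plus one common offset d, so a valid correction exists exactly when the per-peak intervals for d overlap. The function names, example masses, and tolerance value are hypothetical.

```python
from typing import List, Optional


def consecutive_check(pmg: List[float], smg: List[float], tol: float) -> bool:
    """Tolerance test on consecutive matched pairs only (the earlier style
    of error handling described in the abstract)."""
    return all(
        abs((smg[k + 1] - smg[k]) - (pmg[k + 1] - pmg[k])) <= tol
        for k in range(len(pmg) - 1)
    )


def correct_peaks(
    pmg: List[float], smg: List[float], err: List[float]
) -> Optional[List[float]]:
    """Return corrected peak masses, or None if no correction exists.

    Requiring equal mass differences for ALL matched pairs means
    corrected[i] = pmg[i] + d for a single offset d, with
    |corrected[i] - smg[i]| <= err[i]. So d must lie in every interval
    [smg[i] - err[i] - pmg[i], smg[i] + err[i] - pmg[i]].
    """
    lo = max(s - e - p for p, s, e in zip(pmg, smg, err))
    hi = min(s + e - p for p, s, e in zip(pmg, smg, err))
    if lo > hi:
        return None  # the intervals have no common offset
    d = (lo + hi) / 2  # any value in [lo, hi] would do
    return [p + d for p in pmg]


# Hypothetical matched masses: every observed peak drifts by +0.02 more
# than the previous one, so the error accumulates along the alignment.
pmg_masses = [100.0, 200.0, 300.0, 400.0, 500.0]
drifting = [100.00, 200.02, 300.04, 400.06, 500.08]
tolerance = 0.03

passes_consecutive = consecutive_check(pmg_masses, drifting, tolerance)
drift_corrected = correct_peaks(pmg_masses, drifting, [tolerance] * 5)

# Peaks with small errors of mixed sign can be corrected consistently.
noisy = [100.01, 199.99, 300.02, 400.00]
noisy_corrected = correct_peaks(pmg_masses[:4], noisy, [tolerance] * 4)
```

In this example the drifting peaks pass the consecutive test, yet the first and last peaks differ from the theoretical spacing by 0.08, so no single offset fits every error range and `correct_peaks` returns `None`. The paper's algorithms additionally choose which node and peak pairs to match; this sketch covers only the feasibility test for one fixed set of pairs.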
Author Zhan, Zhaohui
Wang, Lusheng
Author_xml – sequence: 1
  givenname: Zhaohui
  surname: Zhan
  fullname: Zhan, Zhaohui
– sequence: 2
  givenname: Lusheng
  surname: Wang
  fullname: Wang, Lusheng
  email: cswangl@cityu.edu.hk
BackLink https://www.ncbi.nlm.nih.gov/pubmed/35136947$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com 2022
DOI 10.1093/bib/bbab599
Discipline Biology
EISSN 1477-4054
ExternalDocumentID 35136947
10_1093_bib_bbab599
10.1093/bib/bbab599
Genre Research Support, Non-U.S. Gov't
Journal Article
ISICitedReferencesCount 3
ISSN 1467-5463
1477-4054
IsPeerReviewed true
IsScholarly true
Issue 2
Keywords proteoform identification
peak error correction
dynamic programming algorithms
top-down tandem mass spectra
Language English
License This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)
https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
The Author(s) 2022. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
PMID 35136947
PublicationCentury 2000
PublicationDate 2022-03-10
PublicationDateYYYYMMDD 2022-03-10
PublicationDate_xml – month: 03
  year: 2022
  text: 2022-03-10
  day: 10
PublicationDecade 2020
PublicationPlace England
PublicationPlace_xml – name: England
– name: Oxford
PublicationTitle Briefings in bioinformatics
PublicationTitleAlternate Brief Bioinform
PublicationYear 2022
Publisher Oxford University Press
Oxford Publishing Limited (England)
Publisher_xml – name: Oxford University Press
– name: Oxford Publishing Limited (England)
SubjectTerms Algorithms
Alignment
Combinatorial analysis
Computer programs
Databases, Protein
Error correction
Mass spectra
Nodes
Proteins
Software
Software packages
Tandem Mass Spectrometry - methods
Title Proteoform identification based on top-down tandem mass spectra with peak error corrections
URI https://www.ncbi.nlm.nih.gov/pubmed/35136947
https://www.proquest.com/docview/2640678523
https://www.proquest.com/docview/2627132394
Volume 23