Deciding unique decodability of bigram counts via finite automata

We revisit the problem of deciding by means of a finite automaton whether a string is uniquely decodable from its bigram counts. An efficient algorithm for constructing a polynomial-size Nondeterministic Finite Automaton (NFA) that decides unique decodability is given. This NFA may be simulated effi...

Detailed description

Bibliographic details
Published in: Journal of Computer and System Sciences, Vol. 80, No. 2, pp. 450-456
Main authors: Kontorovich, Aryeh; Trachtenberg, Ari
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 March 2014
ISSN: 0022-0000, 1090-2724
Online access: Full text
Abstract We revisit the problem of deciding by means of a finite automaton whether a string is uniquely decodable from its bigram counts. An efficient algorithm for constructing a polynomial-size Nondeterministic Finite Automaton (NFA) that decides unique decodability is given. This NFA may be simulated efficiently in time and space. Conversely, we show that the minimum deterministic finite automaton for deciding unique decodability has exponentially many states in alphabet size, and compute the correct order of magnitude of the exponent.
• We prove that the bigram-decodable strings form a regular language.
• We construct a cubic-sized NFA for recognizing this language.
• We show how to simulate this NFA efficiently.
• We give a lower bound on the size of any DFA for this language.
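To make the notion in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the authors' automaton construction. It assumes that "bigram counts" means the multiset of adjacent character pairs with no boundary markers, which may differ from the convention used in the paper. The function names `bigram_counts` and `is_uniquely_decodable` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product


def bigram_counts(s):
    """Return the multiset of adjacent character pairs of s (assumed convention)."""
    return Counter(zip(s, s[1:]))


def is_uniquely_decodable(s):
    """Brute-force check: True if no other string shares s's bigram counts.

    Any string with the same bigram counts has the same length and, for
    length >= 2, uses only characters that occur in s, so it is enough to
    enumerate all strings of len(s) over the alphabet of s. This takes
    exponential time and is meant for tiny inputs only.
    """
    target = bigram_counts(s)
    alphabet = sorted(set(s))
    for candidate in product(alphabet, repeat=len(s)):
        t = "".join(candidate)
        if t != s and bigram_counts(t) == target:
            return False
    return True
```

Under this assumed convention, "abc" is uniquely decodable, while "abacad" is not, because "acabad" has exactly the same bigram counts. The exponential enumeration is the naive baseline; the abstract's point is that the same decision can be made by a polynomial-size NFA.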
Author Kontorovich, Aryeh
Trachtenberg, Ari
Author_xml – sequence: 1
  givenname: Aryeh
  surname: Kontorovich
  fullname: Kontorovich, Aryeh
  email: karyeh@cs.bgu.ac.il
  organization: Department of Computer Science, Ben-Gurion University, Beer Sheva 84105, Israel
– sequence: 2
  givenname: Ari
  surname: Trachtenberg
  fullname: Trachtenberg, Ari
  email: trachten@bu.edu
  organization: Department of Electrical & Computer Engineering, Boston University, 8 Saint Maryʼs Street, Boston, MA 02215, United States
CitedBy_id crossref_primary_10_1016_j_jcss_2014_12_016
Cites_doi 10.1093/bioinformatics/bth205
10.1016/j.jcss.2007.10.004
10.1137/060651380
10.1109/TPDS.2006.148
10.1016/j.tcs.2004.10.010
ContentType Journal Article
Copyright 2013 Elsevier Inc.
Copyright_xml – notice: 2013 Elsevier Inc.
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1090-2724
EndPage 456
ExternalDocumentID 10_1016_j_jcss_2013_09_003
S0022000013001670
ID FETCH-LOGICAL-c333t-60d5cc23dd482080c1c3038ad6cc3c8d54e54f3ef2b4ce334345cfa5395b7613
ISICitedReferencesCount 2
ISSN 0022-0000
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 2
Keywords Sequence reconstruction
Eulerian graph
Finite-state automata
Uniqueness
Language English
License http://www.elsevier.com/open-access/userlicense/1.0
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c333t-60d5cc23dd482080c1c3038ad6cc3c8d54e54f3ef2b4ce334345cfa5395b7613
Notes ObjectType-Article-2
SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 23
OpenAccessLink https://dx.doi.org/10.1016/j.jcss.2013.09.003
PQID 1506365786
PQPubID 23500
PageCount 7
ParticipantIDs proquest_miscellaneous_1506365786
crossref_primary_10_1016_j_jcss_2013_09_003
crossref_citationtrail_10_1016_j_jcss_2013_09_003
elsevier_sciencedirect_doi_10_1016_j_jcss_2013_09_003
PublicationCentury 2000
PublicationDate 2014-03-01
PublicationDateYYYYMMDD 2014-03-01
PublicationDate_xml – month: 03
  year: 2014
  text: 2014-03-01
  day: 01
PublicationDecade 2010
PublicationTitle Journal of computer and system sciences
PublicationYear 2014
Publisher Elsevier Inc
Publisher_xml – name: Elsevier Inc
SubjectTerms Algorithms
Eulerian graph
Finite-state automata
Sequence reconstruction
Uniqueness
Title Deciding unique decodability of bigram counts via finite automata
URI https://dx.doi.org/10.1016/j.jcss.2013.09.003
https://www.proquest.com/docview/1506365786
Volume 80