Elastic-degenerate string comparison
An elastic-degenerate (ED) string $T$ is a sequence of $n$ sets $T[1],\ldots,T[n]$ containing $m$ strings in total whose cumulative length is $N$. We call $n$, $m$, and $N$ the length, the cardinality, and the size of $T$, respectively. The language of $T$ is defined as $L(T)=\{S_1\cdots S_n : S_i\in T[i] \text{ for all } i\in[1,n]\}$. Given two ED stri...
| Published in: | Information and computation, vol. 304, p. 105296 |
|---|---|
| Main authors: | Gabory, Estéban; Mwaniki, Moses Njagi; Pisanti, Nadia; Pissis, Solon P.; Radoszewski, Jakub; Sweering, Michelle; Zuba, Wiktor |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Inc, 1 May 2025 |
| Subjects: | |
| ISSN: | 0890-5401 |
| Online access: | Full text |
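To make the definition of $L(T)$ concrete, here is a minimal Python sketch. It is an illustration only and not one of the algorithms from the article: the names `language` and `ed_intersect` are hypothetical, an ED string is assumed to be given as a list of sets of Python strings, and no claim is made that the running time matches any bound stated in the abstract. `language` enumerates $L(T)$ explicitly (exponential in general), while `ed_intersect` decides whether two languages share a string by searching over configurations in which one side has spelled a pending suffix beyond the other. That suffix is always a suffix of some input string, so the number of configurations stays polynomial in the input size.

```python
from collections import deque
from itertools import product


def language(t):
    """Enumerate L(T) explicitly: one string per choice of S_i in T[i]."""
    return {"".join(choice) for choice in product(*t)}


def ed_intersect(t1, t2):
    """Return True if L(t1) and L(t2) share at least one string.

    A search state is (i, j, ahead, over): i sets of t1 and j sets of t2
    have been consumed, and side `ahead` (1 or 2) has spelled the extra
    suffix `over` beyond the other side. `ahead` is 0 when `over` is empty.
    """
    n1, n2 = len(t1), len(t2)

    def norm(i, j, ahead, over):
        # Canonical form: no leader when nothing is pending.
        return (i, j, ahead if over else 0, over)

    def extend(over, s):
        """The lagging side appends s; return (leader_flips, new_over) or None."""
        if over.startswith(s):
            return False, over[len(s):]
        if s.startswith(over):
            return True, s[len(over):]
        return None  # mismatch: this choice cannot lead to a common string

    start, goal = (0, 0, 0, ""), (n1, n2, 0, "")
    seen, queue = {start}, deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return True
        i, j, ahead, over = state
        successors = []
        # t2 may consume its next set when t1 leads or the sides are level.
        if ahead in (0, 1) and j < n2:
            for s in t2[j]:
                step = extend(over, s)
                if step is not None:
                    flips, rest = step
                    successors.append(norm(i, j + 1, 2 if flips else 1, rest))
        # t1 may consume its next set when t2 leads or the sides are level.
        if ahead in (0, 2) and i < n1:
            for s in t1[i]:
                step = extend(over, s)
                if step is not None:
                    flips, rest = step
                    successors.append(norm(i + 1, j, 1 if flips else 2, rest))
        for nxt in successors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For instance, with `t1 = [{"a", "ab"}, {"c", "bc"}]` and `t2 = [{"ab"}, {"", "b"}, {"c"}]`, both languages contain `abc`, so `ed_intersect(t1, t2)` reports an intersection. On small inputs the result can be cross-checked against `language(t1) & language(t2)`.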
| Abstract | An elastic-degenerate (ED) string $T$ is a sequence of $n$ sets $T[1],\ldots,T[n]$ containing $m$ strings in total whose cumulative length is $N$. We call $n$, $m$, and $N$ the length, the cardinality, and the size of $T$, respectively. The language of $T$ is defined as $L(T)=\{S_1\cdots S_n : S_i\in T[i] \text{ for all } i\in[1,n]\}$. Given two ED strings, how fast can we check whether the two languages they represent have a nonempty intersection? We call this problem the ED String Intersection (EDSI) problem. For two ED strings $T_1$ and $T_2$ of lengths $n_1$ and $n_2$, cardinalities $m_1$ and $m_2$, and sizes $N_1$ and $N_2$, respectively, we show the following: • There is no $O((N_1N_2)^{1-\epsilon})$-time algorithm, for any $\epsilon>0$, for EDSI even if $T_1$ and $T_2$ are over a binary alphabet, unless the Strong Exponential-Time Hypothesis is false. • There is no combinatorial $O((N_1+N_2)^{1.2-\epsilon}f(n_1,n_2))$-time algorithm, for any $\epsilon>0$ and any function $f$, for EDSI even if $T_1$ and $T_2$ are over a binary alphabet, unless the Boolean Matrix Multiplication conjecture is false. • An $O(N_1\log N_1\log n_1+N_2\log N_2\log n_2)$-time algorithm for outputting a compact representation of the intersection language of two unary ED strings. When $T_1$ and $T_2$ are given in a compact representation, we show that the problem is NP-complete. • An $O(N_1m_2+N_2m_1)$-time algorithm for EDSI. • An $\tilde{O}(N_1^{\omega-1}n_2+N_2^{\omega-1}n_1)$-time algorithm for EDSI, where $\omega$ is the matrix multiplication exponent; the $\tilde{O}$ notation suppresses factors that are polylogarithmic in the input size. |
|---|---|
| ArticleNumber | 105296 |
| Author | Sweering, Michelle; Pissis, Solon P.; Radoszewski, Jakub; Zuba, Wiktor; Mwaniki, Moses Njagi; Gabory, Estéban; Pisanti, Nadia |
| Author_xml | – sequence: 1 givenname: Estéban surname: Gabory fullname: Gabory, Estéban email: esteban.gabory@cwi.nl organization: CWI, Amsterdam, the Netherlands – sequence: 2 givenname: Moses Njagi orcidid: 0000-0002-4858-2375 surname: Mwaniki fullname: Mwaniki, Moses Njagi email: njagi.mwaniki@di.unipi.it organization: University of Pisa, Pisa, Italy – sequence: 3 givenname: Nadia surname: Pisanti fullname: Pisanti, Nadia email: nadia.pisanti@unipi.it organization: University of Pisa, Pisa, Italy – sequence: 4 givenname: Solon P. orcidid: 0000-0002-1445-1932 surname: Pissis fullname: Pissis, Solon P. email: solon.pissis@cwi.nl organization: CWI, Amsterdam, the Netherlands – sequence: 5 givenname: Jakub orcidid: 0000-0002-0067-6401 surname: Radoszewski fullname: Radoszewski, Jakub email: jrad@mimuw.edu.pl organization: Institute of Informatics, University of Warsaw, Warsaw, Poland – sequence: 6 givenname: Michelle surname: Sweering fullname: Sweering, Michelle email: michelle.sweering@cwi.nl organization: CWI, Amsterdam, the Netherlands – sequence: 7 givenname: Wiktor orcidid: 0000-0002-1988-3507 surname: Zuba fullname: Zuba, Wiktor email: w.zuba@mimuw.edu.pl organization: CWI, Amsterdam, the Netherlands |
| ContentType | Journal Article |
| Copyright | 2025 The Author(s) |
| Copyright_xml | – notice: 2025 The Author(s) |
| DBID | 6I. AAFTH AAYXX CITATION |
| DOI | 10.1016/j.ic.2025.105296 |
| DatabaseName | ScienceDirect Open Access Titles; Elsevier:ScienceDirect:Open Access; CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering; Computer Science |
| ExternalDocumentID | 10_1016_j_ic_2025_105296 S089054012500032X |
| ISICitedReferencesCount | 2 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001458803200001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0890-5401 |
| IngestDate | Sat Nov 29 06:55:50 EST 2025 Tue Nov 18 21:49:25 EST 2025 Sat Jun 14 16:53:08 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Sequence comparison; Elastic-degenerate string; Pangenome; Languages intersection; Acronym identification |
| Language | English |
| License | This is an open access article under the CC BY license. |
| LinkModel | OpenURL |
| ORCID | 0000-0002-0067-6401 0000-0002-4858-2375 0000-0002-1988-3507 0000-0002-1445-1932 |
| OpenAccessLink | https://dx.doi.org/10.1016/j.ic.2025.105296 |
| ParticipantIDs | crossref_primary_10_1016_j_ic_2025_105296 crossref_citationtrail_10_1016_j_ic_2025_105296 elsevier_sciencedirect_doi_10_1016_j_ic_2025_105296 |
| PublicationCentury | 2000 |
| PublicationDate | May 2025 2025-05-00 |
| PublicationDateYYYYMMDD | 2025-05-01 |
| PublicationDate_xml | – month: 05 year: 2025 text: May 2025 |
| PublicationDecade | 2020 |
| PublicationTitle | Information and computation |
| PublicationYear | 2025 |
| Publisher | Elsevier Inc |
| Publisher_xml | – name: Elsevier Inc |
113 start-page: 21:1 year: 2018 ident: 10.1016/j.ic.2025.105296_br0310 article-title: Degenerate string comparison and applications – volume: 21 start-page: 81 issue: 1 year: 2022 ident: 10.1016/j.ic.2025.105296_br0160 article-title: Computational graph pangenomics: a tutorial on data structures and their applications publication-title: Nat. Comput. doi: 10.1007/s11047-022-09882-6 – start-page: 52 year: 2016 ident: 10.1016/j.ic.2025.105296_br0480 article-title: Unsupervised resolution of acronyms and abbreviations in nursing notes using document-level context models |
| StartPage | 105296 |
| SubjectTerms | Acronym identification; Elastic-degenerate string; Languages intersection; Pangenome; Sequence comparison |
| Title | Elastic-degenerate string comparison |
| URI | https://dx.doi.org/10.1016/j.ic.2025.105296 |
| Volume | 304 |