Fast exact string matching algorithms
String matching is the problem of finding all occurrences of a pattern in a text. We propose a very fast new family of string matching algorithms based on hashing q-grams. The new algorithms are the fastest in many cases, in particular on small alphabets.
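The abstract names the technique but does not show it, so here is a minimal sketch of how a q-gram hashing search can work. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name `qgram_hash_search`, the hash function, the default `q=3`, and the table size of 256 are all choices made for this example. The idea is to hash the last q characters of the current text window and look up, in a precomputed table, how far the window can safely be moved.

```python
def qgram_hash_search(pattern, text, q=3, table_size=256):
    """Return the start indices of all occurrences of pattern in text.

    Illustrative sketch only: the hash and the table size are assumptions
    made for this example, not values taken from the catalogued paper.
    """
    m, n = len(pattern), len(text)
    if q < 1 or m < q:
        raise ValueError("pattern must contain at least q characters")

    def qgram_hash(s, end):
        # Hash the q-gram s[end-q+1 .. end] (inclusive bounds).
        value = 0
        for ch in s[end - q + 1:end + 1]:
            value = (value * 2 + ord(ch)) % table_size
        return value

    # Default shift: a q-gram that never occurs in the pattern lets the
    # window jump completely past it.
    shift = [m - q + 1] * table_size
    # For q-grams of the pattern, store the distance from their end to the
    # end of the pattern. Later (rightmost) positions overwrite earlier
    # ones, so each slot keeps the smallest, and therefore safe, shift.
    for i in range(q - 1, m):
        shift[qgram_hash(pattern, i)] = m - 1 - i

    matches = []
    pos = m - 1  # index in text of the last character of the window
    while pos < n:
        s = shift[qgram_hash(text, pos)]
        if s == 0:
            # The window may end a match; hash collisions are possible,
            # so verify character by character.
            start = pos - m + 1
            if text[start:pos + 1] == pattern:
                matches.append(start)
            pos += 1  # conservative shift after a verification
        else:
            pos += s
    return matches


print(qgram_hash_search("GCAGAGAG", "GCATCGCAGAGAGTATACAGTACG"))  # [5]
```

Taking the smallest shift for each hash slot keeps the search exact even when different q-grams collide, at the cost of occasionally shorter jumps. Shifting by one position after a verification is the simplest safe choice; a tuned implementation would likely precompute a larger shift for that case.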
| Published in: | Information processing letters, Volume 102, Issue 6, pp. 229-235 |
|---|---|
| Main author: | Lecroq, Thierry |
| Format: | Journal Article |
| Language: | English |
| Published: | Amsterdam: Elsevier B.V, 15 June 2007; Elsevier Science; Elsevier Sequoia S.A; Elsevier |
| ISSN: | 0020-0190, 1872-6119 |
| Abstract | String matching is the problem of finding all occurrences of a pattern in a text. We propose a very fast new family of string matching algorithms based on hashing q-grams. The new algorithms are the fastest in many cases, in particular on small alphabets. |
|---|---|
| Author | Lecroq, Thierry |
| Author_xml | Lecroq, Thierry (sequence: 1); email: thierry.lecroq@univ-rouen.fr; organization: LITIS, Faculté des Sciences et des Techniques, Université de Rouen, 76821 Mont-Saint-Aignan Cedex, France |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=18702888 (record in Pascal Francis); https://hal.science/hal-00468876 (record in HAL) |
| CODEN | IFPLAT |
| Cites_doi | 10.1002/spe.4380211105 10.1147/rd.312.0249 10.1145/351827.384246 10.1021/ci030463z |
| ContentType | Journal Article |
| Copyright | 2007 Elsevier B.V.; 2007 INIST-CNRS; Copyright Elsevier Sequoia S.A. Jun 15, 2007; Distributed under a Creative Commons Attribution 4.0 International License |
| DOI | 10.1016/j.ipl.2007.01.002 |
| Discipline | Computer Science; Applied Sciences |
| EISSN | 1872-6119 |
| EndPage | 235 |
| Genre | Feature |
| ISICitedReferencesCount | 67 |
| ISSN | 0020-0190 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Keywords | Design of algorithms; Hashing; String matching; Computer theory; Information processing; Fast algorithm; Algorithm analysis |
| Language | English |
| License | https://www.elsevier.com/tdm/userlicense/1.0 CC BY 4.0 Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0 |
| ORCID | 0000-0002-1900-3397 |
| PageCount | 7 |
| PublicationDate | 2007-06-15 |
| PublicationPlace | Amsterdam |
| PublicationTitle | Information processing letters |
| PublicationYear | 2007 |
| Publisher | Elsevier B.V; Elsevier Science; Elsevier Sequoia S.A; Elsevier |
| StartPage | 229 |
| SubjectTerms | Algorithmics. Computability. Computer arithmetics / Algorithms / Applied sciences / Computer Science / Computer science; control theory; systems / Data processing. List processing. Character string processing / Data Structures and Algorithms / Design of algorithms / Exact sciences and technology / Hashing / Information processing / Memory organisation. Data processing / Software / String matching / Studies / Theoretical computing |
| Title | Fast exact string matching algorithms |
| URI | https://dx.doi.org/10.1016/j.ipl.2007.01.002 https://www.proquest.com/docview/237289713 https://hal.science/hal-00468876 |
| Volume | 102 |