Fast exact string matching algorithms

String matching is the problem of finding all occurrences of a pattern in a text. We propose a very fast new family of string matching algorithms based on hashing q-grams. The new algorithms are the fastest in many cases, in particular on small alphabets.
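
To make the idea of hashing q-grams concrete, here is a minimal sketch of a shift-table search that hashes the last q characters of each text window. It is an illustration only, not the article's algorithm: the function name `qgram_hash_search`, the hash function, the default `q = 3`, and the table size of 256 are all assumptions chosen for the example.

```python
def qgram_hash_search(pattern: str, text: str, q: int = 3,
                      table_size: int = 256) -> list[int]:
    """Return the start index of every occurrence of pattern in text.

    Illustrative sketch: hash function, q, and table_size are assumed
    values, not parameters taken from the catalogued article.
    """
    m, n = len(pattern), len(text)
    if not 1 <= q <= m:
        raise ValueError("q must satisfy 1 <= q <= len(pattern)")
    if n < m:
        return []

    def h(s: str, end: int) -> int:
        # Hash the q-gram s[end-q+1 .. end] into the range [0, table_size).
        value = 0
        for k in range(end - q + 1, end + 1):
            value = (value * 2 + ord(s[k])) % table_size
        return value

    # shift[v] = smallest distance from the end of the pattern to the end
    # of a pattern q-gram hashing to v; unseen hashes get the largest shift.
    shift = [m - q + 1] * table_size
    for i in range(q - 1, m - 1):
        shift[h(pattern, i)] = m - 1 - i  # later i overwrites with a smaller value

    last = h(pattern, m - 1)
    after_match = shift[last]  # safe shift once the last q-gram has been seen
    shift[last] = 0            # 0 means "window may match, verify it"

    hits = []
    j = m - 1  # index in text of the last character of the current window
    while j < n:
        s = shift[h(text, j)]
        if s == 0:
            start = j - m + 1
            if text[start:j + 1] == pattern:
                hits.append(start)
            j += after_match
        else:
            j += s
    return hits
```

For example, `qgram_hash_search("GATTACA", "ACGATTACAGATTACA")` returns `[2, 9]`. Hash collisions can only make a shift shorter or trigger an extra verification, so they cost time but never cause a missed occurrence.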

Detailed bibliography
Published in: Information processing letters, Volume 102, Issue 6, pp. 229-235
Main author: Lecroq, Thierry
Format: Journal Article
Language: English
Publication details: Amsterdam, Elsevier B.V, 15.06.2007
Elsevier Science
Elsevier Sequoia S.A
Elsevier
ISSN:0020-0190, 1872-6119
Author: Lecroq, Thierry
Email: thierry.lecroq@univ-rouen.fr
Affiliation: LITIS, Faculté des Sciences et des Techniques, Université de Rouen, 76821 Mont-Saint-Aignan Cedex, France
CODEN IFPLAT
ContentType Journal Article
Copyright 2007 Elsevier B.V.
2007 INIST-CNRS
Copyright Elsevier Sequoia S.A. Jun 15, 2007
Distributed under a Creative Commons Attribution 4.0 International License
DOI 10.1016/j.ipl.2007.01.002
Discipline Computer Science
Applied Sciences
EISSN 1872-6119
EndPage 235
Genre Feature
ISICitedReferencesCount 67
ISSN 0020-0190
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Design of algorithms
Hashing
String matching
Computer theory
Information processing
Fast algorithm
Algorithm analysis
Language English
License https://www.elsevier.com/tdm/userlicense/1.0
CC BY 4.0
Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
ORCID 0000-0002-1900-3397
PageCount 7
PublicationCentury 2000
PublicationDate 2007-06-15
PublicationDecade 2000
PublicationPlace Amsterdam
PublicationTitle Information processing letters
PublicationYear 2007
Publisher Elsevier B.V
Elsevier Science
Elsevier Sequoia S.A
Elsevier
StartPage 229
SubjectTerms Algorithmics. Computability. Computer arithmetics
Algorithms
Applied sciences
Computer Science
Computer science; control theory; systems
Data processing. List processing. Character string processing
Data Structures and Algorithms
Design of algorithms
Exact sciences and technology
Hashing
Information processing
Memory organisation. Data processing
Software
String matching
Studies
Theoretical computing
Title Fast exact string matching algorithms
URI https://dx.doi.org/10.1016/j.ipl.2007.01.002
https://www.proquest.com/docview/237289713
https://hal.science/hal-00468876
Volume 102