Efficient Exact Online String Matching Through Linked Weak Factors
| Title: | Efficient Exact Online String Matching Through Linked Weak Factors |
|---|---|
| Authors: | Matthew N. Palmer, Simone Faro, Stefano Scafiti |
| Contributors: | Matthew N. Palmer, Simone Faro, Stefano Scafiti |
| Publisher information: | Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2024. |
| Publication year: | 2024 |
| Subjects: | String matching, text processing, weak recognition, hashing, experimental algorithms, design and analysis of algorithms, ddc:004 |
| Description: | Online exact string matching is a fundamental computational problem in computer science, involving the sequential search for a pattern within a large text without prior access to the entire text. Its significance is underscored by its diverse applications in data compression, data mining, text editing, and bioinformatics, just to cite a few, where efficient substring matching is crucial. While the problem has been a subject of study for years, recent decades have witnessed a heightened focus on experimental solutions, employing various techniques to achieve superior performance. Notably, approaches centered around weak factor recognition have emerged as leaders in experimental settings, gaining increasing attention. This paper introduces Hash Chain, a novel algorithm founded on a robust weak factor recognition approach that links adjacent factors through hashing. Building upon the efficacy of weak recognition techniques, the proposed algorithm incorporates innovative strategies for organizing data structures and optimizations to enhance performance. Despite its quadratic worst-case time complexity, the new proposed algorithm demonstrates sublinear behavior in practice, outperforming currently known algorithms in the literature. |
| Document type: | Conference object; Article |
| File description: | application/pdf |
| Language: | English |
| DOI: | 10.4230/lipics.sea.2024.24 |
| Access URL: | https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2024.24 https://hdl.handle.net/20.500.11769/641810 https://doi.org/10.4230/lipics.sea.2024.24 |
| Rights: | CC BY |
| Accession number: | edsair.dedup.wf.002..6e09623261eddb40a7e4f85b8bdfa382 |
| Database: | OpenAIRE |