Pattern Matching in Hypertext

Bibliographic details
Published in: Journal of Algorithms, Volume 35, Issue 1, pp. 82-99
Main authors: Amir, Amihood; Lewenstein, Moshe; Lewenstein, Noa
Format: Journal Article
Language: English
Published: San Diego, CA: Elsevier Inc, 1 April 2000
ISSN: 0196-6774, 1090-2678
Abstract: The importance of hypertext has been steadily growing over the past decade. The Internet and other information systems use hypertext format, with data organized associatively rather than sequentially or relationally. A myriad of textual problems have been considered in the pattern matching field, with many nontrivial results. Nevertheless, surprisingly little work has been done on the natural combination of pattern matching and hypertext. In contrast to regular text, hypertext has a nonlinear structure, and the techniques of pattern matching for text cannot be directly applied to hypertext. Manber and Wu (1992, “IAPR Workshop on Structural and Syntactic Pattern Recognition, Bern, Switzerland”) pioneered the study of pattern matching in hypertext and defined a hypertext model for pattern matching. Akutsu (1993, “Proceedings of the 4th Symposium on Combinatorial Pattern Matching, Padova, Italy,” pp. 1–10) developed an algorithm that can be used for exact pattern matching in a tree-structured hypertext. Park and Kim (1995, “6th Symposium on Combinatorial Pattern Matching, Helsinki, Finland”) considered regular pattern matching in hypertext. They developed a complex algorithm that works for hypertext whose underlying structure is a DAG. In this paper we present a much simpler algorithm that achieves the same complexity and runs on any hypertext graph. We then extend the problem to approximate pattern matching in hypertext, first considering Hamming distance and then edit distance. We show that, in contrast to regular text, it does make a difference whether the errors occur in the hypertext or in the pattern. The approximate pattern matching problem in hypertext with errors in the hypertext turns out to be NP-complete, whereas the approximate pattern matching problem in hypertext with errors in the pattern has a polynomial time solution.
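To make the problem statement concrete, the following is a minimal Python sketch of matching a pattern against a hypertext graph. It is not the algorithm from the paper. It assumes a simplified model in which every node holds exactly one character, and it uses a plain layer-by-layer dynamic program over walks that takes time proportional to the pattern length times the graph size. The names `min_pattern_mismatches` and `has_exact_match` are hypothetical. Mismatches are counted on the pattern side (Hamming distance with errors in the pattern), which is the variant the abstract describes as polynomially solvable; a result of zero means the pattern occurs exactly.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Node = Hashable


def min_pattern_mismatches(
    labels: Dict[Node, str],
    edges: Iterable[Tuple[Node, Node]],
    pattern: str,
) -> float:
    """Fewest pattern characters that must be substituted so that the
    pattern is spelled by some walk in the graph.

    Assumption (not from the paper): each node carries one character.
    Returns float("inf") if the graph has no walk of len(pattern) nodes.
    """
    if not pattern:
        return 0
    inf = float("inf")

    # Predecessor lists let each layer be computed from the previous one.
    preds: Dict[Node, List[Node]] = {v: [] for v in labels}
    for u, v in edges:
        preds[v].append(u)

    # cost[v]: fewest mismatches for the pattern prefix read so far,
    # taken over all walks that end at node v.
    cost = {v: int(labels[v] != pattern[0]) for v in labels}
    for ch in pattern[1:]:
        nxt = {}
        for v in labels:
            best = min((cost[u] for u in preds[v]), default=inf)
            nxt[v] = best + int(labels[v] != ch)
        cost = nxt
    return min(cost.values(), default=inf)


def has_exact_match(labels, edges, pattern) -> bool:
    """Exact matching is the zero-mismatch special case."""
    return min_pattern_mismatches(labels, edges, pattern) == 0
```

For a three-node cycle labeled a, b, c with edges 0→1, 1→2, 2→0, the pattern "abca" matches exactly by going once around the cycle and revisiting node 0, while "abcb" needs one substitution in the pattern. The sketch works on cyclic graphs because each pattern position is charged independently. When errors are instead placed in the hypertext, a changed node affects every visit to it, which gives an intuition for why the abstract reports that variant as NP-complete.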
CODEN JOALDV
Copyright: 2000 Academic Press; 2000 INIST-CNRS
DOI 10.1006/jagm.1999.1063
Peer reviewed: yes
Scholarly: yes
Keywords: pattern matching; hypertext; design and analysis of algorithms; combinatorial algorithms on words; pattern matching on hypertext; Design; Hamming distance; Analysis; Combinatorial algorithm; Graph theory; Algorithm; Computational complexity; Polynomial time
License: https://www.elsevier.com/tdm/userlicense/1.0, CC BY 4.0
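References
Cormen, Leiserson, Rivest. Introduction to Algorithms. 1990.
U. Manber and S. Wu. Approximate string matching with arbitrary costs for text and hypertext. IAPR Workshop on Structural and Syntactic Pattern Recognition, Bern, Switzerland, 1992.
Boyer, Moore. A fast string searching algorithm. Comm. Assoc. Comput. Mach. 20 (1977), 762–772.
K. Park and D. K. Kim. String matching in hypertext. 6th Symposium on Combinatorial Pattern Matching, Helsinki, Finland, 1995.
Landau, Vishkin. Fast parallel and serial approximate string matching. J. Algorithms 10 (1989), 157–169. doi:10.1016/0196-6774(89)90010-2
Amir, Farach, Giancarlo, Galil, Park. Dynamic dictionary matching. J. Comput. System Sci. 49 (1994), 208–222. doi:10.1016/S0022-0000(05)80047-9
Fraenkel, Klein. Information Retrieval from Annotated Texts. 1995.
Knuth, Morris, Pratt. Fast pattern matching in strings. SIAM J. Comput. 6 (1977), 323–350. doi:10.1137/0206024
Abrahamson. Generalized string matching. SIAM J. Comput. 16 (1987), 1039–1051. doi:10.1137/0216067
T. Akutsu. A linear time pattern matching algorithm between a string and a tree. Proceedings of the 4th Symposium on Combinatorial Pattern Matching, Padova, Italy, 1993, pp. 1–10.
Amir, Farach, Idury, La Poutré, Schäffer. Improved dynamic dictionary matching. Inform. and Comput. 119 (1995), 258–282. doi:10.1006/inco.1995.1090
Aviad. HyperTalmud: A Hypertext System for the Babylonian Talmud and Its Commentaries. 1993.
Sahinalp, Vishkin. Efficient approximate and dynamic matching of patterns using a labeling paradigm. Proc. 36th FOCS, 1996.
S. Rao Kosaraju. Efficient string matching. Manuscript, 1987.
Ferragina, Grossi. Optimal on-line search and sublinear time update in string matching. Proc. 7th ACM-SIAM Symposium on Discrete Algorithms, 1995.
Fischer, Paterson. String matching and other products. Complexity of Computation, 1974, pp. 113–125.
Nielsen. Hypertext and Hypermedia. 1993.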
Subject terms: Applied sciences; combinatorial algorithms on words; design and analysis of algorithms; Exact sciences and technology; hypertext; Mathematical programming; Operational research and scientific management; Operational research. Management science; pattern matching; pattern matching on hypertext