MergedTrie: Efficient textual indexing

Published in: PLoS One, Vol. 14, no. 4, p. e0215288
Main Authors: Ferrández, Antonio; Peral, Jesús
Format: Journal Article
Language: English
Published: United States: Public Library of Science (PLoS), 23.04.2019
Subjects:
ISSN: 1932-6203
Abstract Accessing and processing textual information (i.e. storing and querying a set of strings) is important for many current applications (e.g. information retrieval and social networks), especially in the fields of Big Data and IoT, which require handling very large string dictionaries. Typical data structures for textual indexing are Hash Tables and some variants of Tries, such as the Double Trie (DT). In this paper, we propose an extension of the DT that we call MergedTrie. It improves the DT compression by merging both Tries into a single one and by segmenting the indexed term into two fixed-length parts in order to balance the new Trie. Thus, a higher overlap of both prefixes and suffixes is obtained. Moreover, we propose a new implementation of Tries that achieves better compression rates than the Double-Array representation usually chosen for implementing Tries. Our proposal also overcomes the limitation of static implementations, which do not allow insertions and updates in their compact representations. Finally, our MergedTrie implementation experimentally improves on the efficiency of Hash Tables, DTs, the Double-Array, the Crit-bit, Directed Acyclic Word Graphs (DAWG), and Acyclic Deterministic Finite Automata (ADFA), while requiring less space than the original text to be indexed.
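The idea summarized in the abstract (split each term into two parts and keep both parts in one shared Trie) can be illustrated with a small sketch. This is not the authors' implementation: the class names `Node` and `MergedTrieSketch`, the half-and-half split rule, the reversal of the second part, and the use of a Python set to link the two end nodes are all assumptions made only for illustration, and the sketch ignores the compact representation the paper proposes.

```python
class Node:
    """One trie node. `links` holds the end nodes it is paired with (assumed design)."""

    def __init__(self):
        self.children = {}
        self.links = set()


class MergedTrieSketch:
    """Toy illustration: both parts of every term live in a single trie."""

    def __init__(self):
        self.root = Node()

    @staticmethod
    def _split(term):
        # Assumption: cut the term in the middle; the second part is
        # reversed so that terms with a common ending share a path.
        mid = (len(term) + 1) // 2
        return term[:mid], term[mid:][::-1]

    def _walk(self, key, create):
        # Follow `key` from the root, optionally creating missing nodes.
        node = self.root
        for ch in key:
            nxt = node.children.get(ch)
            if nxt is None:
                if not create:
                    return None
                nxt = node.children[ch] = Node()
            node = nxt
        return node

    def insert(self, term):
        head, tail = self._split(term)
        first = self._walk(head, create=True)
        second = self._walk(tail, create=True)
        # A term is represented by the link between its two end nodes.
        first.links.add(second)

    def __contains__(self, term):
        head, tail = self._split(term)
        first = self._walk(head, create=False)
        second = self._walk(tail, create=False)
        return first is not None and second is not None and second in first.links
```

In this sketch, inserting "walking" and "talking" stores the shared ending "ing" only once (as the reversed path "gni"), which is the kind of suffix overlap the abstract refers to. The explicit link between end nodes prevents false positives when a first part and a second part both exist but were never inserted together.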
Audience Academic
Author Ferrández, Antonio
Peral, Jesús
AuthorAffiliation 1 GPLSI Research Group, Department of Software and Computing Systems, University of Alicante, Alicante, Spain
2 Lucentia Research Group, Department of Software and Computing Systems, University of Alicante, Alicante, Spain
Indian Institute of Technology Madras, INDIA
BackLink https://www.ncbi.nlm.nih.gov/pubmed/31013282$$D View this record in MEDLINE/PubMed
CitedBy_id crossref_primary_10_1371_journal_pone_0217958
crossref_primary_10_33693_2313_223X_2019_6_4_29_43
ContentType Journal Article
Copyright COPYRIGHT 2019 Public Library of Science
2019 Ferrández, Peral
DOI 10.1371/journal.pone.0215288
Discipline Sciences (General)
DocumentTitleAlternate MergedTrie: Efficient textual indexing
EISSN 1932-6203
ExternalDocumentID oai_doaj_org_article_2125e539e8fd4356b0a73584343ec51e
PMC6478299
A583328701
31013282
10_1371_journal_pone_0215288
Genre Research Support, Non-U.S. Gov't
Evaluation Study
Journal Article
GrantInformation_xml – fundername: ;
  grantid: TIN2015-65100-R
– fundername: ;
  grantid: TIN2015-63502-C3-3-R
ID FETCH-LOGICAL-c640t-a4e5280c91d98ae78aed505788bd0e44d47156d8e7c55c626a2c472e27bbc53e3
IEDL.DBID DOA
ISICitedReferencesCount 1
ISSN 1932-6203
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
License This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Competing Interests: The authors have declared that no competing interests exist.
OpenAccessLink https://doaj.org/article/2125e539e8fd4356b0a73584343ec51e
PMID 31013282
PublicationCentury 2000
PublicationDate 2019-04-23
PublicationDateYYYYMMDD 2019-04-23
PublicationDate_xml – month: 04
  year: 2019
  text: 2019-04-23
  day: 23
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Francisco, CA USA
PublicationTitle PloS one
PublicationTitleAlternate PLoS One
PublicationYear 2019
Publisher Public Library of Science
Public Library of Science (PLoS)
31150529 - PLoS One. 2019 May 31;14(5):e0217958
StartPage e0215288
SubjectTerms Abstracting and Indexing - methods
Algorithms
Analysis
Big Data
Data structures
Encyclopedias and dictionaries
Indexing (Content analysis)
Information storage and retrieval
Information Storage and Retrieval - methods
Natural language processing
Robots
Social Networking
Social networks
Title MergedTrie: Efficient textual indexing
URI https://www.ncbi.nlm.nih.gov/pubmed/31013282
https://www.proquest.com/docview/2213922987
https://pubmed.ncbi.nlm.nih.gov/PMC6478299
https://doaj.org/article/2125e539e8fd4356b0a73584343ec51e
Volume 14