MergedTrie: Efficient textual indexing
| Published in: | PloS one Vol. 14; no. 4; p. e0215288 |
|---|---|
| Main Authors: | Ferrández, Antonio; Peral, Jesús |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Public Library of Science (PLoS), 23.04.2019 |
| Subjects: | |
| ISSN: | 1932-6203 |
| Abstract | The accessing and processing of textual information (i.e. the storing and querying of a set of strings) is especially important for many current applications (e.g. information retrieval and social networks), particularly in the fields of Big Data and IoT, which require the handling of very large string dictionaries. Typical data structures for textual indexing are Hash Tables and some variants of Tries, such as the Double Trie (DT). In this paper, we propose an extension of the DT that we have called MergedTrie. It improves the DT compression by merging both Tries into a single one and by segmenting the indexed term into two fixed-length parts in order to balance the new Trie. Thus, a higher overlap of both prefixes and suffixes is obtained. Moreover, we propose a new implementation of Tries that achieves better compression rates than the Double-Array representation usually chosen for implementing Tries. Our proposal also overcomes the limitation of static implementations, which do not allow insertions and updates in their compact representations. Finally, our MergedTrie implementation experimentally improves on the efficiency of Hash Tables, DTs, the Double-Array, the Crit-bit, Directed Acyclic Word Graphs (DAWG), and Acyclic Deterministic Finite Automata (ADFA) data structures, while requiring less space than the original text to be indexed. |
|---|---|
| Audience | Academic |
| Author | Ferrández, Antonio; Peral, Jesús |
| AuthorAffiliation | 1 GPLSI Research Group, Department of Software and Computing Systems, University of Alicante, Alicante, Spain; 2 Lucentia Research Group, Department of Software and Computing Systems, University of Alicante, Alicante, Spain; Indian Institute of Technology Madras, INDIA |
| BackLink | https://www.ncbi.nlm.nih.gov/pubmed/31013282$$D View this record in MEDLINE/PubMed |
| CitedBy_id | crossref_primary_10_1371_journal_pone_0217958 crossref_primary_10_33693_2313_223X_2019_6_4_29_43 |
| ContentType | Journal Article |
| Copyright | COPYRIGHT 2019 Public Library of Science; 2019 Ferrández, Peral |
| DOI | 10.1371/journal.pone.0215288 |
| Discipline | Sciences (General) |
| DocumentTitleAlternate | MergedTrie: Efficient textual indexing |
| EISSN | 1932-6203 |
| ExternalDocumentID | oai_doaj_org_article_2125e539e8fd4356b0a73584343ec51e PMC6478299 A583328701 31013282 10_1371_journal_pone_0215288 |
| Genre | Research Support, Non-U.S. Gov't Evaluation Study Journal Article |
| GrantInformation_xml | – fundername: ; grantid: TIN2015-65100-R – fundername: ; grantid: TIN2015-63502-C3-3-R |
| ISICitedReferencesCount | 1 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000465223900013&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 1932-6203 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Language | English |
| License | This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
| Notes | Competing Interests: The authors have declared that no competing interests exist. |
| OpenAccessLink | https://doaj.org/article/2125e539e8fd4356b0a73584343ec51e |
| PMID | 31013282 |
| PQID | 2213922987 |
| PQPubID | 23479 |
| PageCount | e0215288 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-04-23 |
| PublicationDateYYYYMMDD | 2019-04-23 |
| PublicationDate_xml | – month: 04 year: 2019 text: 2019-04-23 day: 23 |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States – name: San Francisco, CA USA |
| PublicationTitle | PloS one |
| PublicationTitleAlternate | PLoS One |
| PublicationYear | 2019 |
| Publisher | Public Library of Science (PLoS) |
| StartPage | e0215288 |
| SubjectTerms | Abstracting and Indexing - methods; Algorithms; Analysis; Big Data; Data structures; Encyclopedias and dictionaries; Indexing (Content analysis); Information storage and retrieval; Information Storage and Retrieval - methods; Natural language processing; Robots; Social Networking; Social networks |
| Title | MergedTrie: Efficient textual indexing |
| URI | https://www.ncbi.nlm.nih.gov/pubmed/31013282 https://www.proquest.com/docview/2213922987 https://pubmed.ncbi.nlm.nih.gov/PMC6478299 https://doaj.org/article/2125e539e8fd4356b0a73584343ec51e |
| Volume | 14 |
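
The abstract describes the MergedTrie as a single Trie that holds both segments of each indexed term. As a rough illustration of that idea only, the following Python sketch stores the first segment of each term forward and the second segment reversed in one shared trie, and records each term as a pair of node ids. The class name `MergedTrieSketch`, the midpoint split, the reversal of the second segment, and the use of a plain set of pairs are all assumptions made for this example; the record does not give the authors' actual segmentation rule or their compact representation, so this should not be read as their implementation.

```python
class MergedTrieSketch:
    """Toy illustration: one shared trie plus a set of (head, tail) node pairs."""

    def __init__(self):
        # children[i] maps a character to the id of a child of node i; node 0 is the root.
        self.children = [{}]
        # Each indexed term is recorded as a pair of node ids in the shared trie.
        self.pairs = set()

    def _split(self, term):
        # Assumption: cut at the midpoint and reverse the second segment, so that
        # common word endings share a path just as common word beginnings do.
        mid = (len(term) + 1) // 2
        return term[:mid], term[mid:][::-1]

    def _walk(self, part, create):
        # Follow (and optionally create) the path for `part`; return the final node id,
        # or None if the path is missing and `create` is False.
        node = 0
        for ch in part:
            nxt = self.children[node].get(ch)
            if nxt is None:
                if not create:
                    return None
                nxt = len(self.children)
                self.children.append({})
                self.children[node][ch] = nxt
            node = nxt
        return node

    def insert(self, term):
        head, tail = self._split(term)
        self.pairs.add((self._walk(head, True), self._walk(tail, True)))

    def __contains__(self, term):
        head, tail = self._split(term)
        h = self._walk(head, False)
        t = self._walk(tail, False)
        return h is not None and t is not None and (h, t) in self.pairs
```

In this sketch, inserting "walking" and "talking" creates the reversed ending "gni" only once, which is the kind of suffix overlap the abstract refers to. The pair set is what keeps lookups exact: "talked" is rejected even though both of its segments have paths in the trie, because that particular pair was never inserted.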