A parameterizable enumeration algorithm for sequence mining
In this paper, we introduce a generic framework for the mining of sequences under various constraints. More precisely, we study the enumeration of all partitions of a word w into multisets of subsequences. We show that using additional predicates, this generator can be used for frequent subsequence...
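To make the enumerated objects concrete, the following is a minimal brute-force sketch, not the algorithm of the paper. It assumes that a "partition of w into a multiset of subsequences" means assigning every position of w to exactly one block and reading each block in order, so that w is an interleaving of the resulting words. The names `set_partitions`, `multiset_partitions` and `has_repeated_word` are hypothetical and chosen only for illustration; the predicate is a simplified stand-in for a frequency constraint.

```python
from collections import Counter
from typing import Callable, Iterator, List, Sequence, Tuple

Multiset = Tuple[str, ...]  # a multiset of words, stored as a sorted tuple
Predicate = Callable[[Multiset], bool]


def set_partitions(n: int) -> Iterator[List[List[int]]]:
    """Yield every partition of the positions 0..n-1 into non-empty blocks."""
    blocks: List[List[int]] = []

    def extend(i: int) -> Iterator[List[List[int]]]:
        if i == n:
            yield [list(b) for b in blocks]  # copy, since blocks is mutated
            return
        for k in range(len(blocks)):  # put position i into an existing block
            blocks[k].append(i)
            yield from extend(i + 1)
            blocks[k].pop()
        blocks.append([i])  # or open a new block containing only position i
        yield from extend(i + 1)
        blocks.pop()

    yield from extend(0)


def multiset_partitions(w: str, predicates: Sequence[Predicate] = ()) -> List[Multiset]:
    """Return each distinct multiset of subsequences of w that satisfies all predicates."""
    seen = set()
    for blocks in set_partitions(len(w)):
        # Positions in a block are increasing, so each word is a subsequence of w.
        m = tuple(sorted("".join(w[i] for i in b) for b in blocks))
        if m not in seen and all(p(m) for p in predicates):
            seen.add(m)  # the set removes multisets reached from several partitions
    return sorted(seen)


def has_repeated_word(m: Multiset) -> bool:
    """Illustrative predicate: some word occurs at least twice in the multiset."""
    return any(count >= 2 for count in Counter(m).values())
```

For the word `aab`, `multiset_partitions("aab")` returns four multisets, namely `('a', 'a', 'b')`, `('a', 'ab')`, `('aa', 'b')` and `('aab',)`; adding `has_repeated_word` as a predicate keeps only the first. This sketch visits every set partition and filters duplicates afterwards, which is exponential in the length of w; the abstract states that the paper instead relies on a transition graph with a covering tree to avoid redundancy.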
| Published in: | Theoretical Computer Science, Volume 468, pp. 59-68 |
|---|---|
| Main authors: | David, J.; Nourine, L. |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V., 14 January 2013 |
| Subjects: | Computer Science; Data Structures and Algorithms; Discrete Mathematics |
| ISSN: | 0304-3975, 1879-2294 |
| Abstract | In this paper, we introduce a generic framework for the mining of sequences under various constraints. More precisely, we study the enumeration of all partitions of a word w into multisets of subsequences. We show that, using additional predicates, this generator can be used for frequent subsequence and substring mining. We define the transition graph T_w, whose vertices are multisets of words and whose arcs are transitions between multisets. We show that T_w is a directed acyclic graph and that it admits a covering tree. We use T_w to propose a generic algorithm that enumerates, without redundancy, all multisets that satisfy a set of predicates. |
|---|---|
| Author | David, J.; Nourine, L. |
| Author_xml | – sequence: 1 givenname: J. surname: David fullname: David, J. email: Julien.David@lipn.univ-paris13.fr – sequence: 2 givenname: L. surname: Nourine fullname: Nourine, L. |
| BackLink | https://hal.science/hal-01765525 (View record in HAL) |
| ContentType | Journal Article |
| Copyright | 2012 Elsevier B.V. Distributed under a Creative Commons Attribution 4.0 International License |
| DOI | 10.1016/j.tcs.2012.11.005 |
| Discipline | Mathematics Computer Science |
| EISSN | 1879-2294 |
| EndPage | 68 |
| ISSN | 0304-3975 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | http://www.elsevier.com/open-access/userlicense/1.0 https://www.elsevier.com/tdm/userlicense/1.0 https://www.elsevier.com/open-access/userlicense/1.0 Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0 |
| OpenAccessLink | https://dx.doi.org/10.1016/j.tcs.2012.11.005 |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2013-01-14 |
| PublicationTitle | Theoretical computer science |
| PublicationYear | 2013 |
| Publisher | Elsevier B.V Elsevier |
| StartPage | 59 |
| SubjectTerms | Computer Science; Data Structures and Algorithms; Discrete Mathematics |
| Title | A parameterizable enumeration algorithm for sequence mining |
| URI | https://dx.doi.org/10.1016/j.tcs.2012.11.005 https://hal.science/hal-01765525 |
| Volume | 468 |