Parallel and Distributed Algorithms for Frequent Pattern Mining in Large Databases
Mining frequent patterns (FP) from large-scale databases has emerged as an important problem in the data mining and knowledge discovery research community. A significant number of parallel and distributed FP mining algorithms have been proposed, when the database is large and/or distributed. Among t...
| Published in: | Technical review - IETE Volume 26; Issue 1; pp. 55 - 66 |
|---|---|
| Main authors: | Tanbeer, Syed Khairuzzaman; Ahmed, Chowdhury Farhan; Jeong, Byeong-Soo |
| Format: | Journal Article |
| Language: | English |
| Published: | New Delhi; Taylor & Francis; 01.01.2009; Taylor & Francis Ltd |
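| Subjects: | Algorithms; Computer engineering; Data mining; Frequent patterns; Knowledge discovery; Large-scale databases; Parallel and distributed processing; Studies; Tree restructuring; Trees |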
| ISSN: | 0256-4602, 0974-5971 |
| Online access: | Get full text |
| Abstract | Mining frequent patterns (FP) from large-scale databases has emerged as an important problem in the data mining and knowledge discovery research community. A significant number of parallel and distributed FP mining algorithms have been proposed for cases where the database is large and/or distributed. Among them, parallelization of the FP-growth algorithm using the FP-tree has proved more efficient than the Apriori-based approaches. However, the FP-tree based techniques suffer from two major limitations: the need for multiple database scans (i.e., high I/O cost) and huge communication overhead. Therefore, in this paper, we propose a novel tree structure, called the PP-tree (Parallel Pattern tree), that significantly reduces the I/O cost by capturing the database contents with a single scan and facilitates efficient FP-growth mining on it. Our parallel algorithm works independently at each local site and merges the locally generated global frequent patterns at the final stage, thereby reducing inter-processor communication overhead and achieving a high degree of parallelism. An extensive experimental study on datasets of different types shows that parallel and distributed FP mining with our PP-tree is highly efficient on large databases. |
|---|---|
| Author | Jeong, Byeong-Soo; Ahmed, Chowdhury Farhan; Tanbeer, Syed Khairuzzaman |
| Author_xml | – sequence: 1 givenname: Syed Khairuzzaman surname: Tanbeer fullname: Tanbeer, Syed Khairuzzaman organization: Department of Computer Engineering, Kyung Hee University – sequence: 2 givenname: Chowdhury Farhan surname: Ahmed fullname: Ahmed, Chowdhury Farhan organization: Department of Computer Engineering, Kyung Hee University – sequence: 3 givenname: Byeong-Soo surname: Jeong fullname: Jeong, Byeong-Soo organization: Department of Computer Engineering, Kyung Hee University |
| CitedBy_id | crossref_primary_10_1109_TBDATA_2017_2731838 crossref_primary_10_4103_0256_4602_90761 crossref_primary_10_1007_s40747_018_0085_9 crossref_primary_10_1109_TPDS_2014_2377713 crossref_primary_10_1155_2018_2818251 crossref_primary_10_1007_s00521_012_0943_0 crossref_primary_10_1016_j_procs_2014_05_012 crossref_primary_10_1109_ACCESS_2020_2974035 crossref_primary_10_1080_02533839_2018_1454853 |
| ContentType | Journal Article |
| Copyright | Copyright © 2009 by the IETE 2009 Copyright Medknow Publications & Media Pvt Ltd Jan 2009 |
| Copyright_xml | – notice: Copyright © 2009 by the IETE 2009 – notice: Copyright Medknow Publications & Media Pvt Ltd Jan 2009 |
| DBID | AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D |
| DOI | 10.4103/0256-4602.48469 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts |
| Discipline | Engineering |
| EISSN | 0974-5971 |
| EndPage | 66 |
| ExternalDocumentID | 2297490651 10_4103_0256_4602_48469 10876687 |
| Genre | Article |
| ISICitedReferencesCount | 12 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000263990400008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0256-4602 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | 1/1/2009 2009-00-00 20090101 |
| PublicationDateYYYYMMDD | 2009-01-01 |
| PublicationDate_xml | – month: 01 year: 2009 text: 1/1/2009 day: 01 |
| PublicationDecade | 2000 |
| PublicationPlace | New Delhi |
| PublicationPlace_xml | – name: New Delhi |
| PublicationTitle | Technical review - IETE |
| PublicationYear | 2009 |
| Publisher | Taylor & Francis Taylor & Francis Ltd |
| Publisher_xml | – name: Taylor & Francis – name: Taylor & Francis Ltd |
| SourceID | proquest crossref informaworld |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 55 |
| SubjectTerms | Algorithms; Computer engineering; Data mining; Frequent patterns; Knowledge discovery; Large-scale databases; Parallel and distributed processing; Studies; Tree restructuring; Trees |
| Title | Parallel and Distributed Algorithms for Frequent Pattern Mining in Large Databases |
| URI | https://www.tandfonline.com/doi/abs/10.4103/0256-4602.48469 https://www.proquest.com/docview/857833763 https://www.proquest.com/docview/34911632 |
| Volume | 26 |
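
The abstract describes a workflow in which each site scans its own partition once, works independently, and results are combined only at the final stage. The following is a minimal Python sketch of that communication pattern only. It is not the PP-tree or FP-growth: it swaps in brute-force itemset enumeration, and it merges raw local counts rather than the locally generated patterns the abstract mentions. The function names `local_counts` and `merge_and_filter`, the `max_len` cap, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, max_len=3):
    """Hypothetical per-site step: one pass over the local partition,
    counting every itemset of size 1..max_len (brute force, not a PP-tree)."""
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order, no duplicates
        for k in range(1, min(max_len, len(items)) + 1):
            counts.update(combinations(items, k))
    return counts


def merge_and_filter(site_counts, min_support):
    """Hypothetical final stage: sum the per-site counts once and keep
    the itemsets whose global count reaches min_support."""
    total = Counter()
    for counts in site_counts:
        total.update(counts)
    return {pattern: n for pattern, n in total.items() if n >= min_support}


# Toy data: two "sites", each holding a partition of the transactions.
sites = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]

# Each site is processed independently; the only exchange is the final merge.
frequent = merge_and_filter([local_counts(s) for s in sites], min_support=3)
print(sorted(frequent.items()))
```

In this sketch the sites never exchange data until the single merge, which is the property the abstract credits with reducing inter-processor communication. A tree-based structure would replace the exhaustive enumeration, which grows quickly with transaction length.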