Parallel and Distributed Algorithms for Frequent Pattern Mining in Large Databases

Mining frequent patterns (FP) from large-scale databases has emerged as an important problem in the data mining and knowledge discovery research community. A significant number of parallel and distributed FP mining algorithms have been proposed, when the database is large and/or distributed. Among t...

Bibliographic Details
Published in: Technical review - IETE, Volume 26, Issue 1, pp. 55-66
Main authors: Tanbeer, Syed Khairuzzaman; Ahmed, Chowdhury Farhan; Jeong, Byeong-Soo
Format: Journal Article
Language: English
Published: New Delhi, Taylor & Francis, 01.01.2009
Taylor & Francis Ltd
Subjects:
ISSN: 0256-4602, 0974-5971
Abstract Mining frequent patterns (FP) from large-scale databases has emerged as an important problem in the data mining and knowledge discovery research community. Many parallel and distributed FP mining algorithms have been proposed for databases that are large and/or distributed. Among them, parallelization of the FP-growth algorithm using the FP-tree has proved more efficient than the Apriori-based approaches. However, the FP-tree based techniques suffer from two major limitations: the need for multiple database scans (i.e., high I/O cost) and a large communication overhead. Therefore, in this paper, we propose a novel tree structure, called the PP-tree (Parallel Pattern tree), that significantly reduces the I/O cost by capturing the database contents with a single scan and facilitates efficient FP-growth mining on it. Our parallel algorithm works independently at each local site and merges the locally generated global frequent patterns at the final stage, thereby reducing inter-processor communication overhead and achieving a high degree of parallelism. An extensive experimental study on datasets of different types shows that parallel and distributed FP mining with our PP-tree is highly efficient on large databases.
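The abstract names two ideas: capturing each site's data in a tree with a single scan, and merging per-site results only at the final stage. The sketch below illustrates both ideas on a toy scale and is not the PP-tree or the FP-growth procedure from the paper, whose construction the abstract does not give. It assumes a plain prefix tree with items in lexicographic order and a brute-force count of itemsets along each tree path. All names (`Node`, `build_tree`, `local_supports`, `merge_and_filter`) and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


class Node:
    """One prefix-tree node: an item, a count, and children keyed by item."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_tree(transactions):
    """Build a prefix tree in a single pass over one site's transactions."""
    root = Node()
    for transaction in transactions:
        node = root
        # Assumption: lexicographic item order, which needs no prior
        # frequency-counting scan of the database.
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


def local_supports(root):
    """Count every itemset's local support from the tree alone (no rescan)."""
    supports = Counter()
    stack = [(root, ())]
    while stack:
        node, path = stack.pop()
        # Number of transactions that end exactly at this node.
        ending = node.count - sum(c.count for c in node.children.values())
        if node.item is not None and ending > 0:
            # Brute-force enumeration; exponential, for toy inputs only.
            for size in range(1, len(path) + 1):
                for itemset in combinations(path, size):
                    supports[itemset] += ending
        for child in node.children.values():
            stack.append((child, path + (child.item,)))
    return supports


def merge_and_filter(per_site_supports, min_support):
    """Final stage: sum the per-site counts, keep globally frequent itemsets."""
    total = Counter()
    for supports in per_site_supports:
        total.update(supports)
    return {s: n for s, n in total.items() if n >= min_support}


# Hypothetical data: two sites, each holding part of the database.
site_a = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
site_b = [["a", "c"], ["a", "b", "c"], ["b"]]

# Each site works independently; only the counts are combined at the end.
per_site = [local_supports(build_tree(site)) for site in (site_a, site_b)]
frequent = merge_and_filter(per_site, min_support=3)
```

In this sketch each site reports unpruned counts, so the merged supports are exact; a practical algorithm would prune locally and use a proper mining procedure such as FP-growth in place of the subset enumeration.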
Author Jeong, Byeong-Soo
Ahmed, Chowdhury Farhan
Tanbeer, Syed Khairuzzaman
Author_xml – sequence: 1
  givenname: Syed Khairuzzaman
  surname: Tanbeer
  fullname: Tanbeer, Syed Khairuzzaman
  organization: Department of Computer Engineering, Kyung Hee University
– sequence: 2
  givenname: Chowdhury Farhan
  surname: Ahmed
  fullname: Ahmed, Chowdhury Farhan
  organization: Department of Computer Engineering, Kyung Hee University
– sequence: 3
  givenname: Byeong-Soo
  surname: Jeong
  fullname: Jeong, Byeong-Soo
  organization: Department of Computer Engineering, Kyung Hee University
CitedBy_id crossref_primary_10_1109_TBDATA_2017_2731838
crossref_primary_10_4103_0256_4602_90761
crossref_primary_10_1007_s40747_018_0085_9
crossref_primary_10_1109_TPDS_2014_2377713
crossref_primary_10_1155_2018_2818251
crossref_primary_10_1007_s00521_012_0943_0
crossref_primary_10_1016_j_procs_2014_05_012
crossref_primary_10_1109_ACCESS_2020_2974035
crossref_primary_10_1080_02533839_2018_1454853
ContentType Journal Article
Copyright Copyright © 2009 by the IETE 2009
Copyright Medknow Publications & Media Pvt Ltd Jan 2009
Copyright_xml – notice: Copyright © 2009 by the IETE 2009
– notice: Copyright Medknow Publications & Media Pvt Ltd Jan 2009
DBID AAYXX
CITATION
7SC
8FD
JQ2
L7M
L~C
L~D
DOI 10.4103/0256-4602.48469
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Computer and Information Systems Abstracts
Technology Research Database
Computer and Information Systems Abstracts – Academic
Advanced Technologies Database with Aerospace
ProQuest Computer Science Collection
Computer and Information Systems Abstracts Professional
DatabaseTitleList

Computer and Information Systems Abstracts
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 0974-5971
EndPage 66
ExternalDocumentID 2297490651
10_4103_0256_4602_48469
10876687
Genre Article
ISICitedReferencesCount 12
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000263990400008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0256-4602
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
PQID 857833763
PQPubID 226518
PageCount 12
PublicationCentury 2000
PublicationDate 1/1/2009
2009-00-00
20090101
PublicationDateYYYYMMDD 2009-01-01
PublicationDate_xml – month: 01
  year: 2009
  text: 1/1/2009
  day: 01
PublicationDecade 2000
PublicationPlace New Delhi
PublicationPlace_xml – name: New Delhi
PublicationTitle Technical review - IETE
PublicationYear 2009
Publisher Taylor & Francis
Taylor & Francis Ltd
Publisher_xml – name: Taylor & Francis
– name: Taylor & Francis Ltd
SSID ssj0064251
Score 1.8450254
SourceID proquest
crossref
informaworld
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 55
SubjectTerms Algorithms
Computer engineering
Data mining
Frequent patterns
Knowledge discovery
Large-scale databases
Parallel and distributed processing
Studies
Tree restructuring
Trees
Title Parallel and Distributed Algorithms for Frequent Pattern Mining in Large Databases
URI https://www.tandfonline.com/doi/abs/10.4103/0256-4602.48469
https://www.proquest.com/docview/857833763
https://www.proquest.com/docview/34911632
Volume 26
WOSCitedRecordID wos000263990400008&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAWR
  databaseName: Taylor and Francis Online Journals
  customDbUrl:
  eissn: 0974-5971
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0064251
  issn: 0256-4602
  databaseCode: TFW
  dateStart: 19840101
  isFulltext: true
  titleUrlDefault: https://www.tandfonline.com
  providerName: Taylor & Francis