Implementation of Association Rule Mining Algorithms on Distributed Data Processing Platforms

Association rule mining is a frequently used data mining technique. It aims to find the items that occur frequently in the data. Today's large-scale data processing and analysis platforms are not focused on data mining, so they do not offer large-scale libraries for association rule...

Bibliographic details
Published in: 2019 4th International Conference on Computer Science and Engineering (UBMK), pp. 79-84
Main authors: Sesver, Duygu; Tuna, Sabah; Aktas, Mehmet S.; Kalipsiz, Oya; Kanli, Alper Nebi; Turgut, Umut Orcun
Format: Conference proceeding
Language: English, Turkish
Published: IEEE, 1 September 2019
Online access: Full text
Abstract Association rule mining is a frequently used data mining technique. It aims to find the items that occur frequently in the data. Today's large-scale data processing and analysis platforms are not focused on data mining, so they do not offer large-scale libraries for association rule mining algorithms. In this research, a library of association rule mining algorithms has been developed on a large-scale data processing platform. Apache Spark was chosen for the case study because it is widely used. The algorithms have been implemented on this platform so as to benefit from the MapReduce programming model. In this context, the Apriori, Eclat and Pascal algorithms are implemented for the large-scale data platform. The library produced by the proposed implementation method is comparatively analyzed in terms of performance metrics on big data processing platforms with single and multiple nodes. The implemented methods are also compared with the FP-Growth algorithm implementation provided by the Spark platform. The results show that, when tested on large-scale data, the Apriori algorithm gives much better performance values than the other algorithms when switching from a single-node cluster environment to a multi-node cluster environment.
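As a rough illustration of how Apriori lends itself to the map and reduce phases mentioned in the abstract, the sketch below counts candidate itemsets level by level. It is not the authors' library and does not use Spark: plain Python lists stand in for a distributed dataset, and the names `map_phase`, `reduce_phase` and `apriori` are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def map_phase(transaction, candidates):
    """Emit (itemset, 1) for every candidate contained in the transaction."""
    items = set(transaction)
    return [(c, 1) for c in candidates if c <= items]


def reduce_phase(pairs):
    """Sum the emitted counts per itemset."""
    counts = Counter()
    for itemset, n in pairs:
        counts[itemset] += n
    return counts


def apriori(transactions, min_support):
    """Return {frozenset: support count} for all frequent itemsets."""
    # Level-1 candidates: every distinct item as a singleton itemset.
    candidates = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while candidates:
        # On a cluster, the map step would run per partition in parallel.
        pairs = [p for t in transactions for p in map_phase(t, candidates)]
        counts = reduce_phase(pairs)
        level = {s: n for s, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep candidates whose (k-1)-subsets are all frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

Calling `apriori(transactions, min_support)` with a list of item lists and an absolute support threshold returns every frequent itemset with its count. In a distributed setting, only the counting pass needs to touch the full data, which is the part a MapReduce-style platform parallelizes.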
Author Sesver, Duygu
Aktas, Mehmet S.
Kanli, Alper Nebi
Tuna, Sabah
Kalipsiz, Oya
Turgut, Umut Orcun
Author_xml – sequence: 1
  givenname: Duygu
  surname: Sesver
  fullname: Sesver, Duygu
  organization: Department of Computer Engineering, Yıldız Technical University, Istanbul
– sequence: 2
  givenname: Sabah
  surname: Tuna
  fullname: Tuna, Sabah
  organization: Department of Computer Engineering, Yıldız Technical University, Istanbul
– sequence: 3
  givenname: Mehmet S.
  surname: Aktas
  fullname: Aktas, Mehmet S.
  organization: Department of Computer Engineering, Yıldız Technical University, Istanbul
– sequence: 4
  givenname: Oya
  surname: Kalipsiz
  fullname: Kalipsiz, Oya
  organization: Department of Computer Engineering, Yıldız Technical University, Istanbul
– sequence: 5
  givenname: Alper Nebi
  surname: Kanli
  fullname: Kanli, Alper Nebi
  organization: R&D Center, Cybersoft, Istanbul
– sequence: 6
  givenname: Umut Orcun
  surname: Turgut
  fullname: Turgut, Umut Orcun
  organization: R&D Center, Cybersoft, Istanbul
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/UBMK.2019.8907040
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1728139643
1728139635
9781728139630
9781728139647
EndPage 84
ExternalDocumentID 8907040
Genre orig-research
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i133t-bd0af256994ebdc511af77a0cf25e54bc201dd7607018de27b822c7dc39f07d33
IEDL.DBID RIE
ISICitedReferencesCount 1
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000609879900016&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:11:55 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
Turkish
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i133t-bd0af256994ebdc511af77a0cf25e54bc201dd7607018de27b822c7dc39f07d33
PageCount 6
ParticipantIDs ieee_primary_8907040
PublicationCentury 2000
PublicationDate 2019-Sept.
PublicationDateYYYYMMDD 2019-09-01
PublicationDate_xml – month: 09
  year: 2019
  text: 2019-Sept.
PublicationDecade 2010
PublicationTitle 2019 4th International Conference on Computer Science and Engineering (UBMK)
PublicationTitleAbbrev UBMK
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
Score 1.70504
SourceID ieee
SourceType Publisher
StartPage 79
SubjectTerms Apriori Algorithm
Association Rule Mining Algorithms
Big Data Processing and Analysis Platforms
Eclat Algorithm
Pascal Algorithm
Spark Platform
Title Implementation of Association Rule Mining Algorithms on Distributed Data Processing Platforms
URI https://ieeexplore.ieee.org/document/8907040
WOSCitedRecordID wos000609879900016&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE