Implementation of Association Rule Mining Algorithms on Distributed Data Processing Platforms
Association rule mining is a frequently used data mining technique that aims to find items that occur frequently in the data. Current large-scale data processing and analysis platforms are not focused on data mining, so they do not offer large-scale libraries for association rule...
| Published in: | 2019 4th International Conference on Computer Science and Engineering (UBMK) pp. 79 - 84 |
|---|---|
| Main Authors: | Sesver, Duygu; Tuna, Sabah; Aktas, Mehmet S.; Kalipsiz, Oya; Kanli, Alper Nebi; Turgut, Umut Orcun |
| Format: | Conference Proceeding |
| Language: | English, Turkish |
| Published: | IEEE, 01.09.2019 |
| Abstract | Association rule mining is a frequently used data mining technique that aims to find items that occur frequently in the data. Current large-scale data processing and analysis platforms are not focused on data mining, so they do not offer large-scale libraries for association rule mining algorithms. In this research, a library of association rule mining algorithms was developed for a large-scale data processing platform. Apache Spark was chosen for the case study because of its widespread use. The algorithms were implemented on this platform so as to benefit from the Map-Reduce programming model. In this context, the Apriori, Eclat, and Pascal algorithms were implemented for the large-scale data platform. The library produced by the suggested implementation method is comparatively analyzed in terms of performance metrics on big data processing platforms with single and multiple nodes. The implemented methods are also compared with the performance of the FpGrowth algorithm provided by the Spark platform. The results show that, when tested on large-scale data, the Apriori algorithm gives much better performance values than the other algorithms when switching from a single-node cluster environment to a multi-node cluster environment. |
|---|---|
| Author | Sesver, Duygu; Aktas, Mehmet S.; Kanli, Alper Nebi; Tuna, Sabah; Kalipsiz, Oya; Turgut, Umut Orcun |
| Author_xml | 1. Sesver, Duygu (Department of Computer Engineering, Yıldız Teknik Üniversitesi, Istanbul); 2. Tuna, Sabah (Department of Computer Engineering, Yıldız Teknik Üniversitesi, Istanbul); 3. Aktas, Mehmet S. (Department of Computer Engineering, Yıldız Teknik Üniversitesi, Istanbul); 4. Kalipsiz, Oya (Department of Computer Engineering, Yıldız Teknik Üniversitesi, Istanbul); 5. Kanli, Alper Nebi (R&D Center, Cybersoft, Istanbul); 6. Turgut, Umut Orcun (R&D Center, Cybersoft, Istanbul) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/UBMK.2019.8907040 |
| EISBN | 1728139643 1728139635 9781728139630 9781728139647 |
| EndPage | 84 |
| Genre | orig-research |
| ISICitedReferencesCount | 1 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English Turkish |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-Sept. |
| PublicationDateYYYYMMDD | 2019-09-01 |
| PublicationDate_xml | – month: 09 year: 2019 text: 2019-Sept. |
| PublicationDecade | 2010 |
| PublicationTitle | 2019 4th International Conference on Computer Science and Engineering (UBMK) |
| PublicationTitleAbbrev | UBMK |
| PublicationYear | 2019 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 79 |
| SubjectTerms | Apriori Algorithm; Association Rule Mining Algorithms; Big Data Processing and Analysis Platforms; Eclat Algorithm; Pascal Algorithm; Spark Platform |
| Title | Implementation of Association Rule Mining Algorithms on Distributed Data Processing Platforms |
| URI | https://ieeexplore.ieee.org/document/8907040 |
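
The abstract describes implementing Apriori so that it benefits from the Map-Reduce programming model. The following is a minimal sketch of that general idea, not the authors' library and not Spark code: it is plain Python in which each inner list of `partitions` stands in for the data held on one worker node, a map step counts candidate itemsets per partition, and a reduce step merges the partial counts. All function names, the toy transactions, and the `min_support` value are assumptions made for illustration.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_partition(partition, candidates):
    """Map step: count, within one partition, how often each candidate occurs."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:  # candidate is a subset of the transaction
                counts[candidate] += 1
    return counts


def reduce_counts(left, right):
    """Reduce step: merge the partial counts produced by two partitions."""
    return left + right


def generate_candidates(frequent, k):
    """Build k-itemsets whose (k-1)-subsets are all frequent (Apriori pruning)."""
    items = sorted({item for itemset in frequent for item in itemset})
    candidates = set()
    for combo in combinations(items, k):
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1)):
            candidates.add(frozenset(combo))
    return candidates


def apriori(partitions, min_support):
    """Return {itemset: support count} for itemsets with count >= min_support."""
    # Level-1 candidates: every single item seen in any partition.
    candidates = {frozenset([item]) for part in partitions for t in part for item in t}
    result = {}
    k = 1
    while candidates:
        partial = [map_partition(part, candidates) for part in partitions]
        totals = reduce(reduce_counts, partial, Counter())
        frequent = {s: c for s, c in totals.items() if c >= min_support}
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


# Hypothetical toy data: two partitions stand in for two worker nodes.
partitions = [
    [frozenset({"bread", "milk"}), frozenset({"bread", "butter", "milk"})],
    [frozenset({"butter", "milk"}), frozenset({"bread", "butter"})],
]
frequent_itemsets = apriori(partitions, min_support=2)
```

In this sketch each level of the search needs one full map and reduce pass over the data, which is the part a distributed platform would spread across nodes. How the paper's library actually partitions the work on Spark is not stated in this record.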