A New Data Stream Mining Algorithm for Interestingness-Rich Association Rules

Frequent itemset mining and association rule generation is a challenging task in data stream. Even though, various algorithms have been proposed to solve the issue, it has been found out that only frequency does not decides the significance interestingness of the mined itemset and hence the associat...

Full description

Saved in:
Bibliographic Details
Published in:The Journal of computer information systems Vol. 53; no. 3; pp. 14 - 27
Main Author: Kuthadi, Venu Madhav
Format: Journal Article
Language:English
Published: Stillwater Taylor & Francis 01.03.2013
Taylor & Francis Ltd
Subjects:
ISSN:0887-4417, 2380-2057
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Frequent itemset mining and association rule generation is a challenging task in data stream. Even though, various algorithms have been proposed to solve the issue, it has been found out that only frequency does not decides the significance interestingness of the mined itemset and hence the association rules. This accelerates the algorithms to mine the association rules based on utility i.e. proficiency of the mined rules. However, fewer algorithms exist in the literature to deal with the utility as most of them deals with reducing the complexity in frequent itemset/association rules mining algorithm. Also, those few algorithms consider only the overall utility of the association rules and not the consistency of the rules throughout a defined number of periods. To solve this issue, in this paper, an enhanced association rule mining algorithm is proposed. The algorithm introduces new weightage validation in the conventional association rule mining algorithms to validate the utility and its consistency in the mined association rules. The utility is validated by the integrated calculation of the cost/price efficiency of the itemsets and its frequency. The consistency validation is performed at every defined number of windows using the probability distribution function, assuming that the weights are normally distributed. Hence, validated and the obtained rules are frequent and utility efficient and their interestingness are distributed throughout the entire time period. The algorithm is implemented and the resultant rules are compared against the rules that can be obtained from conventional mining algorithms.
Bibliography:SourceType-Scholarly Journals-1
ObjectType-Feature-1
content type line 14
ISSN:0887-4417
2380-2057
DOI:10.1080/08874417.2013.11645628