Parallel Algorithm Model for Knowledge Reduction Using MapReduce

Bibliographic Details
Published in: Jisuanji Kexue yu Tansuo / Journal of Computer Science and Frontiers, Vol. 7, No. 1, pp. 35-45
Main Authors: Qian, Jin; Miao, Duoqian; Zhang, Zehua; Zhang, Zhifei
Format: Journal Article
Language: Chinese
Published: 01.01.2013
ISSN: 1673-9418
Description
Summary: Knowledge reduction for massive datasets has attracted much research interest in rough set theory. Classical knowledge reduction algorithms assume that the entire dataset can be loaded into the main memory of a single machine, an assumption that is infeasible for large-scale data. This paper first analyzes which computations in classical knowledge reduction algorithms can be parallelized. Then, to compute the equivalence classes and attribute significance over different candidate attribute sets, it designs and implements Map and Reduce functions that exploit both data and task parallelism. Finally, it constructs a parallel algorithm framework model for knowledge reduction using MapReduce, which can compute a reduct for algorithms based on the positive region, the discernibility matrix, or information entropy. Experimental results demonstrate that the proposed parallel knowledge reduction algorithms can efficiently process massive datasets on the Hadoop platform.
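
The abstract only names the Map and Reduce functions, so the following is a minimal sketch (not the authors' code) of one plausible Hadoop job for the equivalence-class step it describes: the Mapper keys each record by its values on a candidate attribute set B, and the Reducer keeps only decision-consistent classes, whose summed sizes give the positive region cardinality used for attribute significance. The CSV input layout, the hard-coded column indices in ATTR_INDICES, and all class names are illustrative assumptions.

// Sketch of the equivalence-class / positive-region Map-Reduce pair,
// assuming CSV records whose last column is the decision attribute.
import java.io.IOException;
import java.util.StringJoiner;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class EquivalenceClassJob {

    // Hypothetical candidate attribute set B, given as column indices.
    static final int[] ATTR_INDICES = {0, 2, 3};

    public static class EqClassMapper
            extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void map(LongWritable offset, Text record, Context ctx)
                throws IOException, InterruptedException {
            String[] cols = record.toString().split(",");
            // Key: the record's values on B; records sharing a key
            // form one equivalence class of the indiscernibility relation.
            StringJoiner eqKey = new StringJoiner("|");
            for (int i : ATTR_INDICES) eqKey.add(cols[i]);
            // Value: the decision attribute (last column).
            ctx.write(new Text(eqKey.toString()),
                      new Text(cols[cols.length - 1]));
        }
    }

    public static class PosRegionReducer
            extends Reducer<Text, Text, Text, IntWritable> {
        @Override
        protected void reduce(Text eqKey, Iterable<Text> decisions, Context ctx)
                throws IOException, InterruptedException {
            int size = 0;
            String first = null;
            boolean consistent = true;
            for (Text d : decisions) {
                if (first == null) first = d.toString();
                else if (!first.equals(d.toString())) consistent = false;
                size++;
            }
            // A decision-consistent class lies wholly in the positive
            // region POS_B(D); summing the emitted sizes yields |POS_B(D)|,
            // from which attribute significance can be computed.
            if (consistent) ctx.write(eqKey, new IntWritable(size));
        }
    }
}

A driver would presumably launch one such job per candidate attribute set evaluated in a reduction round; per the abstract, the same framework also accommodates the discernibility-matrix and information-entropy variants, which would change what the Reducer aggregates rather than the keying scheme.
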
DOI: 10.3778/j.issn.1673-9418.1206048