Parallel Algorithm Model for Knowledge Reduction Using MapReduce

Bibliographic Details
Published in: Jisuanji Kexue yu Tansuo / Journal of Computer Science and Frontiers, Vol. 7, No. 1, pp. 35-45
Main Authors: Qian, Jin; Miao, Duoqian; Zhang, Zehua; Zhang, Zhifei
Format: Journal Article
Language: Chinese
Published: 01.01.2013
ISSN: 1673-9418
Description
Summary: Knowledge reduction for massive datasets has attracted much research interest in rough set theory. Classical knowledge reduction algorithms assume that the entire dataset can be loaded into the main memory of a single machine, which is infeasible for large-scale data. This paper first analyzes which computations in classical knowledge reduction algorithms can be performed in parallel. Then, to compute the equivalence classes and attribute significance over different candidate attribute sets, it designs and implements Map and Reduce functions that exploit both data and task parallelism. Finally, it constructs a parallel algorithm framework model for knowledge reduction using MapReduce, which can be used to compute a reduct for algorithms based on the positive region, the discernibility matrix, or information entropy. Experimental results demonstrate that the proposed parallel knowledge reduction algorithms can efficiently process massive datasets on the Hadoop platform.
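The abstract describes, but does not reproduce, the Map and Reduce functions for computing equivalence classes. As a minimal sketch of the underlying idea only, the Python fragment below keys each record by its values on a candidate attribute set and sizes the positive region from the resulting equivalence classes; the record layout, the field names (a1, a2, and d for the decision attribute), and the in-memory simulation of the shuffle phase are illustrative assumptions, not the paper's implementation.

    from collections import defaultdict

    def map_phase(records, attrs):
        # Map: key each record by its values on the candidate attribute
        # set, so records in the same equivalence class share a key.
        for rec in records:
            yield tuple(rec[a] for a in attrs), rec["d"]

    def reduce_phase(pairs):
        # Reduce: group by key to form equivalence classes; a class whose
        # decision values all agree lies entirely in the positive region.
        classes = defaultdict(list)
        for key, d in pairs:
            classes[key].append(d)
        pos_size = sum(len(ds) for ds in classes.values()
                       if len(set(ds)) == 1)
        return classes, pos_size

    records = [
        {"a1": 0, "a2": 1, "d": "yes"},
        {"a1": 0, "a2": 1, "d": "yes"},
        {"a1": 1, "a2": 0, "d": "no"},
    ]
    classes, pos_size = reduce_phase(map_phase(records, ["a1", "a2"]))
    print(len(classes), pos_size)  # 2 equivalence classes; positive region size 3

On Hadoop, the keying step would run inside a Mapper and the grouping inside a Reducer, and attribute significance could then be estimated by comparing positive-region sizes with and without each candidate attribute; this is consistent with, though not confirmed by, the data and task parallelism the abstract mentions.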
DOI: 10.3778/j.issn.1673-9418.1206048