Parallel online sequential extreme learning machine based on MapReduce

Bibliographic details
Published in: Neurocomputing (Amsterdam), Vol. 149, pp. 224-232
Main authors: Wang, Botao; Huang, Shan; Qiu, Junhao; Liu, Yu; Wang, Guoren
Format: Journal Article
Language: English
Published: Elsevier B.V., 3 February 2015
ISSN: 0925-2312, 1872-8286
Online access: Full text
Description
Abstract: In the age of big data, analyzing data at scale is a challenging problem. MapReduce is a simple, scalable, and fault-tolerant data processing framework that makes it possible to process massive volumes of data. Many machine learning algorithms have been designed on top of MapReduce, but only a few works address a parallel extreme learning machine (ELM), which is a fast and accurate learning algorithm. Online sequential extreme learning machine (OS-ELM) is one of the improved ELM algorithms and supports online sequential learning efficiently. In this paper, we first analyze the dependency relationships among the matrix calculations of OS-ELM, then propose a parallel online sequential extreme learning machine (POS-ELM) based on MapReduce. POS-ELM is evaluated on real and synthetic data with up to 1280K training records and up to 128 attributes. The experimental results show that the training and testing accuracy of POS-ELM are at the same level as those of OS-ELM and ELM, and that POS-ELM scales well with both the number of training records and the number of attributes. Whereas the original ELM and OS-ELM can only process as much data as the resources of a single processing unit allow, POS-ELM can handle much larger data sets. The larger the training set, the higher the speedup of POS-ELM. It can be concluded that POS-ELM is more capable than both ELM and OS-ELM for large-scale learning.
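The abstract does not spell out the POS-ELM algorithm, so the following is only a minimal sketch of the general idea that makes ELM-style training friendly to MapReduce, not the authors' method. It assumes the standard least-squares form of the ELM output weights, beta = (H^T H)^-1 H^T T, and shows that the two matrix products can be computed as per-chunk partial sums (a "map" step) that are then added together (a "reduce" step). All function names, the sigmoid activation, the small ridge term, and the synthetic data are assumptions made for illustration.

```python
from functools import reduce

import numpy as np


def hidden_layer(X, W, b):
    """Sigmoid hidden-layer output H for inputs X (n x d)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def map_chunk(X, T, W, b):
    """'Map' step: partial sums H^T H and H^T T for one data chunk."""
    H = hidden_layer(X, W, b)
    return H.T @ H, H.T @ T


def reduce_partials(left, right):
    """'Reduce' step: add two pairs of partial sums element-wise."""
    return left[0] + right[0], left[1] + right[1]


def train_chunked(chunks, W, b, ridge=1e-3):
    """Output weights from chunk-wise partial sums."""
    U, V = reduce(reduce_partials, (map_chunk(X, T, W, b) for X, T in chunks))
    # The ridge term is an assumption; it keeps U well conditioned.
    return np.linalg.solve(U + ridge * np.eye(U.shape[0]), V)


def train_batch(X, T, W, b, ridge=1e-3):
    """Reference: the same solution computed on the full data at once."""
    H = hidden_layer(X, W, b)
    return np.linalg.solve(H.T @ H + ridge * np.eye(H.shape[1]), H.T @ T)


def demo(seed=0, n=400, d=4, hidden=10, n_chunks=8):
    """Train on synthetic data both ways and return both weight matrices."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    T = np.sin(X.sum(axis=1, keepdims=True))
    W = rng.normal(size=(d, hidden))  # random, fixed input weights
    b = rng.normal(size=hidden)       # random, fixed hidden biases
    chunks = list(zip(np.array_split(X, n_chunks), np.array_split(T, n_chunks)))
    return train_chunked(chunks, W, b), train_batch(X, T, W, b)
```

Calling `demo()` returns the chunk-wise and the batch solutions, which should agree up to floating-point rounding. Because each chunk contributes only a `hidden x hidden` and a `hidden x outputs` matrix, the partial sums stay small regardless of how many training records a chunk holds, which is what would let the map step run on separate workers.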
DOI: 10.1016/j.neucom.2014.03.076