Novel Maximum-Margin Training Algorithms for Supervised Neural Networks

Bibliographic details
Published in:	IEEE Transactions on Neural Networks, Vol. 21, No. 6, pp. 972-984
Main authors:	Ludwig, Oswaldo; Nunes, Urbano
Format:	Journal Article
Language:	English
Published:	New York, NY: IEEE (Institute of Electrical and Electronics Engineers), 1 June 2010
Subjects:	Algorithms | Applied sciences | Artificial intelligence | Back propagation | Backpropagation algorithms | Complexity | Computer science; control theory; systems | Computer Simulation | Connectionism. Neural networks | Constraint optimization | Data processing. List processing. Character string processing | Exact sciences and technology | Feedback | Humans | Hyperplanes | Information Theory | Interference | Kernel | Learning - physiology | Mathematical models | maximal-margin (MM) principle | Memory organisation. Data processing | multilayer perceptron (MLP) | Multilayer perceptrons | Neural networks | Neural Networks (Computer) | Optimization methods | Pattern recognition | Pattern Recognition, Automated - methods | ROC Curve | Software | Supervised learning | Support vector machines | Testing | Training | Adaptive algorithm | Gradient descent | Backpropagation algorithm | Space complexity | Vector support machine | Learning algorithm | Mathematical programming | Backpropagation | Discriminant analysis | Statistical analysis | Minimization | Neural network | Computational complexity | Hyperplane | Fisher information | Constrained optimization | Receiver operating characteristic curves | Objective function | Time complexity
ISSN:	1045-9227, 1941-0093

Description
Summary:	This paper proposes three novel training methods for multilayer perceptron (MLP) binary classifiers, two of them based on the backpropagation approach and a third one based on information theory. Both backpropagation methods are based on the maximal-margin (MM) principle. The first one, based on the gradient descent with adaptive learning rate algorithm (GDX) and named maximum-margin GDX (MMGDX), directly increases the margin of the MLP output-layer hyperplane. The proposed method jointly optimizes both MLP layers in a single process, backpropagating the gradient of an MM-based objective function through the output and hidden layers in order to create a hidden-layer space that enables a higher margin for the output-layer hyperplane. This avoids testing many arbitrary kernels, as occurs in support vector machine (SVM) training. The proposed MM-based objective function aims to stretch the margin to its limit. An objective function based on the Lp-norm is also proposed in order to take the idea of support vectors into account while avoiding the complexity of solving a constrained optimization problem, as is usual in SVM training. In fact, all the training methods proposed in this paper have time and space complexities of O(N), while usual SVM training methods have time complexity O(N^3) and space complexity O(N^2), where N is the training-data-set size. The second approach, named minimization of interclass interference (MICI), has an objective function inspired by Fisher discriminant analysis. This algorithm aims to create an MLP hidden-layer output in which the patterns have a desirable statistical distribution. In both training methods, the maximum area under the ROC curve (AUC) is applied as the stopping criterion. The third approach offers a robust training framework able to take the best of each proposed training method. The main idea is to compose a neural model from neurons extracted from three other neural networks, trained beforehand by MICI, MMGDX, and Levenberg-Marquardt (LM), respectively. The resulting neural network is named the assembled neural network (ASNN). Benchmark data sets of real-world problems were used in experiments that enable a comparison with other state-of-the-art classifiers. The results provide evidence of the effectiveness of the proposed methods regarding accuracy, AUC, and balanced error rate.
DOI:	10.1109/TNN.2010.2046423
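
Illustrative note: the summary states that the maximum area under the ROC curve (AUC) serves as the stopping criterion for training. The following is a minimal Python sketch of that general idea, not the authors' implementation. The function names `auc` and `best_epoch_by_auc`, the `patience` parameter, and the pairwise AUC computation are all assumptions introduced here for illustration; the record does not say how the paper computes AUC or decides when to stop.

```python
def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pairwise statistic.

    scores: classifier outputs, higher means "more positive".
    labels: 1 for the positive class, 0 for the negative class.
    Ties between a positive and a negative score count as half a win.
    Note: this pairwise loop is quadratic and is written for clarity only.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def best_epoch_by_auc(auc_history, patience=3):
    """Pick the epoch with the highest AUC seen before training stops.

    auc_history: AUC measured after each epoch (hypothetical input).
    patience: assumed number of epochs without improvement to tolerate
              before stopping; this rule is an illustration, not taken
              from the paper.
    Returns (best_epoch_index, best_auc).
    """
    best_epoch, best_auc = 0, auc_history[0]
    for epoch, value in enumerate(auc_history):
        if value > best_auc:
            best_epoch, best_auc = epoch, value
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop here
    return best_epoch, best_auc
```

In use, one would evaluate `auc` on held-out data after each epoch, append the value to a history list, and keep the weights from the epoch that `best_epoch_by_auc` reports.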