Randomized Gauss–Seidel iterative algorithms for Extreme Learning Machines

Bibliographic Details
Published in: Physica A, Vol. 666, Art. no. 130515
Main authors: Subramani, Chinnamuthu; Jagannath, Ravi Prasad K.; Kuppili, Venkatanareshbabu
Format: Journal Article
Language: English
Published: Elsevier B.V., 15 May 2025
ISSN: 0378-4371
Online access: Full text
Description
Abstract: Extreme Learning Machines (ELMs) are a class of single hidden-layer feedforward neural networks known for their rapid training process, structural simplicity, and strong generalization capabilities. ELM training requires solving a system of linear equations, where solution accuracy directly impacts model performance. However, conventional ELMs rely on the Moore–Penrose inverse, which is computationally expensive, memory-intensive, and numerically unstable in ill-conditioned problems. Additionally, stabilizing matrix inversion requires a hyperparameter, whose optimal selection further increases computational complexity. Iterative numerical techniques offer a promising alternative; however, the stochastic nature of the feature matrix challenges deterministic methods, while stochastic gradient approaches are hyperparameter-sensitive and prone to local minima. To address these limitations, this study introduces randomized iterative algorithms that solve the original linear system without requiring matrix inversion or full-system computation, instead leveraging random subsets of data in a hyperparameter-free framework. Although these methods incorporate randomness, they are not arbitrary but remain system-dependent, dynamically adapting to the structure of the feature matrix. Theoretical analysis establishes upper bounds on the expected number of iterations, expressed in terms of statistical properties of the feature matrix, providing insights into near-singularity, condition number, and network size. Empirical evaluations on classification datasets demonstrate that the proposed methods consistently outperform conventional ELM, deterministic solvers, and gradient descent-based methods in accuracy, efficiency, and robustness. Statistical validation using Friedman's rank test and Wilcoxon post-hoc analysis confirms the superior performance and reliability of these randomized algorithms, establishing them as a computationally efficient and numerically stable alternative to existing approaches.
DOI: 10.1016/j.physa.2025.130515
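
The paper's own algorithms and analysis are not reproduced in this record, but as a rough illustration of the approach the abstract describes, the sketch below implements a generic randomized Gauss–Seidel (randomized coordinate-descent) solver for the ELM output-weight system Hβ = T, sampling columns of the hidden-layer matrix in proportion to their squared norms so that the randomness is system-dependent rather than arbitrary. All names, the sampling rule, and the toy data are assumptions for illustration, not details taken from the article.

    import numpy as np

    def randomized_gauss_seidel(H, T, n_iters=20000, seed=0):
        # Minimal sketch: randomized Gauss-Seidel (randomized coordinate
        # descent) for min_beta ||H @ beta - T||_F^2, avoiding any explicit
        # (pseudo-)inverse of H. Columns are sampled with probability
        # proportional to ||H[:, j]||^2, so the randomness adapts to the
        # structure of the feature matrix rather than being arbitrary.
        rng = np.random.default_rng(seed)
        n_hidden = H.shape[1]
        beta = np.zeros((n_hidden, T.shape[1]))
        residual = T.copy()                       # maintains T - H @ beta
        col_norms = (H ** 2).sum(axis=0)          # squared column norms
        probs = col_norms / col_norms.sum()
        for _ in range(n_iters):
            j = rng.choice(n_hidden, p=probs)     # system-dependent sampling
            # exact minimization of the residual along coordinate j
            delta = H[:, j] @ residual / col_norms[j]
            beta[j] += delta
            residual -= np.outer(H[:, j], delta)  # keep residual in sync
        return beta

    # Toy usage on synthetic data with a random hidden layer (hypothetical
    # setup, not the paper's benchmarks): compare against the least-squares
    # solution that a Moore-Penrose-based ELM would compute.
    rng = np.random.default_rng(1)
    X, T = rng.standard_normal((200, 10)), rng.standard_normal((200, 3))
    W, b = rng.standard_normal((10, 50)), rng.standard_normal(50)
    H = np.tanh(X @ W + b)                        # hidden-layer output matrix
    beta = randomized_gauss_seidel(H, T)
    beta_pinv = np.linalg.lstsq(H, T, rcond=None)[0]
    print(np.linalg.norm(beta - beta_pinv))       # gap shrinks as n_iters grows

Note that the update never forms H.T @ H or inverts anything: each step touches one column of H, which is what makes this family of methods attractive when the feature matrix is large or ill-conditioned.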