Large-scale robust regression with truncated loss via majorization-minimization algorithm
| Published in: | European Journal of Operational Research, Vol. 319, No. 2, pp. 494–504 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.12.2024 |
| Subjects: | |
| ISSN: | 0377-2217 |
| Summary: | Regression methods employing truncated loss functions are widely praised for their robustness to outliers and for representing the solution sparsely in terms of the samples. However, owing to the non-convexity of the truncated loss, commonly used algorithms such as the difference-of-convex algorithm (DCA) fail to maintain sparsity when dealing with non-convex loss functions, and adapting DCA for efficient optimization incurs additional development costs. To address these challenges, we propose a novel approach called truncated loss regression via the majorization-minimization algorithm (TLRM). TLRM employs a surrogate function to approximate the original truncated loss regression and offers several desirable properties: (i) it eliminates outliers before the training process and encapsulates general convex loss regression within its structure as iterative subproblems; (ii) it solves the convex loss problem iteratively, which makes a well-established toolbox for convex optimization directly applicable; (iii) it converges to a truncated loss regression and provides a solution with sample sparsity. Extensive experiments demonstrate that TLRM achieves superior sparsity without sacrificing robustness, and it can be several tens of thousands of times faster than traditional DCA on large-scale problems. Moreover, TLRM scales to datasets with millions of samples, making it a practical choice for real-world scenarios. The codebase for methods with truncated loss functions is accessible at https://i-do-lab.github.io/optimal-group.org/Resources/Code/TLRM.html. |
|---|---|
| DOI: | 10.1016/j.ejor.2024.04.028 |

Highlights:
- Propose an algorithmic framework (TLRM) for general truncated loss regression.
- Eliminate outliers before the training process to ensure sparsity.
- Unveil the intrinsic connection between convex loss and truncated loss.
- Enhance efficiency and scalability with well-established convex algorithms.
- Experiments verify that TLRM excels in sparsity, scalability, efficiency, and reliability.
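To make the majorization-minimization idea described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' published code. It assumes a truncated squared loss, sum_i min((y_i - x_i' w)^2, tau): because min(u, tau) is concave in u, a linear majorizer at the current iterate assigns weight 0 to samples whose loss exceeds tau (outliers are dropped before the fit) and weight 1 to the rest, so each MM step reduces to an ordinary convex regression on the retained samples. The function name `tlrm_fit`, the ridge term `lam`, and the stopping rule are illustrative assumptions.

```python
import numpy as np

def tlrm_fit(X, y, tau, max_iter=100, lam=1e-6):
    """Illustrative MM sketch for regression with a truncated squared loss
    sum_i min((y_i - x_i @ w)**2, tau); not the authors' published code.

    Each MM step linearizes the concave truncation min(u, tau) at the
    current residuals: samples whose loss exceeds tau contribute only a
    constant to the surrogate (they are eliminated before training),
    while the remaining samples keep the convex squared loss, yielding
    a convex subproblem solved here in closed form.
    """
    n, d = X.shape
    w = np.zeros(d)
    inliers = np.ones(n, dtype=bool)
    for _ in range(max_iter):
        losses = (y - X @ w) ** 2
        inliers = losses <= tau           # samples retained by the surrogate
        Xi, yi = X[inliers], y[inliers]
        # Ridge-regularized least-squares subproblem on the inliers only.
        w_new = np.linalg.solve(Xi.T @ Xi + lam * np.eye(d), Xi.T @ yi)
        if np.allclose(w_new, w):         # inlier set has stabilized
            break
        w = w_new
    return w, inliers
```

The closed-form ridge step stands in for any convex solver; swapping in, say, a Huber or hinge-loss solver for the subproblem would follow the same pattern, which is the point of treating convex loss regression as the iterative subproblem.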