Predictive Subgroup Logistic Regression for Classification with Unobserved Heterogeneity

Bibliographic Details
Published in: Statistics and Computing, Vol. 35, no. 6
Main Authors: Chen, Kun; Huang, Rui; Tong, Zhiwei
Format: Journal Article
Language: English
Published: New York, Springer US, 01.12.2025
Springer Nature B.V.
ISSN: 0960-3174, 1573-1375
Description
Summary: Unobserved heterogeneity is variation among subjects that the observed features in a model do not account for, and it poses a substantial challenge to statistical modeling. This study introduces the Predictive Subgroup Logistic Regression (PSLR) model, which extends conventional logistic regression to address unobserved heterogeneity in classification problems. The PSLR model incorporates subject-specific intercepts in the log odds and is fitted by a penalized likelihood approach with a concave pairwise fusion penalty. A novel two-step procedure enables out-of-sample prediction for new subjects whose subgroup membership labels are unknown, so the PSLR model can perform both inferential and predictive tasks. In extensive simulation studies and an empirical application to a customer churn dataset from the telecommunications industry, the PSLR model performs strongly on various aggregate accuracy metrics and achieves balanced sensitivity and specificity.
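Illustrative note: the kind of objective the summary describes (a logistic log-likelihood with subject-specific intercepts plus a concave pairwise fusion penalty on those intercepts) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The summary does not name the penalty, so the minimax concave penalty (MCP) is assumed here as one common concave choice, and the names `mcp`, `pslr_objective`, `lam`, and `gamma` are hypothetical. The two-step prediction procedure is not sketched because the summary gives no details of it.

```python
import math
from itertools import combinations


def mcp(t, lam, gamma):
    """Minimax concave penalty (MCP); an assumed choice of concave penalty."""
    t = abs(t)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large gaps are not shrunk.
    return 0.5 * gamma * lam * lam


def pslr_objective(mu, beta, X, y, lam, gamma=3.0):
    """Negative logistic log-likelihood plus a pairwise fusion penalty.

    mu   : subject-specific intercepts, one per subject
    beta : shared slope coefficients
    X    : list of feature rows, y : list of 0/1 labels
    """
    nll = 0.0
    for mu_i, x_i, y_i in zip(mu, X, y):
        # Log odds for subject i: own intercept plus shared linear term.
        eta = mu_i + sum(b * x for b, x in zip(beta, x_i))
        # Numerically stable log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        nll += log1pexp - y_i * eta
    # Penalizing |mu_i - mu_j| pulls intercepts together, so subjects whose
    # intercepts fuse to a common value form a subgroup.
    penalty = sum(mcp(a - b, lam, gamma) for a, b in combinations(mu, 2))
    return nll + penalty
```

With all intercepts equal the penalty term is zero and the objective reduces to the ordinary logistic negative log-likelihood. A full fit would minimize this objective over `mu` and `beta`, which the sketch does not attempt.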
DOI: 10.1007/s11222-025-10712-9