Predictive Subgroup Logistic Regression for Classification with Unobserved Heterogeneity
Unobserved heterogeneity refers to the variation among subjects that is not accounted for by the observed features used in a model. Its presence poses a substantial challenge to statistical modeling. This study introduces the Predictive Subgroup Logistic Regression (PSLR) model, which extends the co...
Saved in:
| Published in: | Statistics and computing Vol. 35; no. 6 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
New York
Springer US
01.12.2025
Springer Nature B.V |
| Subjects: | |
| ISSN: | 0960-3174, 1573-1375 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Unobserved heterogeneity refers to the variation among subjects that is not accounted for by the observed features used in a model. Its presence poses a substantial challenge to statistical modeling. This study introduces the Predictive Subgroup Logistic Regression (PSLR) model, which extends the conventional logistic regression and is specifically designed to address unobserved heterogeneity in classification problems. The PSLR model incorporates subject-specific intercepts in the log odds, fitted through a penalized likelihood approach with a concave pairwise fusion penalty. A novel two-step procedure is developed to facilitate the out-of-sample predictions for new subjects whose subgroup membership labels are unknown. This procedure allows the PSLR model to perform both inferential and predictive tasks. Through extensive simulation studies and an empirical application to a customer churn dataset in the telecommunications industry, the PSLR model not only demonstrates great performance in various aggregate accuracy metrics but also achieves a balanced effectiveness in sensitivity and specificity. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 0960-3174 1573-1375 |
| DOI: | 10.1007/s11222-025-10712-9 |