Predictive Subgroup Logistic Regression for Classification with Unobserved Heterogeneity

Detailed bibliography
Published in: Statistics and Computing, Volume 35, Issue 6
Main authors: Chen, Kun; Huang, Rui; Tong, Zhiwei
Format: Journal Article
Language: English
Publication details: New York, Springer US, 2025-12-01
Springer Nature B.V
ISSN: 0960-3174, 1573-1375
Abstract Unobserved heterogeneity refers to the variation among subjects that is not accounted for by the observed features used in a model. Its presence poses a substantial challenge to statistical modeling. This study introduces the Predictive Subgroup Logistic Regression (PSLR) model, which extends the conventional logistic regression and is specifically designed to address unobserved heterogeneity in classification problems. The PSLR model incorporates subject-specific intercepts in the log odds, fitted through a penalized likelihood approach with a concave pairwise fusion penalty. A novel two-step procedure is developed to facilitate the out-of-sample predictions for new subjects whose subgroup membership labels are unknown. This procedure allows the PSLR model to perform both inferential and predictive tasks. Through extensive simulation studies and an empirical application to a customer churn dataset in the telecommunications industry, the PSLR model not only demonstrates great performance in various aggregate accuracy metrics but also achieves a balanced effectiveness in sensitivity and specificity.
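The objective described in the abstract can be sketched numerically. What follows is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the log odds take the form mu_i + x_i'beta with one intercept mu_i per subject, and it uses the minimax concave penalty (MCP) as one common concave choice. The record does not say which concave penalty or tuning values the paper uses, and the ADMM solver named in the keywords is not implemented here. The names `mcp`, `pslr_objective`, `lam`, and `gamma` are hypothetical.

```python
import numpy as np

def mcp(t, lam, gamma):
    """Minimax concave penalty on |t| (an assumed choice of concave penalty)."""
    t = np.abs(t)
    # Quadratic taper up to gamma*lam, then flat at gamma*lam^2/2.
    return np.where(t <= gamma * lam,
                    lam * t - t**2 / (2 * gamma),
                    gamma * lam**2 / 2)

def pslr_objective(mu, beta, X, y, lam, gamma=3.0):
    """Logistic negative log-likelihood plus a pairwise fusion penalty.

    mu   : (n,) subject-specific intercepts in the log odds
    beta : (p,) common slope coefficients
    X    : (n, p) observed features, y : (n,) labels in {0, 1}
    """
    eta = mu + X @ beta                              # log odds per subject
    nll = np.sum(np.logaddexp(0.0, eta) - y * eta)   # -log-likelihood
    i, j = np.triu_indices(len(mu), k=1)             # all pairs i < j
    return nll + np.sum(mcp(mu[i] - mu[j], lam, gamma))
```

Because the penalty acts on every pairwise difference mu_i - mu_j, intercepts that are close are pulled toward a common value, which is how subgroups would emerge from the fit; the concave shape stops penalizing differences once they are large, so well-separated subgroups are not shrunk toward each other.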
ArticleNumber 185
Author Huang, Rui
Tong, Zhiwei
Chen, Kun
Author_xml – sequence: 1
  givenname: Kun
  surname: Chen
  fullname: Chen, Kun
  organization: Southwestern University of Finance and Economics
– sequence: 2
  givenname: Rui
  surname: Huang
  fullname: Huang, Rui
  email: rhuang329@gmail.com
  organization: Nanjing University of Chinese Medicine
– sequence: 3
  givenname: Zhiwei
  surname: Tong
  fullname: Tong, Zhiwei
  organization: University of Iowa
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025 Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
DOI 10.1007/s11222-025-10712-9
Discipline Statistics
Mathematics
Computer Science
EISSN 1573-1375
GrantInformation National Social Science Fund of China (grant 24BTJ071; funder ID https://doi.org/10.13039/501100012456)
National Natural Science Foundation of China (grant 72203090; funder ID https://doi.org/10.13039/501100001809)
ISICitedReferencesCount 0
ISSN 0960-3174
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Churn modeling
Concave pairwise fusion
Penalized likelihood estimation
ADMM
Unobserved heterogeneity
Language English
PublicationDate 2025-12-01
PublicationPlace New York
PublicationPlace_xml – name: New York
– name: Dordrecht
PublicationTitle Statistics and computing
PublicationTitleAbbrev Stat Comput
PublicationYear 2025
Publisher Springer US
Springer Nature B.V
Publisher_xml – name: Springer US
– name: Springer Nature B.V
References TA Gormley (10712_CR10) 2014; 27
10712_CR3
J Schmidhuber (10712_CR16) 2015; 61
A De Caigny (10712_CR7) 2018; 269
J Ganesh (10712_CR9) 2000; 64
10712_CR1
L Hubert (10712_CR12) 1985; 2
10712_CR2
10712_CR20
10712_CR5
10712_CR6
J Fan (10712_CR8) 2001; 96
10712_CR13
10712_CR21
K Chen (10712_CR4) 2019; 86
10712_CR11
10712_CR17
10712_CR14
10712_CR15
10712_CR18
10712_CR19
References_xml – volume: 86
  start-page: 8
  year: 2019
  ident: 10712_CR4
  publication-title: Insurance Math. Econom.
  doi: 10.1016/j.insmatheco.2019.01.009
– volume: 27
  start-page: 617
  year: 2014
  ident: 10712_CR10
  publication-title: The Review of Financial Studies
  doi: 10.1093/rfs/hht047
– ident: 10712_CR6
  doi: 10.1016/j.csda.2004.12.015
– ident: 10712_CR20
  doi: 10.1002/jae.770
– ident: 10712_CR3
  doi: 10.1198/jbes.2009.07219
– ident: 10712_CR18
  doi: 10.1023/A:1017501703105
– volume: 96
  start-page: 1348
  year: 2001
  ident: 10712_CR8
  publication-title: J. Am. Stat. Assoc.
  doi: 10.1198/016214501753382273
– ident: 10712_CR13
  doi: 10.1111/j.1469-8137.1912.tb05611.x
– ident: 10712_CR1
  doi: 10.3982/ECTA15238
– ident: 10712_CR2
  doi: 10.1561/2200000016
– volume: 2
  start-page: 193
  year: 1985
  ident: 10712_CR12
  publication-title: J. Classif.
  doi: 10.1007/BF01908075
– ident: 10712_CR14
  doi: 10.1080/07350015.2018.1543126
– ident: 10712_CR17
  doi: 10.1111/j.2517-6161.1996.tb02080.x
– ident: 10712_CR21
  doi: 10.1214/09-AOS729
– ident: 10712_CR15
  doi: 10.1080/01621459.2016.1148039
– volume: 269
  start-page: 760
  year: 2018
  ident: 10712_CR7
  publication-title: Eur. J. Oper. Res.
  doi: 10.1016/j.ejor.2018.02.009
– ident: 10712_CR19
  doi: 10.1093/biomet/asm053
– volume: 61
  start-page: 85
  year: 2015
  ident: 10712_CR16
  publication-title: Neural Netw.
  doi: 10.1016/j.neunet.2014.09.003
– ident: 10712_CR5
  doi: 10.1080/10618600.2014.948181
– ident: 10712_CR11
  doi: 10.1080/07350015.2015.1052457
– volume: 64
  start-page: 65
  year: 2000
  ident: 10712_CR9
  publication-title: J. Mark.
  doi: 10.1509/jmkg.64.3.65.18028
SubjectTerms Artificial Intelligence
Classification
Computer Science
Heterogeneity
Original Paper
Probability and Statistics in Computer Science
Regression
Regression analysis
Statistical analysis
Statistical models
Statistical Theory and Methods
Statistics and Computing/Statistics Programs
Subgroups
Telecommunications industry
Title Predictive Subgroup Logistic Regression for Classification with Unobserved Heterogeneity
URI https://link.springer.com/article/10.1007/s11222-025-10712-9
https://www.proquest.com/docview/3247468623
Volume 35