Predictive Subgroup Logistic Regression for Classification with Unobserved Heterogeneity
Unobserved heterogeneity refers to the variation among subjects that is not accounted for by the observed features used in a model. Its presence poses a substantial challenge to statistical modeling. This study introduces the Predictive Subgroup Logistic Regression (PSLR) model, which extends the co...
| Published in: | Statistics and computing, Volume 35, Issue 6 |
|---|---|
| Main authors: | Chen, Kun; Huang, Rui; Tong, Zhiwei |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: Springer US; Springer Nature B.V, 1 December 2025 |
| Subjects: | |
| ISSN: | 0960-3174, 1573-1375 |
| Abstract | Unobserved heterogeneity refers to the variation among subjects that is not accounted for by the observed features used in a model. Its presence poses a substantial challenge to statistical modeling. This study introduces the Predictive Subgroup Logistic Regression (PSLR) model, which extends the conventional logistic regression and is specifically designed to address unobserved heterogeneity in classification problems. The PSLR model incorporates subject-specific intercepts in the log odds, fitted through a penalized likelihood approach with a concave pairwise fusion penalty. A novel two-step procedure is developed to facilitate the out-of-sample predictions for new subjects whose subgroup membership labels are unknown. This procedure allows the PSLR model to perform both inferential and predictive tasks. Through extensive simulation studies and an empirical application to a customer churn dataset in the telecommunications industry, the PSLR model not only demonstrates great performance in various aggregate accuracy metrics but also achieves a balanced effectiveness in sensitivity and specificity. |
|---|---|
| ArticleNumber | 185 |
| Author | Huang, Rui; Tong, Zhiwei; Chen, Kun |
| Author_xml | 1. Chen, Kun (Southwestern University of Finance and Economics); 2. Huang, Rui (Nanjing University of Chinese Medicine; email: rhuang329@gmail.com); 3. Tong, Zhiwei (University of Iowa) |
| ContentType | Journal Article |
| Copyright | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. |
| DOI | 10.1007/s11222-025-10712-9 |
| Discipline | Statistics; Mathematics; Computer Science |
| EISSN | 1573-1375 |
| GrantInformation_xml | National Social Science Fund of China, grant 24BTJ071 (funder ID: https://doi.org/10.13039/501100012456); National Natural Science Foundation of China, grant 72203090 (funder ID: https://doi.org/10.13039/501100001809) |
| ISICitedReferencesCount | 0 |
| ISSN | 0960-3174 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Keywords | Churn modeling; Concave pairwise fusion; Penalized likelihood estimation; ADMM; Unobserved heterogeneity |
| Language | English |
| PublicationDate | 2025-12-01 |
| PublicationPlace | New York; Dordrecht |
| PublicationTitle | Statistics and computing |
| PublicationTitleAbbrev | Stat Comput |
| PublicationYear | 2025 |
| Publisher | Springer US; Springer Nature B.V |
| SubjectTerms | Artificial Intelligence; Classification; Computer Science; Heterogeneity; Original Paper; Probability and Statistics in Computer Science; Regression; Regression analysis; Statistical analysis; Statistical models; Statistical Theory and Methods; Statistics and Computing/Statistics Programs; Subgroups; Telecommunications industry |
| Title | Predictive Subgroup Logistic Regression for Classification with Unobserved Heterogeneity |
| URI | https://link.springer.com/article/10.1007/s11222-025-10712-9 https://www.proquest.com/docview/3247468623 |
| Volume | 35 |
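
Note on the method. The abstract in this record describes subject-specific intercepts in the log odds, fitted by penalized likelihood with a concave pairwise fusion penalty. The following is only a minimal sketch of what such an objective could look like, not the authors' implementation. The record does not say which concave penalty is used, so the minimax concave penalty (MCP) here is an assumption, as are the function names `mcp` and `fusion_objective`, the parameters `lam` and `gamma`, and the choice to leave the log-likelihood unnormalized.

```python
import math
from itertools import combinations


def mcp(t, lam, gamma):
    """Minimax concave penalty at |t| (an assumed choice of concave penalty)."""
    a = abs(t)
    if a <= gamma * lam:
        return lam * a - a * a / (2.0 * gamma)
    # Beyond gamma * lam the penalty is constant, so large gaps are not shrunk further.
    return 0.5 * gamma * lam * lam


def fusion_objective(y, X, mu, beta, lam, gamma=3.0):
    """Logistic negative log-likelihood plus a pairwise fusion penalty.

    y    : list of 0/1 labels, one per subject
    X    : list of feature rows, one per subject
    mu   : list of subject-specific intercepts (hypothetical name)
    beta : list of slope coefficients shared by all subjects
    """
    nll = 0.0
    for yi, xi, mi in zip(y, X, mu):
        # Log odds for this subject: own intercept plus shared linear term.
        eta = mi + sum(b * x for b, x in zip(beta, xi))
        # Numerically stable log(1 + exp(eta)).
        log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
        nll += log1pexp - yi * eta
    # Penalize every pairwise difference between subject intercepts.
    penalty = sum(
        mcp(mu[i] - mu[j], lam, gamma)
        for i, j in combinations(range(len(mu)), 2)
    )
    return nll + penalty
```

In this sketch, intercepts that take the same value contribute nothing to the penalty and can be read as belonging to one subgroup. The sketch only evaluates the objective; it does not minimize it. The record's keywords mention ADMM, but no solver details are given here, so none are shown.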