A likelihood-based approach for multivariate categorical response regression in high dimensions

Bibliographic Details
Published in: arXiv.org
Main Authors: Molstad, Aaron J, Rothman, Adam J
Format: Paper
Language: English
Published: Ithaca Cornell University Library, arXiv.org 23.01.2022
ISSN: 2331-8422
Description
Summary: We propose a penalized likelihood method to fit the bivariate categorical response regression model. Our method allows practitioners to estimate which predictors are irrelevant, which predictors only affect the marginal distributions of the bivariate response, and which predictors affect both the marginal distributions and log odds ratios. To compute our estimator, we propose an efficient first-order algorithm, which we extend to settings where some subjects have only one response variable measured, i.e., the semi-supervised setting. We derive an asymptotic error bound which illustrates the performance of our estimator in high-dimensional settings. Generalizations to the multivariate categorical response regression model are proposed. Finally, simulation studies and an application in pan-cancer risk prediction demonstrate the usefulness of our method in terms of interpretability and prediction accuracy. An R package implementing the proposed method is available for download at github.com/ajmolstad/BvCategorical.
DOI: 10.48550/arxiv.2007.07953