On the optimism correction of the area under the receiver operating characteristic curve in logistic prediction models

Bibliographic Details
Title: On the optimism correction of the area under the receiver operating characteristic curve in logistic prediction models
Authors: Iparragirre, Amaia; Barrio, Irantzu; Rodríguez-Álvarez, María Xosé
Source: SORT-Statistics and Operations Research Transactions; 2019, Vol. 43, No. 1 (January-June); p. 145-162
oai:raco.cat:article/356185
Repositori Institucional de la Universitat Rovira i Virgili
Dipòsit Digital de Documents de la UAB
Universitat Autònoma de Barcelona
UPCommons. Portal del coneixement obert de la UPC
Universitat Politècnica de Catalunya (UPC)
Publisher Information: Universitat Rovira i Virgili, 2019.
Publication Year: 2019
Subject Terms: validation; logistic regression; prediction models; bootstrap; area under the receiver operating characteristic curve; AMS Classification::62 Statistics::62J Linear inference, regression; 62J99; UPC subject areas::Mathematics and statistics::Mathematical statistics
Description: When the same data are used both to fit a model and to estimate its predictive performance, the estimate may be optimistic and needs to be corrected. The aim of this work is to compare the behaviour of different methods proposed in the literature for correcting the optimism of the estimated area under the receiver operating characteristic curve in logistic regression models. A simulation study (in which the theoretical model is known) is conducted across different numbers of covariates, sample sizes, prevalences and levels of correlation among covariates. The results suggest the use of k-fold cross-validation with replication and bootstrap.
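The optimism correction described above can be illustrated with a minimal sketch. It assumes one common scheme, the optimism bootstrap (refit on each resample, then subtract the average gap between the resample AUC and the original-data AUC); this record does not say which bootstrap variants the paper evaluates, so this is not presented as the authors' procedure. The function names, the toy gradient-ascent fit, and the simulated data are all hypothetical and chosen only to keep the example free of dependencies.

```python
import math
import random


def auc(scores, labels):
    """Mann-Whitney AUC: share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def linear_predictor(beta, x):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def fit_logistic(X, y, n_iter=100, lr=0.5):
    """Toy logistic fit by gradient ascent on the mean log-likelihood (illustrative only)."""
    beta = [0.0] * (len(X[0]) + 1)
    for _ in range(n_iter):
        grad = [0.0] * len(beta)
        for x, target in zip(X, y):
            z = max(-30.0, min(30.0, linear_predictor(beta, x)))  # guard against overflow
            resid = target - 1.0 / (1.0 + math.exp(-z))
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / len(X) for b, g in zip(beta, grad)]
    return beta


def model_auc(beta, X, y):
    # AUC depends only on ranks, so the linear predictor can serve as the score.
    return auc([linear_predictor(beta, x) for x in X], y)


def optimism_corrected_auc(X, y, n_boot=20, seed=0):
    """Return (apparent AUC, mean bootstrap optimism, corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = model_auc(fit_logistic(X, y), X, y)
    optimisms = []
    while len(optimisms) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:  # both classes are needed to compute an AUC
            continue
        Xb = [X[i] for i in idx]
        beta_b = fit_logistic(Xb, yb)
        # Optimism: performance on the resample minus performance on the original data.
        optimisms.append(model_auc(beta_b, Xb, yb) - model_auc(beta_b, X, y))
    optimism = sum(optimisms) / n_boot
    return apparent, optimism, apparent - optimism


# Hypothetical simulated data: 5 covariates, only the first one is predictive.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(80)]
y = [1 if rng.random() < 1 / (1 + math.exp(-x[0])) else 0 for x in X]
apparent, optimism, corrected = optimism_corrected_auc(X, y)
print(f"apparent={apparent:.3f} optimism={optimism:.3f} corrected={corrected:.3f}")
```

In practice a tested logistic regression routine and a larger number of resamples would replace the toy fit and the small `n_boot` used here; the sketch only shows where the apparent AUC, the optimism estimate, and the corrected AUC come from.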
Document Type: Article
Other literature type
File Description: application/pdf
DOI: 10.2436/20.8080.02.82
Access URL: http://hdl.handle.net/20.500.11797/RP3412
https://hdl.handle.net/20.500.11797/RP3412
https://ddd.uab.cat/record/205824
https://hdl.handle.net/2117/178517
http://hdl.handle.net/2117/178517
https://upcommons.upc.edu/handle/2117/178517
https://dialnet.unirioja.es/servlet/articulo?codigo=7013164
https://ddd.uab.cat/pub/sort/sort_a2019m1-6v43n1/sort_a2019m1-6v43n1p145.pdf
https://ddd.uab.cat/record/205824?ln=ca
https://www.raco.cat/index.php/SORT/article/view/356185
Rights: CC BY NC ND
Accession Number: edsair.dedup.wf.002..db8b32af8d507cbfc40077de79bff92c
Database: OpenAIRE