Estimation and prediction with data quality indexes in linear regressions

Bibliographic details
Published in: Computational Statistics, Vol. 39, No. 6, pp. 3373-3404
Main authors: Chatelain, P., Milhaud, X.
Format: Journal Article
Language: English
Publication details: Berlin/Heidelberg, Springer Berlin Heidelberg, 01.09.2024
Springer Nature B.V
Springer Verlag
ISSN:0943-4062, 1613-9658
Description
Summary: Although many statistical applications brush the question of data quality aside, it is a fundamental concern inherent to external data collection. In this paper, data quality relates to the confidence one can have in the covariate values in a regression framework. More precisely, we study how to integrate the data quality information given by an (n × p) matrix, with n the number of individuals and p the number of explanatory variables. To this end, we suggest a latent variable model that drives the generation of the covariate values, and introduce a new algorithm that takes all this information into account for prediction. Our approach provides unbiased estimators of the regression coefficients and makes it possible to produce predictions adapted to a given quality pattern. The usefulness of our procedure is illustrated through simulations and real-life applications.
DOI:10.1007/s00180-023-01441-6