Estimation and prediction with data quality indexes in linear regressions

Bibliographic details
Published in: Computational Statistics, Vol. 39, No. 6, pp. 3373-3404
Main authors: Chatelain, P., Milhaud, X.
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 1 September 2024
Springer Nature B.V
Springer Verlag
ISSN:0943-4062, 1613-9658
Description
Summary: Although many statistical applications brush the question of data quality aside, it is a fundamental concern inherent to external data collection. In this paper, data quality refers to the confidence one can have in the covariate values in a regression framework. More precisely, we study how to integrate data quality information given by an (n × p) matrix, where n is the number of individuals and p the number of explanatory variables. To this end, we propose a latent variable model that drives the generation of the covariate values, and introduce a new algorithm that takes all of this information into account for prediction. Our approach provides unbiased estimators of the regression coefficients and allows predictions to be adapted to a given quality pattern. The usefulness of our procedure is illustrated through simulations and real-life applications.
DOI:10.1007/s00180-023-01441-6