Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations
| Title: | Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations |
|---|---|
| Authors: | Tierney, Nicholas; Cook, Dianne |
| Source: | Journal of Statistical Software, Vol 105, Iss 1 (2023); Journal of Statistical Software, Vol. 105 (2023), 1-31 |
| Publication Status: | Preprint |
| Publisher information: | Foundation for Open Access Statistics, 2023. |
| Year of publication: | 2023 |
| Subjects: | FOS: Computer and information sciences, QA299.6-433, Statistics, tidyverse, Statistics - Computation, 01 natural sciences, HA1-4737, statistical graphics, QA76.75-76.765, Econometric and statistical methods, QA1-939, data visualization, statistical computing, data science, Econometrics not elsewhere classified, 0101 mathematics, data pipeline, Computation (stat.CO) |
| Description: | Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package `naniar`. 30 pages, 16 figures, 7 tables; package available at github.com/njtierney/naniar |
| Document type: | Article; Other literature type |
| File description: | application/pdf; application/gzip; text/plain; application/zip |
| Language: | English |
| ISSN: | 1548-7660 |
| DOI: | 10.18637/jss.v105.i07 |
| DOI: | 10.48550/arxiv.1809.02264 |
| DOI: | 10.26180/21522555.v1 |
| DOI: | 10.26180/21522555 |
| Access URL: | http://arxiv.org/abs/1809.02264 https://doaj.org/article/b51e13ad39a8410c90413e746452a1f7 https://www.jstatsoft.org/index.php/jss/article/view/v105i07 |
| Rights: | arXiv Non-Exclusive Distribution CC BY |
| Accession number: | edsair.doi.dedup.....092088dc43903679a0c8c54c39cf35e1 |
| Database: | OpenAIRE |