Bag of little bootstraps for massive and distributed longitudinal data

Detailed bibliography
Published in: Statistical analysis and data mining, Volume 15, Issue 3, pp. 314-321
Main authors: Zhou, Xinkai; Zhou, Jin J.; Zhou, Hua
Format: Journal Article
Language: English
Published: Hoboken: Wiley Subscription Services, Inc., A Wiley Company, 1 June 2022
ISSN: 1932-1864, 1932-1872
Description
Summary: Linear mixed models are widely used for analyzing longitudinal datasets, and the inference for variance component parameters relies on the bootstrap method. However, health systems and technology companies routinely generate massive longitudinal datasets that make the traditional bootstrap method infeasible. To solve this problem, we extend the highly scalable bag of little bootstraps method for independent data to longitudinal data and develop a highly efficient Julia package MixedModelsBLB.jl. Simulation experiments and real data analysis demonstrate the favorable statistical performance and computational advantages of our method compared to the traditional bootstrap method. For the statistical inference of variance components, it achieves 200 times speedup on the scale of 1 million subjects (20 million total observations), and is the only currently available tool that can handle more than 10 million subjects (200 million total observations) using desktop computers.
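Illustration (not part of the original record): the summary refers to the bag of little bootstraps for independent data, which the article extends to longitudinal data. The sketch below shows that independent-data idea only, for the standard error of a sample mean. It is written in Python with NumPy rather than Julia, it does not use or reproduce the MixedModelsBLB.jl API, and the function name, parameter names, and default values are assumptions chosen for the example.

```python
import numpy as np


def blb_standard_error(data, n_subsets=10, n_resamples=100, gamma=0.7, seed=0):
    """Bag-of-little-bootstraps estimate of the standard error of the mean.

    Hypothetical helper for illustration: independent observations only,
    not the longitudinal extension described in the article.
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    b = int(n ** gamma)  # subset size, much smaller than n
    subset_ses = []
    for _ in range(n_subsets):
        # Draw a small subset of b distinct observations without replacement.
        subset = rng.choice(data, size=b, replace=False)
        # Each resample has nominal size n but touches only the b subset
        # points: it is stored as multinomial counts over those points.
        counts = rng.multinomial(n, np.full(b, 1.0 / b), size=n_resamples)
        # Weighted mean of each size-n resample.
        means = counts @ subset / n
        # Quality measure within this subset: spread of the resampled means.
        subset_ses.append(means.std(ddof=1))
    # Average the per-subset quality measures.
    return float(np.mean(subset_ses))


# Simulated data: 100,000 independent normal draws with standard deviation 2.
sample = np.random.default_rng(42).normal(loc=0.0, scale=2.0, size=100_000)
se = blb_standard_error(sample)
print(se)
```

For this simulated sample the result should be close to the theoretical value 2 / sqrt(100000), roughly 0.0063. Each resample is represented by counts over only b points, so the estimator never has to process all n observations at once, which is the property that makes the approach attractive for massive datasets.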
Bibliography: Funding information
Division of Mathematical Sciences, DMS‐2054253; National Heart, Lung, and Blood Institute, HL150374; National Human Genome Research Institute, HG006139; National Institute of Diabetes and Digestive and Kidney Diseases, DK106116; National Institute of General Medical Sciences, GM141798
DOI:10.1002/sam.11563