‘dstidyverse’: An Implementation of Tidyverse Within the DataSHIELD Ecosystem [version 1; peer review: 2 approved]

Bibliographic Details
Title: ‘dstidyverse’: An Implementation of Tidyverse Within the DataSHIELD Ecosystem [version 1; peer review: 2 approved]
Authors: Eleanor Hyde, Demetris Avraam, Mariska Slofstra, Tim Cadman, Marije van der Geest, Morris Swertz, Erik Zwart, Stuart Wheater, Ruben Veenstra, Dick Postma, Niels Kikkert
Source: F1000Research, Vol 14 (2025)
Publisher Information: F1000 Research Ltd, 2025.
Publication Year: 2025
Collection: LCC:Medicine
LCC:Science
Subject Terms: datashield, federated analysis, tidyverse, data manipulation, Medicine, Science
Description: Background: DataSHIELD is a mature, R-based federated learning platform that enables multi-site analysis without sharing individual participant data. While DataSHIELD includes many packages for data analysis, it lacks user-friendly data manipulation tools. Methods: To address this gap, we developed dsTidyverse, an implementation of selected functions from the popular Tidyverse package within the DataSHIELD client-server architecture. Disclosure checks were implemented to prevent individual-level data leakage. Results: This package provides functionality for selecting, renaming, and creating columns; conditional recoding; combining data frames by rows or columns; filtering and arranging rows; grouping and ungrouping data; and converting data frames to tibbles. Through examples, we demonstrate how dsTidyverse simplifies common data manipulation tasks within DataSHIELD. Conclusions: By providing additional data manipulation functionality, dsTidyverse improves the user experience and analytical efficiency within DataSHIELD. The package is open-source, freely available on CRAN and GitHub, and open to further development: https://github.com/molgenis/ds-tidyverse.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2046-1402
Relation: https://f1000research.com/articles/14-606/v1; https://doaj.org/toc/2046-1402
DOI: 10.12688/f1000research.164345.1
Access URL: https://doaj.org/article/f3aaa23e01b148a08c3eb8a5e6aff4e0
Accession Number: edsdoj.f3aaa23e01b148a08c3eb8a5e6aff4e0
Database: Directory of Open Access Journals