DataSHIELD: mitigating disclosure risk in a multi-site federated analysis platform

Bibliographic details
Title: DataSHIELD: mitigating disclosure risk in a multi-site federated analysis platform
Authors: Avraam, Demetris, Wilson, Rebecca C, Aguirre Chan, Noemi, Banerjee, Soumya, Bishop, Tom RP, Butters, Olly, Cadman, Tim, Cederkvist, Luise, Duijts, Liesbeth, Escribà Montagut, Xavier, Garner, Hugh, Gonçalves, Gonçalo, González, Juan R, Haakma, Sido, Hartlev, Mette, Hasenauer, Jan, Huth, Manuel, Hyde, Eleanor, Jaddoe, Vincent WV, Marcon, Yannick, Mayrhofer, Michaela Th, Molnar-Gabor, Fruzsina, Morgan, Andrei Scott, Murtagh, Madeleine, Nestor, Marc, Nybo Andersen, Anne-Marie, Parker, Simon, Pinot de Moira, Angela, Schwarz, Florian, Strandberg-Larsen, Katrine, Swertz, Morris A, Welten, Marieke, Wheater, Stuart, Burton, Paul
Contributors: Lengauer, Thomas
Publisher information: Oxford University Press (OUP)
Publication year: 2024
Collection: The University of Liverpool Repository
Description: Abstract. Motivation: The validity of epidemiologic findings can be increased using triangulation, i.e. comparison of findings across contexts, and by having sufficiently large amounts of relevant data to analyse. However, access to data is often constrained by practical considerations and by ethico-legal and data governance restrictions. Gaining access to such data can be time-consuming due to the governance requirements associated with data access requests to institutions in different jurisdictions. Results: DataSHIELD is a software solution that enables remote analysis without the need for data transfer (federated analysis). DataSHIELD is a scientifically mature, open-source data access and analysis platform aligned with the ‘Five Safes’ framework, the international framework governing safe research access to data. It allows real-time analysis while mitigating disclosure risk through an active multi-layer system of disclosure-preventing mechanisms. This combination of real-time remote statistical analysis, disclosure prevention mechanisms, and federation capabilities makes DataSHIELD a solution for addressing many of the technical and regulatory challenges in performing the large-scale statistical analysis of health and biomedical data. This paper describes the key components that comprise the disclosure protection system of DataSHIELD. These broadly fall into three classes: (i) system protection elements, (ii) analysis protection elements, and (iii) governance protection elements. Availability and implementation: Information about the DataSHIELD software is available at https://datashield.org/ and https://github.com/datashield.
Document type: article in journal/newspaper
Language: English
Relation: Avraam, Demetris orcid:0000-0001-8908-2441, Wilson, Rebecca C orcid:0000-0003-2294-593X, Aguirre Chan, Noemi, Banerjee, Soumya, Bishop, Tom RP, Butters, Olly orcid:0000-0003-0354-8461, Cadman, Tim, Cederkvist, Luise, Duijts, Liesbeth, Escribà Montagut, Xavier, Garner, Hugh, Gonçalves, Gonçalo, González, Juan R, Haakma, Sido, Hartlev, Mette, Hasenauer, Jan, Huth, Manuel, Hyde, Eleanor, Jaddoe, Vincent WV, Marcon, Yannick, Mayrhofer, Michaela Th, Molnar-Gabor, Fruzsina, Morgan, Andrei Scott, Murtagh, Madeleine, Nestor, Marc, Nybo Andersen, Anne-Marie, Parker, Simon, Pinot de Moira, Angela, Schwarz, Florian, Strandberg-Larsen, Katrine, Swertz, Morris A, Welten, Marieke, Wheater, Stuart and Burton, Paul (2024) DataSHIELD: mitigating disclosure risk in a multi-site federated analysis platform. Bioinformatics Advances, 5 (1). vbaf046. ISSN 2635-0041
DOI: 10.1093/bioadv/vbaf046
Availability: https://livrepository.liverpool.ac.uk/3191352/
https://doi.org/10.1093/bioadv/vbaf046
Accession number: edsbas.B2774DFA
Database: BASE