Sketching algorithms for genomic data analysis and querying in a secure enclave

Bibliographic details
Published in: Nature Methods, Volume 17, Issue 3, pp. 295-301
Main authors: Kockan, Can; Zhu, Kaiyuan; Dokmai, Natnatee; Karpov, Nikolai; Kulekci, M. Oguzhan; Woodruff, David P.; Sahinalp, S. Cenk
Format: Journal Article
Language: English
Published: New York: Nature Publishing Group US, 1 March 2020
Nature Publishing Group
ISSN: 1548-7091, 1548-7105
Description
Summary: Genome-wide association studies (GWAS), especially on rare diseases, may necessitate exchange of sensitive genomic data between multiple institutions. Since genomic data sharing is often infeasible due to privacy concerns, cryptographic methods, such as secure multiparty computation (SMC) protocols, have been developed with the aim of offering privacy-preserving collaborative GWAS. Unfortunately, the computational overhead of these methods remains prohibitive for human-genome-scale data. Here we introduce SkSES (https://github.com/ndokmai/sgx-genome-variants-search), a hardware–software hybrid approach for privacy-preserving collaborative GWAS, which improves the running time of the most advanced cryptographic protocols by two orders of magnitude. The SkSES approach is based on trusted execution environments (TEEs) offered by current-generation microprocessors, in particular Intel’s SGX. To overcome the severe memory limitation of the TEEs, SkSES employs novel ‘sketching’ algorithms that maintain essential statistical information on genomic variants in input VCF files. By additionally incorporating efficient data compression and population stratification reduction methods, SkSES identifies the top k genomic variants in a cohort quickly, accurately and in a privacy-preserving manner. The combination of the Intel SGX platform with sketching algorithms enables efficient compaction of genomic data and the execution of secure GWAS in an untrusted cloud environment.
Author contributions: C.K., N.D., M.O.K. and S.C.S. initially participated in the iDASH-2017 competition. K.Z., N.K. and S.C.S. formulated the problem with limited memory. K.Z., S.C.S. and D.P.W. further formulated the problem to correct for population stratification. C.K., K.Z. and N.D. implemented the proposed solution. C.K., K.Z., N.D., N.K. and S.C.S. co-wrote the manuscript. M.O.K., D.P.W. and S.C.S. supervised the study.
DOI: 10.1038/s41592-020-0761-8