SecureNLP: A System for Multi-Party Privacy-Preserving Natural Language Processing
| Published in: | IEEE Transactions on Information Forensics and Security, Volume 15, pp. 3709 - 3721 |
|---|---|
| Main authors: | , , , , |
| Medium: | Journal Article |
| Language: | English |
| Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2020 |
| Subjects: | |
| ISSN: | 1556-6013, 1556-6021 |
| Online access: | Get full text |
| Summary: | Natural language processing (NLP) allows a computer program to understand human language as it is spoken, and is increasingly deployed in applications such as machine translation, sentiment analysis, and electronic voice assistants. While information obtained from different sources can enhance the accuracy of NLP models, collecting such massive amounts of data also raises privacy concerns. Thus, in this paper, we design a privacy-preserving system, SecureNLP, focusing on the instance of a recurrent neural network (RNN)-based sequence-to-sequence with attention model for neural machine translation. Specifically, for non-linear functions such as sigmoid and tanh, we design two efficient distributed protocols using secure multi-party computation (MPC), which are used to carry out the respective tasks in SecureNLP. We also prove the security of these two protocols (i.e., the privacy-preserving long short-term memory network PrivLSTM and the privacy-preserving sequence-to-sequence transformation PrivSEQ2SEQ) in the semi-honest adversary model, in the sense that an honest-but-curious adversary learns nothing from the messages it receives from other parties beyond its prescribed output. The proposed system is implemented in C++ and Python, and the findings from the evaluation demonstrate the utility of the protocols in cross-domain NLP. |
|---|---|
| DOI: | 10.1109/TIFS.2020.2997134 |
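The abstract centers on evaluating non-linear activations (sigmoid, tanh) under secure multi-party computation. As a rough illustration of that kind of building block, and not the paper's actual PrivLSTM or PrivSEQ2SEQ protocols, the Python sketch below has two parties evaluate a polynomial approximation of sigmoid on additive secret shares, with share multiplications handled by Beaver triples from an assumed trusted dealer; all function names and parameters here are illustrative.

```python
# Illustrative sketch only: NOT the paper's PrivLSTM/PrivSEQ2SEQ protocols.
# Two parties hold additive secret shares of a private value x and jointly evaluate a
# low-degree polynomial approximation of sigmoid(x), so neither party ever sees x itself.
# For readability, shares are plain floats; a real MPC system (including SecureNLP's
# C++/Python implementation) would use finite-field arithmetic with fixed-point encoding.
import random

def share(x):
    """Split x into two additive shares: x = x0 + x1."""
    r = random.uniform(-1e6, 1e6)
    return r, x - r

def reconstruct(x0, x1):
    """Recombine the two shares."""
    return x0 + x1

def beaver_triple():
    """Assumed trusted dealer: shares of (a, b, a*b), standing in for an offline phase."""
    a, b = random.uniform(-10, 10), random.uniform(-10, 10)
    return share(a), share(b), share(a * b)

def secure_mul(x_sh, y_sh):
    """Multiply shared values; only the masked differences d = x - a and e = y - b are opened."""
    (a0, a1), (b0, b1), (c0, c1) = beaver_triple()
    d = reconstruct(x_sh[0] - a0, x_sh[1] - a1)
    e = reconstruct(y_sh[0] - b0, y_sh[1] - b1)
    # x*y = c + d*b + e*a + d*e, computed share-wise (the public term d*e goes to party 0).
    z0 = c0 + d * b0 + e * a0 + d * e
    z1 = c1 + d * b1 + e * a1
    return z0, z1

def secure_sigmoid(x_sh):
    """Evaluate sigmoid(x) ~ 1/2 + x/4 - x^3/48 (Taylor approximation near 0) on shares."""
    x3_sh = secure_mul(secure_mul(x_sh, x_sh), x_sh)
    s0 = 0.5 + x_sh[0] / 4.0 - x3_sh[0] / 48.0   # party 0 adds the public constant 1/2
    s1 = x_sh[1] / 4.0 - x3_sh[1] / 48.0
    return s0, s1

if __name__ == "__main__":
    x = 0.8
    result = reconstruct(*secure_sigmoid(share(x)))
    print(f"approximation: {result:.4f}")  # close to 1/(1 + exp(-0.8)) = 0.6900
```

The same share-based pattern extends to tanh and to the gate computations inside an LSTM cell; the accuracy and cost of the non-linear approximation is precisely what dedicated protocols such as those proposed in the paper aim to improve.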