SecureNLP: A System for Multi-Party Privacy-Preserving Natural Language Processing

Bibliographic details
Published in: IEEE Transactions on Information Forensics and Security, Vol. 15, pp. 3709-3721
Main authors: Feng, Qi; He, Debiao; Liu, Zhe; Wang, Huaqun; Choo, Kim-Kwang Raymond
Format: Journal Article
Language: English
Published: New York: IEEE, 2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1556-6013, 1556-6021
Description
Summary: Natural language processing (NLP) allows a computer program to understand human language as it is spoken, and has been increasingly deployed in a growing number of applications, such as machine translation, sentiment analysis, and electronic voice assistants. While information obtained from different sources can enhance the accuracy of NLP models, there are also privacy implications in the collection of such massive data. Thus, in this paper, we design a privacy-preserving system, SecureNLP, focusing on the instance of a recurrent neural network (RNN)-based sequence-to-sequence model with attention for neural machine translation. Specifically, for non-linear functions such as sigmoid and tanh, we design two efficient distributed protocols using secure multi-party computation (MPC), which are used to carry out the respective tasks in SecureNLP. We also prove the security of these two protocols (i.e., privacy-preserving long short-term memory network PrivLSTM, and privacy-preserving sequence-to-sequence transformation PrivSEQ2SEQ) in the semi-honest adversary model, in the sense that any honest-but-curious adversary cannot learn anything else from the messages they receive from other parties. The proposed system is implemented in C++ and Python, and the findings from the evaluation demonstrate the utility of the protocols in cross-domain NLP.
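The summary refers to secure multi-party computation without showing what a shared value looks like. As a rough illustration only, the sketch below shows additive secret sharing with fixed-point encoding, a common building block in MPC systems. It is not the paper's PrivLSTM or PrivSEQ2SEQ protocol, and the record does not state which sharing scheme SecureNLP uses. The ring size, the scaling factor, and all function names here are assumptions made for the example.

```python
import secrets
from typing import List

MODULUS = 2 ** 64   # ring Z_{2^64}; an illustrative assumption
SCALE = 2 ** 16     # fixed-point scaling factor; also an assumption


def encode(x: float) -> int:
    """Map a real number to a ring element using fixed-point encoding."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half of the ring is negative)."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(value: int, n_parties: int = 2) -> List[int]:
    """Split a ring element into n additive shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares: List[int]) -> int:
    """Recombine all shares; any strict subset looks uniformly random."""
    return sum(shares) % MODULUS


def add_shares(a: List[int], b: List[int]) -> List[int]:
    """Each party adds its own two shares locally; no communication is needed."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def scale_shares(a: List[int], public_int: int) -> List[int]:
    """Each party multiplies its share by a public integer locally."""
    return [(x * public_int) % MODULUS for x in a]


if __name__ == "__main__":
    x_shares = share(encode(1.5), n_parties=3)
    y_shares = share(encode(-0.25), n_parties=3)
    total = add_shares(x_shares, y_shares)
    print(decode(reconstruct(total)))
```

Additions and multiplications by public integers can be done locally on shares, as above. Non-linear functions such as sigmoid and tanh cannot, which is why the summary describes dedicated distributed protocols for them.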
DOI: 10.1109/TIFS.2020.2997134