Membership Inference Attacks From First Principles

Bibliographic details
Published in: Proceedings - IEEE Symposium on Security and Privacy, pp. 1897-1914
Main authors: Carlini, Nicholas; Chien, Steve; Nasr, Milad; Song, Shuang; Terzis, Andreas; Tramer, Florian
Format: Conference paper
Language: English
Published: IEEE, 1 May 2022
ISSN: 2375-1207
Description
Summary: A membership inference attack allows an adversary to query a trained machine learning model to predict whether or not a particular example was contained in the model's training dataset. These attacks are currently evaluated using average-case "accuracy" metrics that fail to characterize whether the attack can confidently identify any members of the training set. We argue that attacks should instead be evaluated by computing their true-positive rate at low (e.g., ≤ 0.1%) false-positive rates, and find most prior attacks perform poorly when evaluated in this way. To address this we develop a Likelihood Ratio Attack (LiRA) that carefully combines multiple ideas from the literature. Our attack is 10× more powerful at low false-positive rates, and also strictly dominates prior attacks on existing metrics.
DOI: 10.1109/SP46214.2022.9833649