Scalable non-deterministic clustering-based k-anonymization for rich networks

Bibliographic details
Published in:	International Journal of Information Security, Volume 18, Issue 2, pp. 219-238
Main authors:	Ros-Martín, Miguel; Salas, Julián; Casas-Roma, Jordi
Format:	Journal Article
Language:	English
Published:	Berlin/Heidelberg: Springer Berlin Heidelberg, 1 April 2019; Springer Nature B.V.
Subjects:	Algorithms; Anonymity; Clustering; Coding and Information Theory; Communications Engineering; Computer Communication Networks; Computer privacy; Computer Science; Cryptology; Data mining; Greedy algorithms; Machinery; Management of Computing and Information Systems; Networks; Operating Systems; Privacy; Recommender systems; Regular Contribution; Social networks; Privacy-preserving; Graphs
ISSN:	1615-5262, 1615-5270

Description
Summary:	In this paper, we tackle the problem of graph anonymization in the context of privacy-preserving social network mining. We present a greedy and non-deterministic algorithm to achieve k-anonymity on labeled and undirected networks. Our work aims to create a scalable algorithm for real-world big networks, which runs in parallel and uses biased randomization for improving the quality of the solutions. We propose new metrics that consider the utility of the clusters from a recommender system point of view. We compare our approach to SaNGreeA, a well-known state-of-the-art algorithm for k-anonymity generalization. Finally, we have performed scalability tests, with up to 160 machines within the Hadoop framework, for anonymizing a real-world dataset with around 830K nodes and 63M relationships, demonstrating our method’s utility and practical applicability.
DOI:	10.1007/s10207-018-0409-1
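
Illustrative sketch (not taken from the article): the summary describes a greedy, non-deterministic clustering approach to k-anonymity that uses biased randomization. The Python below is a minimal, sequential illustration of that general idea under stated assumptions, not the authors' algorithm. The names `cost`, `biased_pick`, `greedy_k_clusters`, and the parameter `beta` are hypothetical, and the dissimilarity measure (label mismatch plus Jaccard distance between neighbourhoods) is an assumption chosen for brevity. The parallel Hadoop design and the utility metrics mentioned in the summary are not reproduced here.

```python
import random
from typing import Dict, Hashable, List, Set

Graph = Dict[Hashable, Set[Hashable]]  # undirected adjacency sets


def cost(node, cluster, adj: Graph, labels) -> float:
    """Hypothetical dissimilarity of `node` to `cluster` (lower is better):
    mean of label mismatch plus Jaccard distance between neighbourhoods."""
    total = 0.0
    for member in cluster:
        union = adj[node] | adj[member]
        jaccard = len(adj[node] & adj[member]) / len(union) if union else 1.0
        total += (1.0 - jaccard) + (labels[node] != labels[member])
    return total / len(cluster)


def biased_pick(ranked: List, rng: random.Random, beta: float):
    """Pick from a best-first list; each position is accepted with
    probability `beta`, so earlier (better) candidates are favoured."""
    idx = 0
    while idx < len(ranked) - 1 and rng.random() > beta:
        idx += 1
    return ranked[idx]


def greedy_k_clusters(adj: Graph, labels, k: int, seed=None, beta: float = 0.3):
    """Group nodes into clusters of at least k members (assumed scheme)."""
    if k < 1 or len(adj) < k:
        raise ValueError("need k >= 1 and at least k nodes")
    rng = random.Random(seed)
    unassigned = list(adj)
    rng.shuffle(unassigned)  # random cluster seeds: the non-deterministic part
    clusters: List[List] = []
    while len(unassigned) >= k:
        cluster = [unassigned.pop()]
        while len(cluster) < k:
            # Greedy step: rank candidates by cost, then choose with a bias.
            ranked = sorted(unassigned, key=lambda n: cost(n, cluster, adj, labels))
            chosen = biased_pick(ranked, rng, beta)
            unassigned.remove(chosen)
            cluster.append(chosen)
        clusters.append(cluster)
    # Fewer than k nodes remain: attach each to its cheapest existing cluster.
    for node in unassigned:
        min(clusters, key=lambda c: cost(node, c, adj, labels)).append(node)
    return clusters


if __name__ == "__main__":
    # Toy undirected graph: a path of 7 nodes with two label values.
    toy_adj = {i: set() for i in range(7)}
    for a, b in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6)]:
        toy_adj[a].add(b)
        toy_adj[b].add(a)
    toy_labels = {i: "A" if i < 4 else "B" for i in range(7)}
    print(greedy_k_clusters(toy_adj, toy_labels, k=3, seed=1))
```

Each returned cluster holds at least k nodes, which is the grouping that a generalization step would then publish in place of individual nodes. Re-ranking all candidates at every step makes this sketch quadratic or worse, so it only illustrates the selection logic and says nothing about the scalability reported in the summary.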