TY - JOUR
T1 - K-anonymization with minimal loss of information
AU - Gionis, Aristides
AU - Tassa, Tamir
N1 - Copyright 2009 Elsevier B.V., All rights reserved.
PY - 2009/2
Y1 - 2009/2
N2 - The technique of k-anonymization allows the releasing of databases that contain personal information while ensuring some degree of individual privacy. Anonymization is usually performed by generalizing database entries. We formally study the concept of generalization, and propose three information-theoretic measures for capturing the amount of information that is lost during the anonymization process. The proposed measures are more general and more accurate than those that were proposed by Meyerson and Williams [23] and Aggarwal et al. [1]. We study the problem of achieving k-anonymity with minimal loss of information. We prove that it is NP-hard and study polynomial approximations for the optimal solution. Our first algorithm gives an approximation guarantee of O(ln k) for two of our measures as well as for the previously studied measures. This improves the best-known O(k)-approximation in [1]. While the previous approximation algorithms relied on the graph representation framework, our algorithm relies on a novel hypergraph representation that enables the improvement in the approximation ratio from O(k) to O(ln k). As the running time of the algorithm is O(n^{2k}), we also show how to adapt the algorithm in [1] in order to obtain an O(k)-approximation algorithm that is polynomial in both n and k.
AB - The technique of k-anonymization allows the releasing of databases that contain personal information while ensuring some degree of individual privacy. Anonymization is usually performed by generalizing database entries. We formally study the concept of generalization, and propose three information-theoretic measures for capturing the amount of information that is lost during the anonymization process. The proposed measures are more general and more accurate than those that were proposed by Meyerson and Williams [23] and Aggarwal et al. [1]. We study the problem of achieving k-anonymity with minimal loss of information. We prove that it is NP-hard and study polynomial approximations for the optimal solution. Our first algorithm gives an approximation guarantee of O(ln k) for two of our measures as well as for the previously studied measures. This improves the best-known O(k)-approximation in [1]. While the previous approximation algorithms relied on the graph representation framework, our algorithm relies on a novel hypergraph representation that enables the improvement in the approximation ratio from O(k) to O(ln k). As the running time of the algorithm is O(n^{2k}), we also show how to adapt the algorithm in [1] in order to obtain an O(k)-approximation algorithm that is polynomial in both n and k.
KW - Approximation algorithms for NP-hard problems
KW - K-anonymization
KW - Privacy-preserving data mining
UR - http://www.scopus.com/inward/record.url?scp=69549119179&partnerID=8YFLogxK
U2 - 10.1109/TKDE.2008.129
DO - 10.1109/TKDE.2008.129
M3 - Article
AN - SCOPUS:69549119179
SN - 1041-4347
VL - 21
SP - 206
EP - 219
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 2
ER -