TY - JOUR
T1 - Privacy by diversity in sequential releases of databases
AU - Shmueli, Erez
AU - Tassa, Tamir
N1 - Publisher Copyright: © 2014 Elsevier Inc.
PY - 2015/3/20
Y1 - 2015/3/20
AB - We study the problem of privacy preservation in sequential releases of databases. In that scenario, several releases of the same table are published over a period of time, where each release contains a different set of the table attributes, as dictated by the purposes of the release. The goal is to protect the private information from adversaries who examine the entire sequential release. That scenario was studied in [32] and was further investigated in [28]. We revisit their privacy definitions, and suggest a significantly stronger adversarial assumption and privacy definition. We then present a sequential anonymization algorithm that achieves ℓ-diversity. The algorithm exploits the fact that different releases may include different attributes in order to reduce the information loss that the anonymization entails. Unlike the previous algorithms, ours is perfectly scalable as the runtime to compute the anonymization of each release is independent of the number of previous releases. In addition, we consider here the fully dynamic setting in which the different releases differ in the set of attributes as well as in the set of tuples. The advantages of our approach are demonstrated by extensive experimentation.
KW - Anonymization
KW - Continuous data publishing
KW - Diversity
KW - Multipartite graphs
KW - Privacy preserving data publishing
KW - Sequential release
UR - http://www.scopus.com/inward/record.url?scp=84922456136&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2014.11.005
DO - 10.1016/j.ins.2014.11.005
M3 - Article
AN - SCOPUS:84922456136
SN - 0020-0255
VL - 298
SP - 344
EP - 372
JO - Information Sciences
JF - Information Sciences
ER -