Abstract
We study the problem of privacy preservation in sequential releases of databases. In that scenario, several releases of the same table are published over a period of time, where each release contains a different set of the table attributes, as dictated by the purposes of the release. The goal is to protect the private information from adversaries who examine the entire sequential release. That scenario was studied in [32] and was further investigated in [28]. We revisit their privacy definitions, and suggest a significantly stronger adversarial assumption and privacy definition. We then present a sequential anonymization algorithm that achieves ℓ-diversity. The algorithm exploits the fact that different releases may include different attributes in order to reduce the information loss that the anonymization entails. Unlike the previous algorithms, ours is perfectly scalable as the runtime to compute the anonymization of each release is independent of the number of previous releases. In addition, we consider here the fully dynamic setting in which the different releases differ in the set of attributes as well as in the set of tuples. The advantages of our approach are demonstrated by extensive experimentation.
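As a point of reference for the ℓ-diversity notion mentioned above, the following is a minimal sketch of the simplest ("distinct") variant, checked on a single release: every group of records that share the same quasi-identifier values must contain at least ℓ distinct sensitive values. It is not the sequential anonymization algorithm of the paper, and it does not capture the stronger privacy definition the paper proposes for sequential releases. The function name, the attribute names, and the sample rows are all hypothetical and serve only as illustration.

```python
from collections import defaultdict


def is_distinct_l_diverse(rows, quasi_identifiers, sensitive, l):
    """Check distinct l-diversity of one release.

    rows: list of dicts, one per (already generalized) record.
    quasi_identifiers: attribute names that define the equivalence classes.
    sensitive: name of the sensitive attribute.
    l: required number of distinct sensitive values per class.
    """
    # Collect the set of sensitive values seen in each equivalence class.
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[attr] for attr in quasi_identifiers)
        groups[key].add(row[sensitive])
    # Every class must hold at least l distinct sensitive values.
    return all(len(values) >= l for values in groups.values())


# Hypothetical generalized release (illustrative data only).
release = [
    {"age": "30-39", "zip": "100**", "disease": "flu"},
    {"age": "30-39", "zip": "100**", "disease": "asthma"},
    {"age": "40-49", "zip": "101**", "disease": "flu"},
    {"age": "40-49", "zip": "101**", "disease": "flu"},
]

print(is_distinct_l_diverse(release, ["age", "zip"], "disease", 2))  # False
print(is_distinct_l_diverse(release, ["age", "zip"], "disease", 1))  # True
```

In this sample the second group contains only one distinct sensitive value, so the release fails the check for ℓ = 2. A check of this kind looks at one release in isolation; an adversary who examines the entire sequential release, as in the scenario studied here, can combine releases, which is why a per-release test alone is not sufficient.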
Original language | English |
---|---|
Pages (from-to) | 344-372 |
Number of pages | 29 |
Journal | Information Sciences |
Volume | 298 |
State | Published - 20 Mar 2015 |
Bibliographical note
Publisher Copyright: © 2014 Elsevier Inc.
Keywords
- Anonymization
- Continuous data publishing
- Diversity
- Multipartite graphs
- Privacy preserving data publishing
- Sequential release