Nebula: Efficient, Private and Accurate Histogram Estimation
Ali Shahin Shamsabadi (Brave Software), Peter Snyder (Brave Software), Ralph Giles (Brave Software), Aurélien Bellet (Inria, Université de Montpellier), and Hamed Haddadi (Brave Software, Imperial College London) | Differential Privacy, Cryptography
We present Nebula, a system for differentially private histogram estimation on data distributed among clients. Nebula allows clients to independently decide whether to participate in the system, and locally encode their data so that an untrusted server only learns data values whose multiplicity exceeds a predefined aggregation threshold, with ((\varepsilon,\delta)) differential privacy guarantees.
Compared to existing systems, Nebula uniquely achieves:
i) a strict upper bound on client privacy leakage;
ii) significantly higher utility than standard local differential privacy systems; and
iii) no requirement for trusted third-parties, multi-party computation, or trusted hardware.
We provide a formal evaluation of Nebula’s privacy, utility and efficiency guarantees, along with an empirical assessment on three real-world datasets. On the United States Census dataset, clients can submit their data in just 0.0036 seconds and 0.0016 MB (efficient), under strong ((\varepsilon=1,\delta=10^{-8})) differential privacy guarantees (private), enabling Nebula’s untrusted aggregation server to estimate histograms with over 88% better utility than existing local differential privacy deployments (accurate).
Additionally, we describe a variant that allows clients to submit multi-dimensional data, with similar privacy, utility, and performance.
Finally, we provide an implementation of Nebula.