Resources
Searching for statistical data privacy resources? You’re in the right place. Here we provide a range of resources, from CRAN packages to influential articles.
Articles
- A Statistical Framework for Differential Privacy (2010) by Larry Wasserman and Shuheng Zhou
- Advances and Open Problems in Federated Learning (2021) by Peter Kairouz et al.
- Balancing Data Privacy and Usability in the Federal Statistical System (2022) by V. Joseph Hotz et al.
- Disclosure Control and Random Tabular Adjustment (2018) by Mark Stinner
- Federated Learning: Challenges, Methods, and Future Directions (2020) by Tian Li et al.
- How Statisticians Should Grapple with Privacy in a Changing Data Landscape (2020) by Joshua Snoke and Claire McKay Bowen
- Policy Impacts of Statistical Uncertainty and Privacy (2022) by Ryan Steed et al.
- Privacy-Preserving Data Analysis for the Federal Statistical Agencies (2017) by John Abowd et al.
- Statistical Data Privacy: A Song of Privacy and Utility (2023) by Aleksandra Slavković and Jeremy Seeman
Books and Book Chapters
- Statistical Disclosure Control for Survey Data (2009) by Chris Skinner
- Protecting Your Privacy in a Data-Driven World (2021) by Claire McKay Bowen
CRAN (R) Packages
- acro: A Tool for Automating the Statistical Disclosure Control of Research Outputs
- AQuadtree: Confidentiality of Spatial Point Data
- cellKey: Consistent Perturbation of Statistical Frequency and Magnitude Tables
- diffpriv: Easy Differential Privacy
- duawranglr: Securely Wrangle Dataset According to Data Usage Agreement
- easySdcTable: Easy Interface to the Statistical Disclosure Control Package ‘sdcTable’ Extended with Own Implementation of ‘GaussSuppression’
- pda: Privacy-Preserving Distributed Algorithms
- ppmf: Read Census Privacy Protected Microdata Files
- ppmHR: Privacy-Protecting Hazard Ratio Estimation in Distributed Data Networks
- PPRL: Privacy Preserving Record Linkage
- RegSDC: Information Preserving Regression-Based Tools for Statistical Disclosure Control
- sdcHierarchies: Create and (Interactively) Modify Nested Hierarchies
- sdcLog: Tools for Statistical Disclosure Control in Research Data Centers
- SDCNway: Tools to Evaluate Disclosure Risk
- sdcSpatial: Statistical Disclosure Control for Spatial Data
- sdcTable: Methods for Statistical Disclosure Control in Tabular Data
- synthpop: Generating Synthetic Versions of Sensitive Microdata for Statistical Disclosure Control
- uwedragon: Data Research, Access, Governance Network: Statistical Disclosure Control
Python Packages
- Google’s Differential Privacy Libraries
- Diffprivlib: General-Purpose Library for Experimenting with, Investigating, and Developing Applications in Differential Privacy
- OpenDP: The Core Library of Differential Privacy Algorithms Powering the OpenDP Project
- PipelineDP: Python Framework for Applying Differentially Private Aggregations to Large Datasets Using Batch Processing Systems such as Apache Spark, Apache Beam, and More
- PySyft: Perform Data Science on Data that Remains in Someone Else’s Server
- PyDP: OpenMined’s Python Differential Privacy Library
- TensorFlow Privacy: Library for Training Machine Learning Models with Privacy for Training Data
- Tumult Analytics: Tumult Labs’ Differential Privacy Library, running on PySpark
Additional Resources
If you have additional resources that you’d like to share on this page, join our group and let us know on our discussion forum.