This repository contains algorithms for performing differentially private simple linear regression. We also provide an example usage of the main algorithms.
There are three families of algorithms we've implemented for differentially private (DP) simple linear regression:
- DPGradDescent: A DP mechanism that uses differentially private gradient descent to solve the convex optimization problem that defines OLS (Ordinary Least Squares).
- DPTheilSen: A DP version of Theil-Sen, a robust linear regression estimator that computes the point estimate for every pair of points and outputs the median of these estimates. We consider some variants of this algorithm that use different DP median algorithms: DPExpTheilSen, DPWideTheilSen, and DPSSTheilSen.
- NoisyStats: A DP mechanism that perturbs the sufficient statistics for OLS. It has two main advantages: it is no less efficient than its non-private analogue, and it allows us to release DP versions of the sufficient statistics without any extra privacy cost.
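To give intuition for NoisyStats, here is a minimal sketch of a slope-only version. It is not the repository's implementation: the function name, the `[0, 1]` data assumption, and the three-way budget split (the factor 3, reserving budget for an intercept release in a full version) are assumptions made for illustration.

```python
import numpy as np

def noisy_stats_slope(x, y, epsilon, rng=None):
    """Sketch of NoisyStats for the slope only.

    Assumes x and y are already clipped to [0, 1]. The full algorithm
    would also release a noisy intercept, hence the 3-way budget split.
    """
    rng = np.random.default_rng() if rng is None else rng
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    delta = 1.0 - 1.0 / n  # sensitivity of the centered sums for [0, 1] data
    ncov = np.sum((x - x.mean()) * (y - y.mean()))  # sufficient statistic
    nvar = np.sum((x - x.mean()) ** 2)              # sufficient statistic
    # perturb each sufficient statistic with Laplace noise
    l1 = rng.laplace(scale=3 * delta / epsilon)
    l2 = rng.laplace(scale=3 * delta / epsilon)
    if nvar + l2 <= 0:
        return None  # the mechanism fails; the caller must handle this
    return (ncov + l1) / (nvar + l2)
```

The noisy numerator and denominator can themselves be released: that is the "DP sufficient statistics at no extra privacy cost" advantage noted above.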
- DPGradDescent: link to main code file
- dpMedTS_exp: computes DPExpTheilSen; link to main code file
- dpMedTS_exp_wide: computes DPWideTheilSen; link to main code file
- dpMedTS_ss_ST_no_split: computes DPSSTheilSen with a smooth sensitivity calculation based on the Student's t distribution; link to main code file
- NoisyStats: link to main code file
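As context for the TheilSen variants above, here is a simplified sketch of the DPExpTheilSen idea: compute all pairwise slopes, then select a DP median via the exponential mechanism over intervals. Function names and the fixed slope range are assumptions; using all pairs (rather than a careful pairing) inflates the true sensitivity, which the repository's implementation accounts for more carefully.

```python
import numpy as np

def dp_median_exp(z, epsilon, lower, upper, rng):
    """Exponential-mechanism DP median over [lower, upper] (sketch)."""
    z = np.clip(np.sort(np.asarray(z, float)), lower, upper)
    n = len(z)
    pts = np.concatenate(([lower], z, [upper]))
    # interval i lies between pts[i] and pts[i+1]; utility is minus the
    # distance of its rank from the median rank (rank sensitivity 1)
    utility = -np.abs(np.arange(n + 1) - n / 2)
    widths = pts[1:] - pts[:-1]
    with np.errstate(divide="ignore"):       # width 0 -> log = -inf -> prob 0
        logp = np.log(widths) + (epsilon / 2) * utility
    logp -= logp.max()
    p = np.exp(logp)
    p /= p.sum()
    i = rng.choice(n + 1, p=p)
    return rng.uniform(pts[i], pts[i + 1])   # uniform within chosen interval

def dp_exp_theil_sen(x, y, epsilon, slope_range=(-10.0, 10.0), rng=None):
    """Theil-Sen point estimates for every pair, then a DP median (sketch)."""
    rng = np.random.default_rng() if rng is None else rng
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n) if x[j] != x[i]]
    return dp_median_exp(slopes, epsilon, *slope_range, rng)
```

DPWideTheilSen differs in how the median mechanism treats tightly clustered slopes, and DPSSTheilSen replaces the exponential mechanism with a smooth-sensitivity-based median.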
In example.py, we show how to run each method.
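For intuition on the DPGradDescent family, here is a minimal sketch of noisy gradient descent for the OLS objective. This is not the repository's implementation: the function name, the `[0, 1]` data assumption, the even per-iteration budget split (basic composition), and the L1 clipping with Laplace noise are all illustrative choices.

```python
import numpy as np

def dp_grad_descent(x, y, epsilon, T=50, lr=0.1, clip=1.0, rng=None):
    """Sketch of DP gradient descent for OLS; assumes x, y in [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    theta = np.zeros(2)   # (slope, intercept)
    eps_t = epsilon / T   # even split across iterations (basic composition)
    for _ in range(T):
        pred = theta[0] * x + theta[1]
        # per-example gradients of the squared loss w.r.t. (slope, intercept)
        g = np.stack([2 * (pred - y) * x, 2 * (pred - y)], axis=1)
        l1_norms = np.abs(g).sum(axis=1, keepdims=True)
        g = g / np.maximum(1.0, l1_norms / clip)  # clip each L1 norm to clip
        noise = rng.laplace(scale=clip / eps_t, size=2)
        theta -= lr * (g.sum(axis=0) + noise) / n
    return theta
```

Clipping bounds each example's contribution to the gradient sum, so adding Laplace noise calibrated to `clip` at each step makes every iteration differentially private.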
- Opportunity Insights Data
- Washington, DC Bikeshare UCI Dataset
- Carbon Nanotubes UCI Dataset
- Stock Exchange UCI Dataset
- Synthetic Datasets