
Conversation

@SeanMcCarren (Contributor) commented Oct 6, 2021:

Solves issue #785 and implements the adversarial debiasing paper.

Updated 24 March.

UPDATE

All feedback should now be addressed.

Original description

To summarize important design choices:

  • Follow scikit-learn guidelines to create this estimator
    • fit(X, y, sensitive_features), predict(X), decision_function(X) (see the usage sketch after this list).
    • We handle preprocessing of y and sensitive_features, but the user needs to preprocess X (the models are neural networks, so we require numeric inputs everywhere).
  • Base class is general and mostly handles API, BackendEngine provides PyTorch/TensorFlow-specific code.
  • Allow many kwargs, in order to serve many use cases. Especially note:
    • y and sensitive_features can be from arbitrary distributions. We try to infer the distribution of this data and use this to choose appropriate preprocessors (y_transform and a_transform), loss functions (predictor_loss and adversary_loss), and the decision function (predictor_function). Currently, we only infer whether y or sensitive_features is univariate binomial, univariate multinomial, or multivariate normal. If we can infer such a distribution, we know precisely what to choose as aforementioned kwargs. Otherwise, the user must supply these kwargs explicitly.
    • predictor_optimizer and adversary_optimizer are kwargs, because in practice we see many different optimizers used.
    • callbacks: supporting callback functions is particularly useful (as is done in skorch, for instance).
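A rough usage sketch of the API described above (class and keyword names follow this PR's draft and may differ in the final version; the data here is synthetic):

    import numpy as np
    from fairlearn.adversarial import AdversarialFairness  # draft class name from this PR

    # X must already be numeric; y and sensitive_features are preprocessed internally.
    X = np.random.rand(100, 5).astype(np.float32)
    y = np.random.randint(0, 2, size=100)
    sensitive_features = np.random.randint(0, 2, size=100)

    mitigator = AdversarialFairness(
        predictor_model=[20, 20],   # two hidden layers of 20 nodes; in/out sizes inferred
        predictor_loss="auto",      # inferred from the distribution of y
        adversary_loss="auto",      # inferred from the distribution of sensitive_features
    )
    mitigator.fit(X, y, sensitive_features=sensitive_features)
    y_pred = mitigator.predict(X)
    scores = mitigator.decision_function(X)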

- implemented tensorflow part
- fit, partial_fit framework implementation
- input validation
- added TODOs
- moved some stuff
- thought about structure, now only predictor is variable
- started predict()
- worked on UCI adult example!
@SeanMcCarren changed the title from "Adversarial mitigation" to "ENH Add 'adversarial debiasing'" on Oct 6, 2021
@hildeweerts (Contributor) left a comment:

I have embarrassingly little experience with pytorch/tensorflow, so I've mostly added a few nitpicks re. naming and such.

respectively. If none is specified, default is torch, else tensorflow,
depending on which is installed.
predictor_model : torch.nn.Module, tensorflow.keras.Model
Contributor:

Should we change predictor_model to estimator? I guess it's not strictly an estimator in the scikit-learn sense because it specifically requires a neural network, but it would be more consistent with the reductions module and ThresholdOptimizer.

# Copyright (c) Microsoft Corporation and Fairlearn contributors.
# Licensed under the MIT License.

from tensorflow.keras import Model
Member:

torch and tensorflow are somewhat problematic imports. We can't add them to the default dependencies of fairlearn. However, you could check if they're installed before importing and otherwise surface an error message. We're doing that for matplotlib elsewhere:

raise RuntimeError(_MATPLOTLIB_IMPORT_ERROR_MESSAGE)

Another question is whether we should have a default way of installing tensorflow and torch, for example fairlearn[torch] or fairlearn[tensorflow]. Such "extras" would need to be defined in setup.py, or rather through another requirements-*.txt file.

Finally, we'd need another set of installation tests. You can check test/install for examples on how we do that for matplotlib. Basically, this is to check that everything works as expected in the case that we don't have these packages installed.
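A minimal sketch of such a guarded import (the message constant and helper name are hypothetical; the pattern mirrors the matplotlib handling referenced above):

    # Hypothetical guard for an optional dependency.
    _TORCH_IMPORT_ERROR_MESSAGE = (
        "torch is required for this module; install it separately, "
        "for example via a fairlearn[torch] extra if one is defined."
    )

    try:
        import torch
    except ImportError:
        torch = None


    def _check_torch_installed():
        if torch is None:
            raise RuntimeError(_TORCH_IMPORT_ERROR_MESSAGE)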

alpha : float, default = 0.1
A small number $\alpha$ as specified in the paper.
cuda : bool, default = False
Member:

Would cuda require extra dependencies? If so, we'd need to test this in two configurations: with and without cuda.

@SeanMcCarren (Contributor Author), Oct 8, 2021:

Yes, you need a GPU, a special GPU driver (the NVIDIA CUDA Toolkit, I think), and an extra pip install of torch with CUDA support or something like that (https://pytorch.org/get-started/locally/). But torch.cuda.is_available() should only be True if the system supports CUDA.

Is this at all testable on the CI server? I wouldn't know where to start.

I believe TensorFlow models automatically run on a single GPU if the TensorFlow install is set up properly (with CUDA), and run on the CPU otherwise. So I was thinking about removing this argument and defaulting to using the GPU if available?
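For the PyTorch side, a hedged sketch of that default (the small Linear module here is only a stand-in for the user's predictor network):

    import torch

    # Use the GPU only when the local install and hardware support it; otherwise fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    predictor_model = torch.nn.Linear(5, 1)  # stand-in for the user's predictor network
    predictor_model = predictor_model.to(device)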

# Copyright (c) Microsoft Corporation and Fairlearn contributors.
# Licensed under the MIT License.

"""Adversarial techniques to help mitigate fairness disparities."""
Member:

Hmm, broadly I would say "to mitigate unfairness." For this particular one it's more about making the model lose the ability to distinguish between sensitive feature groups, right? That is intended to make it fairer, although there's no real guarantee associated with that. Does anyone else have thoughts about naming (including the class name)?

Contributor:

The technique does specifically optimize for a particular fairness constraint (demographic parity or equalized odds). I think the assumption is that if the model is penalized for learning the sensitive feature, the model's predictions are encouraged to be independent of the sensitive feature, which would satisfy demographic parity. For equalized odds the idea is similar, but then we condition on both the sensitive feature and the ground truth target variable.
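In symbols, with prediction $\hat{Y}$, sensitive feature $A$, and target $Y$: demographic parity asks for $\hat{Y} \perp A$, while equalized odds asks for $\hat{Y} \perp A \mid Y$.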

So maybe something like: "Adversarial techniques for learning neural networks under fairness constraints." Or something like that? I suppose in theory the approach could be extended beyond neural networks, so we could also say "for machine learning under fairness constraints".

FYI in fairlearn.reductions we currently have: "This module contains algorithms implementing the reductions approach to disparity mitigation." - we might want to reconsider that description.

Contributor Author:

Small note: the paper does show that under some typical assumptions (one of which is a sufficiently large adversarial model, and that both models converge, which needn't be true in practice), at convergence the constraint (demographic parity or equalized odds) is satisfied. For some toy examples I was able to consistently reproduce this, but not (yet) for the UCI Adult dataset.

Member:

I mean, this is the __init__ file, not sure it even needs a comment :D

Contributor Author:

I don't like it either, but flake8 is telling me to add it.

Member:

Agreed with others that this is not a biggie, but people copy-paste. So for the sake of consistency with what we say elsewhere, I'd just say "help mitigate unfairness" (fairness disparities is weird).

@SeanMcCarren (Contributor Author):

@hildeweerts @romanlutz The author of the paper confirmed that the sensitive feature and prediction could be more than one-dimensional, so I am working hard to make this work. I want to model the API as follows:

  • For both the predictions and the sensitive features, I would like to set their loss functions according to a kwarg (a rough sketch of this mapping follows the list). For instance, for predictions:
    • 'prediction'=='binary': assume one-dimensional data (with only binary values) and choose sigmoid + binary cross entropy loss (by choosing this, we implicitly assume the data comes from a binomial distribution)
    • 'prediction'=='categorical': N-dimensional data (for N classes) and choose softmax + categorical cross entropy loss (so assuming the data comes from a multinomial distribution, I think)
    • 'prediction'=='regression': N-dimensional data (for N-dimensional continuous predictions) and choose mean squared error loss (which you would do if you assume the data is normally distributed with fixed variance)
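In PyTorch terms, that mapping might look roughly like this (illustrative only, not the final API):

    import torch

    # Illustrative mapping from the proposed 'prediction' kwarg to output activation + loss.
    PREDICTION_SETUPS = {
        # binary: single output, sigmoid + binary cross entropy (binomial assumption)
        "binary": (torch.nn.Sigmoid(), torch.nn.BCELoss()),
        # categorical: N outputs; CrossEntropyLoss applies log-softmax internally (multinomial assumption)
        "categorical": (torch.nn.Identity(), torch.nn.CrossEntropyLoss()),
        # regression: continuous outputs, mean squared error (normal distribution with fixed variance)
        "regression": (torch.nn.Identity(), torch.nn.MSELoss()),
    }

    activation, loss_fn = PREDICTION_SETUPS["binary"]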

However, I have two points of concern:

  1. The first is whether we want to assume that the data is normally distributed in the regression case. Generally, I'd say this is the assumption most people will want, but we do lose some flexibility for the specialized user.
  2. I can imagine some very odd scenario where we want both a categorical and a continuous variable as sensitive_features. We can't do that if we only allow sensitive_features to be either categorical or continuous. I personally don't see an alternative that doesn't require a very complex interface.

@romanlutz (Member):

This is perhaps naive for reasons I haven't quite thought through yet, but how about reading the targets and deciding if it's binary, multiclass, or regression based on that?

ExponentiatedGradient does this (without multiclass) AFAIK without dedicated input variable.

@hildeweerts (Contributor):

For binary/multiclass there's a whole bunch of utils in scikit-learn that may be of help.
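For instance, scikit-learn's type_of_target already performs this kind of inference (shown here purely as an illustration):

    from sklearn.utils.multiclass import type_of_target

    print(type_of_target([0, 1, 1, 0]))      # 'binary'
    print(type_of_target(["a", "b", "c"]))   # 'multiclass'
    print(type_of_target([0.5, 1.2, 3.3]))   # 'continuous'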

For regression a separate AdversarialMitigationRegressor class seems more intuitive to me (this is also a pattern that is common in scikit-learn, e.g., RandomForestClassifier versus RandomForestRegressor).

I bet @adrinjalali has thoughts on this as well.

@adrinjalali (Member):

Yes, it makes much more sense to me to have two classes, which share most of the code in a parent class and do the specific parts for classification and regression there. HistGradientBoosting{Classifier/Regressor} are the most recent estimators added to sklearn; I'd refer to them for reference.

@SeanMcCarren (Contributor Author):

This is perhaps naive for reasons I haven't quite thought through yet, but how about reading the targets and deciding if it's binary, multiclass, or regression based on that?

ExponentiatedGradient does this (without multiclass) AFAIK without dedicated input variable.

@romanlutz That seems like a good idea actually! Now thinking about it, even if the user will want to do regression while all labels are either 0 or 1, a multinomial distribution will fit better than a normal distribution anyway, so we might want to make this decision for the user.

@hildeweerts @adrinjalali Thanks, that structure makes sense! Virtually all code would be shared, but that is also done in BaseHistGradientBoosting so that is good. Then, we force the user to use either classification or regression, not both in one model.

However, the problem remains for the sensitive_features, as the adversary also needs to predict these sensitive features. How does one specify whether variables A, B, C are a one-hot encoded multiclass variable or three independent binary variables? This matters in terms of which loss to use (it encodes the underlying assumption about the data; otherwise we can't train for the correct constraint perfectly).

I'm kind of tempted to only support binary and continuous sensitive_features, as these can be mixed freely and don't span multiple columns (like multiclass as one-hot encoding does), so this would be a clear and concise solution. Or is there a lot of use for also supporting multiclass features, and letting the users map various groups of columns of sensitive_features?

@hildeweerts (Contributor):

Or is there a lot of use for also supporting multiclass features, and letting the users map various groups of columns of sensitive_features?

Tutorials like to pretend that everything is binary, but in practice there's hardly any sensitive feature that can truly be considered binary. So my first reaction would be to do things the other way around: assume none of the features are one-hot encoded and do one-hot encoding internally for multicategorical features (if necessary?)

To distinguish categorical / continuous features I could imagine an argument infer_type (or whatever we want to call it) that's either 'auto' (automatically infer type of sensitive features) or a dict { 'colname1' : 'continuous', 'colname2' : 'categorical' }.

The independence assumption should be described clearly in the documentation btw, because sensitive features may be statistically related even if they are not one-hot-encoded.

@SeanMcCarren (Contributor Author) commented Oct 12, 2021:

Okay, I think I was able to incorporate all of the comments now, so I will write them down here.

Let's call X, Y, Z the input, prediction, and sensitive features from now on. All data are pd.DataFrame, pd.Series, or np.ndarray. No NaNs allowed!

Training

Before training for the first time, we need to preprocess X, Y, Z.

  • Firstly, for each column of X, Y, Z, infer whether the column is binary, categorical, or continuous. Users can supply this explicitly, for instance for the sensitive features: {"column name": "categorical"} (or an integer column index for np arrays). For columns where this is not supplied, we try to infer it using the following rules (a rough sketch follows this list):
    • dataframe/series with dtype='categorical': if there are 2 categories, binary; else categorical
    • dataframe/series of strings: if there are 2 unique items, binary; else categorical
    • a float value that is not integral: immediately assume continuous
    • a column of only 0 or 1: assume binary?
    • What remains are columns of integers that are not all 0 or 1. In that case, raise a ValueError, since it is unclear whether to treat the column as categorical or continuous.
  • Binary columns of strings are translated to 0/1 np.ndarrays or torch.tensors.
  • For categorical columns of strings, we create a mapping to a one-hot encoding using all values present at that time, and expand X, Y, or Z using this mapping. We end up with either an np.ndarray or a torch.tensor containing strictly floats; no more dataframes at this point.
  • Define loss functions per original column: binary cross entropy for every binary column, categorical cross entropy per K columns of a one-hot encoding of K classes, and squared-error loss for continuous columns. The total loss is the sum of all the individual column losses.
  • If the user passed something like predictor_model=[20, 20], then the predictor model is constructed using two hidden layers with 20 nodes each and the inferred number of inputs/outputs (i.e., after expanding the one-hot encodings). If instead the user passed a predictor_model that is an initialized torch.nn.Module or TensorFlow equivalent, then the user has to make sure the dimensions are correct (after expanding the one-hot encodings).
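A rough sketch of that per-column inference (not the final implementation, just the rules above in code):

    import numpy as np
    import pandas as pd


    def infer_column_type(col):
        """Return 'binary', 'categorical', or 'continuous' for a single column."""
        col = pd.Series(col)
        if isinstance(col.dtype, pd.CategoricalDtype):
            return "binary" if len(col.cat.categories) == 2 else "categorical"
        if col.dtype == object:  # strings
            return "binary" if col.nunique() == 2 else "categorical"
        values = col.to_numpy()
        if np.any(np.mod(values, 1) != 0):  # a float that is not integral
            return "continuous"
        if np.isin(values, [0, 1]).all():
            return "binary"
        raise ValueError(
            "Integer column with values other than 0/1: please specify "
            "'categorical' or 'continuous' explicitly."
        )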

Predicting

Preprocess the input using the previous mappings, pass it through the model, and use the mappings to map the output back to its original form.

Sklearn

From sklearn I can use OneHotEncoder.
I should also expose the entire preprocessing step as a sklearn-style transformer. (What should I call it?)
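For reference, a small illustration of the OneHotEncoder round trip that the predict step relies on (synthetic data):

    import numpy as np
    from sklearn.preprocessing import OneHotEncoder

    enc = OneHotEncoder()
    Z = np.array([["a"], ["b"], ["c"], ["a"]])
    Z_encoded = enc.fit_transform(Z).toarray()      # float one-hot columns for the network
    Z_restored = enc.inverse_transform(Z_encoded)   # back to the original string labels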

@MiroDudik (Member) left a comment:

I'm not fully done yet with my pass. I'm focusing on API and documentation.

fairlearn.postprocessing
fairlearn.preprocessing
fairlearn.reductions
fairlearn.adversarial
Member:

Because of the existing conflicts the API docs webpage does not show yet. I'll review it once it renders.

Member:

Conflicts have been resolved, but it still seems that the webpage on CI doesn't process things correctly. Not sure what's going on.

the adversary will attain a loss equal to the entropy, so the adversary
can not
predict the sensitive features from the predictions.
Moreover, this model can be trained for either *demographic parity* or
@MiroDudik (Member), Mar 2, 2022:

In the original paper, they simply suggest restricting training of the adversary to y=0 and y=1. I suggest we leave this for the future though, because the implied notion of fairness would be somewhat different from what we call TruePositiveRateParity and FalsePositiveRateParity.


from numpy import zeros, argmax, arange


class AdversarialFairness(BaseEstimator):
Member:

Actually, I think I'd be in favor of keeping this one as is: AdversarialFairness. Adding Estimator feels very redundant. I haven't found a single instance of the naming pattern ...Estimator for concrete estimators in sklearn. Also, we don't say things like ExponentiatedGradientEstimator, just ExponentiatedGradient.

one-hot encodings, and it maps strictly continuous-valued (possible 2d)
to itself.
a_transform : sklearn.base.TransformerMixin, default = fairlearn.adversarial.FloatTransformer("auto")
Member:

Since the argument name is sensitive_features, I think this should be called sf_transform.

Must be the same type as the
:code:`predictor_model`.
predictor_loss : str, callable, default = 'auto'
Member:

I don't love the current keyword choices for predictor_loss, adversary_loss, because they seem to refer to the type of the target, rather than the loss. If we go that route, we should consider using something similar to sklearn's target types.

Alternatively, we could use keywords that describe the loss, like "square_loss", "logistic_loss"... but let me think a bit more about this.

Contributor Author:

You raise an excellent point. Ideally, we'd also provide something like Y_distribution_type and A_distribution_type. I've thought about this before, but I can't remember why I let go of the idea. We should not get rid of the loss parameters though, and we would have to be sure that the inferred distribution type of the preprocessor agrees.

Member:

I had some further thoughts on this.

I think that the very basic question is whether we want to represent binary classification problems via networks with a single output or two outputs... so that should be decided upfront.

With that said, what do you think about the following tweaks of the current API:

  • base class (currently AdversarialFairness)

    • allow specifying y_transform, sf_transform, predictor_loss, adversary_loss, predictor_function
    • they can take values None (no transformation; this might still mean casting ints as floats?), 'auto' (default), a callable, and additionally (a hypothetical call using these keywords is sketched after this list):
      • for predictor/adversary_loss, we support 'logistic_loss', 'square_loss'
        • if needed, we could also distinguish 'multinomial_logistic_loss' (which acts on one-hot encoding), but this could be inferred from the number of outputs of the network
      • for y/sf_transform, we support 'one_hot_encoder'
      • for predictor_function, we support 'argmax' (for one-hot representation) and 'threshold' (for one-output representation of binary classifiers), with an additional argument threshold_value with the default value 0
  • AdversarialFairnessClassifier

    • allow specifying sf_transform, adversary_loss
    • fill in y_transform, predictor_loss, adversary_loss, predictor_function so as to support (multinomial) logistic regression; I wouldn't allow overriding these (if that's what you want to do, you can just call the base class)
    • besides predict and decision_function, we should consider supporting predict_proba and predict_log_proba
  • AdversarialFairnessRegressor

    • allow specifying sf_transform, predictor_loss, adversary_loss
    • fill in y_transform, predictor_function as None

And one more idea--not sure how much I like it, but something that occurred to me. We could change:

  • predictor_loss -> y_loss
  • adversary_loss -> sf_loss
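Purely as an illustration, a constructor call using these proposed keywords might look like the following (none of these names are final, and 'multinomial_logistic_loss' was itself only floated as a possible addition):

    mitigator = AdversarialFairness(
        predictor_model=[50, 20],
        y_transform=None,                    # y is already in {0, 1}
        predictor_loss="logistic_loss",      # single-output binary classifier
        predictor_function="threshold",      # threshold_value would default to 0
        sf_transform="one_hot_encoder",      # categorical sensitive feature
        adversary_loss="multinomial_logistic_loss",
    )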

Contributor Author:

I like your thoughts! I really appreciate you taking the time!

I just want to reiterate that we made the design choice (it was a suggestion from adrin or hilde, I think) that as much as possible is automatically inferred from the data, i.e., what kind of transform/loss/predictor function to use. Looking at this now, I am not so sure why I included y/sf_transform as parameters, because without this FloatTransformer class (FloatTransformer is the default y/sf_transform) we lose this automatic inference. The nice thing is that if the user has already applied a one-hot encoding, the FloatTransformer class will still infer and tell AdversarialFairness that the data is categorical, which is required to infer what loss function to use. I agree that the current API needs changing, but I do not see yet how we can resolve this nicely.

  • for y/sf_transform, we support 'one_hot_encoder'

I think I like this keyword style better than how it is currently done, but I'm still on the fence. Currently, you can achieve sf_transform='one_hot_encoder' with sf_transform=FloatTransformer('categorical'), so what you suggest is definitely cleaner. What do you think should happen if the user provides a custom transform? Should we then (1) require the user to also pass loss functions, because we can no longer infer them from FloatTransformer? Or (2) should we do the inferring outside of FloatTransformer? Or (3) should we not even let the user pass a custom transform? I find this a difficult choice because all options feel bad in some sense. I've actually switched implementations from (2) to (1) in the past, but I am now tending back towards (2) and using the keywords you suggested. I'd love to hear your thoughts on this. @adrinjalali you were quite involved in this design choice regarding preprocessing in the past, so perhaps you can help us here.

  • for predictor/adversary_loss, we support 'logistic_loss', 'square_loss'

Currently we accept 'binary', 'category', 'continuous'; I chose those because they are descriptive of the distribution that you assume, but I understand that you favor more precise names such as 'logistic_loss', 'square_loss', or 'argmax'? I might actually agree.

  • fill in y_transform, predictor_loss, adversary_loss, predictor_function so as to support (multinomial) logistic regression;

Then we'd still need to infer whether it is categorical or binary, I'd say, hence I'd still love to know the distribution type somehow (preferably by inferring it from the data rather than through a keyword parameter).

And one more idea--not sure how much I like it, but something that occurred to me.

I like this!

  • besides predict and decision_function, we should consider supporting predict_proba and predict_log_proba

Yeah sure!

Member:

Let me try to answer your questions--but also let me know if I left something unanswered!

What to do about 'auto' loss functions and predictor_function with custom y/sf_transform.

  • The behavior should be the same as what we do when the transform is None. I'm not sure what you do currently, but a sensible option would be an automatic inference among the following (a rough code sketch follows this list):
    1. univariate {0.0,1.0} (-> univariate logistic model with sign-based prediction function),
    2. one-hot-encoded categorical (-> multinomial logistic with argmax prediction function),
    3. continuous univariate or vectors (-> square loss/L2 norm with identity prediction function),
    4. otherwise: throw an exception
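In code, that inference might look roughly like this (a sketch only, with assumed helper names):

    import numpy as np
    import torch


    def auto_setup(y):
        """Infer (loss, prediction function) from the provided targets."""
        y = np.asarray(y)
        if y.ndim == 1 and set(np.unique(y)) <= {0.0, 1.0}:
            # univariate {0, 1}: logistic loss, threshold/sign-based prediction
            return torch.nn.BCEWithLogitsLoss(), lambda scores: (scores > 0).astype(float)
        if y.ndim == 2 and set(np.unique(y)) <= {0.0, 1.0} and np.all(y.sum(axis=1) == 1):
            # one-hot-encoded categorical: multinomial logistic loss, argmax prediction
            return torch.nn.CrossEntropyLoss(), lambda scores: scores.argmax(axis=1)
        if np.issubdtype(y.dtype, np.floating):
            # continuous univariate or vectors: square loss, identity prediction
            return torch.nn.MSELoss(), lambda scores: scores
        raise ValueError("Could not infer an appropriate loss; please specify it explicitly.")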

Loss versus link function ("distribution assumption").

  • I think by "distribution assumption" you mean the link function that is used in generalized linear models. I'd be in favor of keeping the idea of link (which is basically our prediction_function) separate from loss function. For example, it would be awkward to specify Huber loss (for robust regression) using the language of link functions.

Categorical vs. binary encoding in classification.

  • For this one, I'm not sure that we have any option other than inferring from the provided y. My idea would be to support binary vs. multiclass classification similarly to sklearn's LogisticRegression.

Contributor Author:

Thanks, this was really clarifying!

What to do about 'auto' loss functions and predictor_function with custom y/sf_transform.

Oh, so we do this inference if the transform is None, and otherwise just do the transform and infer afterwards. That sounds good. Basically decoupling this inference from the transform. Okay! Apart from that, I make the same inferences (points 1 through 4 above). (I initially chose to be a bit more general and accept mixed column types (so, for instance, binary and continuous columns), but we decided there was no use for this.)

Loss versus link function ("distribution assumption").

I must say I am learning a lot here; I was not familiar with these terms. I understand a reason for keeping these things separate now, thanks!

Contributor Author:

@MiroDudik Hey, I finally got around to this; it was a bit more work than I anticipated. What do you think? Internally, I still use "binary", "category", "continuous", because I feel this may be relevant to know later, but the API changed and accepts the better, more concrete terms now.

As for the predict_(log_)proba, I am not sure how to interpret this. In case of a single output neuron $\hat y$, should we unfold it into a vector with separate probabilities for $\hat y = 0$ and $\hat y = 1$? And what if there are more weird cases? For now, we do have decision_function, which might give enough flexibility?

Member:

See my comment above -> if we specialize to logistic models, then predict_proba and predict_log_proba are obvious, right? These would be only provided for AdversarialMitigationClassifier.
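Under that specialization, a minimal sketch of the two methods, assuming the network's decision_function returns logits (a single column for binary, one column per class for multiclass):

    import numpy as np
    from scipy.special import expit, softmax


    def predict_proba(decision):
        decision = np.asarray(decision)
        if decision.ndim == 1:
            # single output neuron: logit of P(Y=1|X); unfold into [P(Y=0), P(Y=1)]
            p1 = expit(decision)
            return np.column_stack([1 - p1, p1])
        # multiclass: softmax over the linear scores
        return softmax(decision, axis=1)


    def predict_log_proba(decision):
        return np.log(predict_proba(decision))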

self.warm_start = warm_start
self.random_state = random_state

def __setup(self, X, Y, A):
Member:

Suggested change
def __setup(self, X, Y, A):
def _setup(self, X, Y, A):

The canonical way to indicate that a method is private is a single leading underscore.

# Numbers
check_scalar(self.threshold_value, "threshold_value", (int, float))

# Non-negative numbers
Member:

I'm agnostic on whether these should be a separate method or not. Either way is fine with me.


class FloatTransformer(BaseEstimator, TransformerMixin):
"""
Transformer that maps dataframes to numpy arrays of floats.
Member:

What I meant was to have a ColumnTransformer that does exactly what this transformer does. The output would be an array of floats, and then nothing would be different. You'd apply your ColumnTransformer internally on your y.

@SeanMcCarren (Contributor Author) commented Apr 14, 2022:

@adrinjalali

What I meant was to have a ColumnTransformer that does exactly what this transformer does. The output would be an array of floats, and then nothing would be different. You'd apply your ColumnTransformer internally on your y.

I like it too; it's much simpler. I was aware of it, but there are some reasons why I am not using it:

  • I need to do all the bookkeeping about types and such to make sure we infer all the variables (loss, predictor, shape and function of last layer of neural networks) according to the same type assumptions
  • ColumnTransformer won't distinguish whether we've passed continuous values or numbers that need to be one hot encoded

Maybe you know of a better way to achieve this?
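For comparison, a ColumnTransformer along those lines might look as follows (the column names here are hypothetical); note that which columns are categorical vs. continuous still has to be listed by hand, which is exactly the bookkeeping concern above:

    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder

    # ColumnTransformer does not infer which columns need one-hot encoding;
    # the split into categorical and continuous columns must be given explicitly.
    preprocessor = ColumnTransformer(
        transformers=[
            ("categorical", OneHotEncoder(), ["race", "sex"]),
            ("continuous", "passthrough", ["age"]),
        ]
    )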

(choose :math:`\alpha` closer to zero) or increasing fairness
(choose larger :math:`\alpha`).
epochs : int, default = 1
Member:

What would be more common? epochs or num_epochs as here: https://aif360.readthedocs.io/en/latest/modules/generated/aif360.sklearn.inprocessing.AdversarialDebiasing.html

Also, I don't see documentation for max_iter.

epochs : int, default = 1
Number of epochs to train for.
batch_size : int, default = -1
Member:

I would be in favor of changing the default for batch_size. I think that any default in the range 1-32 will work (the paper above suggests 2-32). The AIF360 implementation uses num_epochs=50, batch_size=128.

from numpy import zeros, argmax, arange


class AdversarialFairness(BaseEstimator):
@MiroDudik (Member), Apr 16, 2022:

I thought about all of this some more, and I'm still finding parts of the constructor API unnecessarily complicated--and there are various dependencies spread across multiple parameters. In particular, I'd like to suggest an alternative to what's currently handled by: predictor_loss, adversary_loss, predictor_function, y_transform, sf_transform

AdversarialMitigationClassifier

  • can handle binary classification and multi-class classification
  • for encoding of y, we follow sklearn's target types, which are described here (a rough sketch of this handling follows below)
  • Binary classification
    • y has shape (n_samples,) and it contains two values typically {0,1}; if other values are provided they are transformed into {0,1} during fit and predict
    • the neural net is outputting a single scalar, corresponding to the logit of P(Y=1|X), and is trained by minimizing logistic loss
  • Multiclass classification
    • y has either shape (n_samples,n_classes) with 0/1 values implementing one-hot-encoding, or it has shape (n_samples,) and contains 3 or more distinct values; in the latter case, it is transformed into one-hot-encoding during fit and predict
    • the neural net is outputting a vector of n_classes scalars, corresponding to the linear scores of multinomial logistic model of P(Y|X), and is trained by minimizing log loss (this is called CrossEntropyLoss in PyTorch)
  • Adversarial model
    • sensible defaults around fitting sensitive features are a bit more tricky, but I think that we should begin by covering the common case of binary or categorical sensitive features:
      • sensitive_features is shape (n_samples,) or (n_samples,n_sensitive_features) where every column has two distinct values -> columns are transformed into {0,1}, and the training loss is the sum of logistic losses (i.e., treated as the sum of binary problems)
      • sensitive_features is shape (n_samples,) or (n_samples,n_sensitive_features) where at least one column has three or more values -> columns are transformed using one-hot-encoding, and the training loss is the sum of log losses (i.e., treated as the sum of multiclass problems)
  • This means that we could get rid of the parameters predictor_loss, adversary_loss, predictor_function, y_transform, sf_transform.
    • If we want to expose more generality (say continuous sensitive features), we could do that later.

AdversarialMitigationRegressor

  • y has shape (n_samples,) and contains floats, the loss is square loss
  • sensitive features and adversarial model are treated the same way as in AdversarialMitigationClassifier.
  • again, this means we could get rid of predictor_loss, adversary_loss, predictor_function, y_transform, sf_transform, and expose more generality later.

AdversarialMitigation

  • I suggest making this private; in that case I don't care too much about the API.
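A rough sketch of the proposed y handling for the classifier (illustrative only, under the encoding rules above):

    import numpy as np
    from sklearn.preprocessing import LabelBinarizer


    def encode_targets(y):
        """Sketch: encode y as described above for AdversarialMitigationClassifier."""
        y = np.asarray(y)
        if y.ndim == 2:
            # already one-hot encoded: n_classes outputs, trained with log loss
            return y.astype(float)
        classes = np.unique(y)
        if len(classes) == 2:
            # binary: single column in {0, 1}, single-output net trained with logistic loss
            return (y == classes[1]).astype(float).reshape(-1, 1)
        # three or more distinct values: transform into one-hot encoding during fit/predict
        return LabelBinarizer().fit_transform(y).astype(float)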


@adrinjalali (Member):

Meeting summary: have a separate PR from this one with the minimal implementation, as small as we're happy to have it released, and keep the main base class private, at least for now.

@SeanMcCarren (Contributor Author):

@adrinjalali @MiroDudik @romanlutz Moved to #1079 :)

@riedgar-ms (Member):

Is this still active, or should it be closed in favour of #1079 ?

@hildeweerts (Contributor):

Close in favor of #1079

riedgar-ms pushed a commit that referenced this pull request on Oct 27, 2022: "Add Adversarial Debiasing algorithm. Replaces PR #973"