
Conversation

@SeanMcCarren (Contributor) commented Oct 6, 2021:

Solves issue #785 and implements the adversarial debiasing paper.

Updated 24 March.

UPDATE

All feedback should now be addressed.

Original description

To summarize important design choices:

  • Follow scikit-learn guidelines to create this estimator
    • fit(X, y, sensitive_features), predict(X), decision_function(X) (see the usage sketch after this list).
    • We handle preprocessing of y and sensitive_features, but the user needs to preprocess X (the models are neural networks, so we require numeric inputs everywhere).
  • Base class is general and mostly handles API, BackendEngine provides PyTorch/TensorFlow-specific code.
  • Allow many kwargs, in order to serve many use cases. Especially note:
    • y and sensitive_features can be from arbitrary distributions. We try to infer the distribution of this data and use this to choose appropriate preprocessors (y_transform and a_transform), loss functions (predictor_loss and adversary_loss), and the decision function (predictor_function). Currently, we only infer whether y or sensitive_features is univariate binomial, univariate multinomial, or multivariate normal. If we can infer such a distribution, we know precisely what to choose as aforementioned kwargs. Otherwise, the user must supply these kwargs explicitly.
    • predictor_optimizer and adversary_optimizer are kwargs, because in practice we see many different optimizers used.
    • callbacks: supporting callback functions is particularly useful (as is done in skorch, for instance).
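A rough usage sketch of the API described above (class and keyword names follow this PR's draft and may differ in the final version; the data here is synthetic):

    import numpy as np
    from fairlearn.adversarial import AdversarialFairness  # draft class name from this PR

    # X must already be numeric; y and sensitive_features are preprocessed internally.
    X = np.random.rand(100, 5).astype(np.float32)
    y = np.random.randint(0, 2, size=100)
    sensitive_features = np.random.randint(0, 2, size=100)

    mitigator = AdversarialFairness(
        predictor_model=[20, 20],   # two hidden layers of 20 nodes; in/out sizes inferred
        predictor_loss="auto",      # inferred from the distribution of y
        adversary_loss="auto",      # inferred from the distribution of sensitive_features
    )
    mitigator.fit(X, y, sensitive_features=sensitive_features)
    y_pred = mitigator.predict(X)
    scores = mitigator.decision_function(X)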

- implemented tensorflow part
- fit, partial_fit framework implementation
- input validation
- added TODOs
- moved some stuff
- thought about structure, now only predictor is variable
- started predict()
- worked on UCI adult example!
@SeanMcCarren changed the title from "Adversarial mitigation" to "ENH Add 'adversarial debiasing'" on Oct 6, 2021
@hildeweerts (Contributor) left a comment:

I have embarrassingly little experience with pytorch/tensorflow, so I've mostly added a few nitpicks re. naming and such.

respectively. If none is specified, default is torch, else tensorflow,
depending on which is installed.
predictor_model : torch.nn.Module, tensorflow.keras.Model
Contributor:

Should we change predictor_model to estimator? I guess it's not strictly an estimator in the scikit-learn sense because it specifically requires a neural network, but it would be more consistent with the reductions module and ThresholdOptimizer.

# Copyright (c) Microsoft Corporation and Fairlearn contributors.
# Licensed under the MIT License.

from tensorflow.keras import Model
Member:

torch and tensorflow are somewhat problematic imports. We can't add them to the default dependencies of fairlearn. However, you could check if they're installed before importing and otherwise surface an error message. We're doing that for matplotlib elsewhere:

raise RuntimeError(_MATPLOTLIB_IMPORT_ERROR_MESSAGE)

Another question is whether we should have a default way of installing tensorflow and torch, for example fairlearn[torch] or fairlearn[tensorflow]. Such "extras" would need to be defined in setup.py, or rather through another requirements-*.txt file.

Finally, we'd need another set of installation tests. You can check test/install for examples on how we do that for matplotlib. Basically, this is to check that everything works as expected in the case that we don't have these packages installed.
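A minimal sketch of such a guarded import (the message constant and helper name are hypothetical; the pattern mirrors the matplotlib handling referenced above):

    # Hypothetical guard for an optional dependency.
    _TORCH_IMPORT_ERROR_MESSAGE = (
        "torch is required for this module; install it separately, "
        "for example via a fairlearn[torch] extra if one is defined."
    )

    try:
        import torch
    except ImportError:
        torch = None


    def _check_torch_installed():
        if torch is None:
            raise RuntimeError(_TORCH_IMPORT_ERROR_MESSAGE)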

alpha : float, default = 0.1
A small number $\alpha$ as specified in the paper.
cuda : bool, default = False
Member:

Would cuda require extra dependencies? If so, we'd need to test this in two configurations: with and without cuda.

@SeanMcCarren (Contributor Author), Oct 8, 2021:

Yes, you need a GPU, a special GPU driver (the NVIDIA CUDA Toolkit, I think), and an extra pip install of torch with CUDA support or something like that (https://pytorch.org/get-started/locally/). But torch.cuda.is_available() should only be True if the system supports CUDA.

Is this at all testable on the CI server? I wouldn't know where to start.

I believe TensorFlow models automatically run on a single GPU if the TensorFlow install is set up properly (with CUDA), and run on the CPU otherwise. So I was thinking about removing this argument and defaulting to using the GPU if available?
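For the PyTorch side, a hedged sketch of that default (the small Linear module here is only a stand-in for the user's predictor network):

    import torch

    # Use the GPU only when the local install and hardware support it; otherwise fall back to CPU.
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    predictor_model = torch.nn.Linear(5, 1)  # stand-in for the user's predictor network
    predictor_model = predictor_model.to(device)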

# Copyright (c) Microsoft Corporation and Fairlearn contributors.
# Licensed under the MIT License.

"""Adversarial techniques to help mitigate fairness disparities."""
Member:

Hmm, broadly I would say "to mitigate unfairness." For this particular one it's more about making the model lose the ability to distinguish between sensitive feature groups, right? That is intended to make it fairer, although there's no real guarantee associated with that. Does anyone else have thoughts about naming (including the class name)?

Contributor:

The technique does specifically optimize for a particular fairness constraint (demographic parity or equalized odds). I think the assumption is that if the model is penalized for learning the sensitive feature, the model's predictions are encouraged to be independent of the sensitive feature, which would satisfy demographic parity. For equalized odds the idea is similar, but then we condition on both the sensitive feature and the ground truth target variable.
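In symbols, with prediction $\hat{Y}$, sensitive feature $A$, and target $Y$: demographic parity asks for $\hat{Y} \perp A$, while equalized odds asks for $\hat{Y} \perp A \mid Y$.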

So maybe something like: "Adversarial techniques for learning neural networks under fairness constraints." Or something like that? I suppose in theory the approach could be extended beyond neural networks, so we could also say "for machine learning under fairness constraints".

FYI in fairlearn.reductions we currently have: "This module contains algorithms implementing the reductions approach to disparity mitigation." - we might want to reconsider that description.

Contributor Author:

Small note: the paper does show that under some typical assumptions (one of which is a sufficiently large adversarial model, and that both models converge, which needn't be true in practice), at convergence the constraint (demographic parity or equalized odds) is satisfied. For some toy examples I was able to consistently reproduce this, but not (yet) for the UCI Adult dataset.

Member:

I mean, this is the __init__ file, not sure it even needs a comment :D

Contributor Author:

I don't like it either, but flake8 is telling me to add it.

Member:

Agreed with others that this is not a biggie, but people copy-paste. So for the sake of consistency with what we say elsewhere, I'd just say "help mitigate unfairness" (fairness disparities is weird).

@SeanMcCarren (Contributor Author):

@hildeweerts @romanlutz The author of the paper confirmed that the sensitive feature and prediction could be more than one-dimensional, so I am working hard to make this work. I want to model the API as follows:

  • For both the predictions and the sensitive features, I would like to set their loss functions according to a kwarg (a rough sketch of this mapping follows the list). For instance, for predictions:
    • 'prediction'=='binary': assume one-dimensional data (with only binary values) and choose sigmoid + binary cross entropy loss (by choosing this, we implicitly assume the data comes from a binomial distribution)
    • 'prediction'=='categorical': N-dimensional data (for N classes) and choose softmax + categorical cross entropy loss (so assuming the data comes from a multinomial distribution, I think)
    • 'prediction'=='regression': N-dimensional data (for N-dimensional continuous predictions) and choose mean squared error loss (which you would do if you assume the data is normally distributed with fixed variance)
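In PyTorch terms, that mapping might look roughly like this (illustrative only, not the final API):

    import torch

    # Illustrative mapping from the proposed 'prediction' kwarg to output activation + loss.
    PREDICTION_SETUPS = {
        # binary: single output, sigmoid + binary cross entropy (binomial assumption)
        "binary": (torch.nn.Sigmoid(), torch.nn.BCELoss()),
        # categorical: N outputs; CrossEntropyLoss applies log-softmax internally (multinomial assumption)
        "categorical": (torch.nn.Identity(), torch.nn.CrossEntropyLoss()),
        # regression: continuous outputs, mean squared error (normal distribution with fixed variance)
        "regression": (torch.nn.Identity(), torch.nn.MSELoss()),
    }

    activation, loss_fn = PREDICTION_SETUPS["binary"]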

However, I have two points of concern:

  1. The first is whether we want to assume that the data is normally distributed in the regression case. Generally, I'd say this is the assumption most people will want, but we do lose some flexibility for the specialized user.
  2. I can imagine some very odd scenario where we want both a categorical and a continuous variable as sensitive_features. We can't do that if we only allow sensitive_features to be either categorical or continuous. I personally don't see an alternative that doesn't require a very complex interface.

@romanlutz (Member):

This is perhaps naive for reasons I haven't quite thought through yet, but how about reading the targets and deciding if it's binary, multiclass, or regression based on that?

ExponentiatedGradient does this (without multiclass) AFAIK without dedicated input variable.

@hildeweerts (Contributor):

For binary/multiclass there's a whole bunch of utils in scikit-learn that may be of help.
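For instance, scikit-learn's type_of_target already performs this kind of inference (shown here purely as an illustration):

    from sklearn.utils.multiclass import type_of_target

    print(type_of_target([0, 1, 1, 0]))      # 'binary'
    print(type_of_target(["a", "b", "c"]))   # 'multiclass'
    print(type_of_target([0.5, 1.2, 3.3]))   # 'continuous'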

For regression a separate AdversarialMitigationRegressor class seems more intuitive to me (this is also a pattern that is common in scikit-learn, e.g., RandomForestClassifier versus RandomForestRegressor).

I bet @adrinjalali has thoughts on this as well.

@adrinjalali (Member):

Yes, it makes much more sense to me to have two classes, which share most of the code in a parent class and do the specific parts for classification and regression there. HistGradientBoosting{Classifier/Regressor} are the most recent estimators added to sklearn; I'd refer to them for reference.

@SeanMcCarren (Contributor Author):

This is perhaps naive for reasons I haven't quite thought through yet, but how about reading the targets and deciding if it's binary, multiclass, or regression based on that?

ExponentiatedGradient does this (without multiclass) AFAIK without dedicated input variable.

@romanlutz That seems like a good idea actually! Now thinking about it, even if the user will want to do regression while all labels are either 0 or 1, a multinomial distribution will fit better than a normal distribution anyway, so we might want to make this decision for the user.

@hildeweerts @adrinjalali Thanks, that structure makes sense! Virtually all code would be shared, but that is also done in BaseHistGradientBoosting so that is good. Then, we force the user to use either classification or regression, not both in one model.

However, the problem remains for the sensitive_features, as the adversary also needs to predict these sensitive features. How does one specify whether variables A, B, C are a one-hot encoded multiclass variable or three independent binary variables? This matters in terms of which loss to use (it encodes the underlying assumption about the data; otherwise we can't train for the correct constraint perfectly).

I'm kind of tempted to only support binary and continuous sensitive_features, as these can be mixed freely and don't span multiple columns (like multiclass as one-hot encoding does), so this would be a clear and concise solution. Or is there a lot of use for also supporting multiclass features, and letting the users map various groups of columns of sensitive_features?

@hildeweerts (Contributor):

Or is there a lot of use for also supporting multiclass features, and letting the users map various groups of columns of sensitive_features?

Tutorials like to pretend that everything is binary, but in practice there's hardly any sensitive feature that can truly be considered binary. So my first reaction would be to do things the other way around: assume none of the features are one-hot encoded and do one-hot encoding internally for multicategorical features (if necessary?)

To distinguish categorical / continuous features I could imagine an argument infer_type (or whatever we want to call it) that's either 'auto' (automatically infer type of sensitive features) or a dict { 'colname1' : 'continuous', 'colname2' : 'categorical' }.

The independence assumption should be described clearly in the documentation btw, because sensitive features may be statistically related even if they are not one-hot-encoded.

@SeanMcCarren (Contributor Author) commented Oct 12, 2021:

Okay, I think I was able to incorporate all of the comments now, so I will write them down here.

Let's call X, Y, Z the input, prediction, and sensitive features from now on. All data are pd.DataFrame, pd.Series, or np.ndarray. No NaNs allowed!

Training

Before training for the first time, we need to preprocess X, Y, Z.

  • Firstly, for each column of X, Y, Z, infer whether the column is binary, categorical, or continuous. Users can supply this explicitly, for instance for the sensitive features: {"column name": "categorical"} (or an integer column index for np arrays). For columns where this is not supplied, we try to infer it using the following rules (a rough sketch follows this list):
    • dataframe/series with dtype='categorical': if there are 2 categories, binary; else categorical
    • dataframe/series of strings: if there are 2 unique items, binary; else categorical
    • a float value that is not integral: immediately assume continuous
    • a column of only 0 or 1: assume binary?
    • What remains are columns of integers that are not all 0 or 1. In that case, raise a ValueError, since it is unclear whether to treat the column as categorical or continuous.
  • Binary columns of strings are translated to 0/1 np.ndarrays or torch.tensors.
  • For categorical columns of strings, we create a mapping to a one-hot encoding using all values present at that time, and expand X, Y, or Z using this mapping. We end up with either an np.ndarray or a torch.tensor containing strictly floats; no more dataframes at this point.
  • Define loss functions per original column: binary cross entropy for every binary column, categorical cross entropy per K columns of a one-hot encoding of K classes, and squared-error loss for continuous columns. The total loss is the sum of all the individual column losses.
  • If the user passed something like predictor_model=[20, 20], then the predictor model is constructed using two hidden layers with 20 nodes each and the inferred number of inputs/outputs (i.e., after expanding the one-hot encodings). If instead the user passed a predictor_model that is an initialized torch.nn.Module or TensorFlow equivalent, then the user has to make sure the dimensions are correct (after expanding the one-hot encodings).
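A rough sketch of that per-column inference (not the final implementation, just the rules above in code):

    import numpy as np
    import pandas as pd


    def infer_column_type(col):
        """Return 'binary', 'categorical', or 'continuous' for a single column."""
        col = pd.Series(col)
        if isinstance(col.dtype, pd.CategoricalDtype):
            return "binary" if len(col.cat.categories) == 2 else "categorical"
        if col.dtype == object:  # strings
            return "binary" if col.nunique() == 2 else "categorical"
        values = col.to_numpy()
        if np.any(np.mod(values, 1) != 0):  # a float that is not integral
            return "continuous"
        if np.isin(values, [0, 1]).all():
            return "binary"
        raise ValueError(
            "Integer column with values other than 0/1: please specify "
            "'categorical' or 'continuous' explicitly."
        )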

Predicting

Preprocess the input using the previous mappings, pass it through the model, and use the mappings to map the output back to its original form.

Sklearn

From sklearn I can use OneHotEncoder.
I should also expose the entire preprocessing step as a sklearn-style transformer. (What should I call it?)
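For reference, a small illustration of the OneHotEncoder round trip that the predict step relies on (synthetic data):

    import numpy as np
    from sklearn.preprocessing import OneHotEncoder

    enc = OneHotEncoder()
    Z = np.array([["a"], ["b"], ["c"], ["a"]])
    Z_encoded = enc.fit_transform(Z).toarray()      # float one-hot columns for the network
    Z_restored = enc.inverse_transform(Z_encoded)   # back to the original string labels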

@MiroDudik (Member) left a comment:

I'm not fully done yet with my pass. I'm focusing on API and documentation.

fairlearn.postprocessing
fairlearn.preprocessing
fairlearn.reductions
fairlearn.adversarial
Member:

Because of the existing conflicts the API docs webpage does not show yet. I'll review it once it renders.

Member:

Conflicts have been resolved, but it still seems that the webpage on CI doesn't process things correctly. Not sure what's going on.

the adversary will attain a loss equal to the entropy, so the adversary
can not
predict the sensitive features from the predictions.
Moreover, this model can be trained for either *demographic parity* or
@MiroDudik (Member), Mar 2, 2022:

In the original paper, they simply suggest restricting training of the adversary to y=0 and y=1. I suggest we leave this for the future though, because the implied notion of fairness would be somewhat different from what we call TruePositiveRateParity and FalsePositiveRateParity.


from numpy import zeros, argmax, arange


class AdversarialFairness(BaseEstimator):
Member:

Actually, I think I'd be in favor of keeping this one as is: AdversarialFairness. Adding Estimator feels very redundant. I haven't found a single instance of the naming pattern ...Estimator for concrete estimators in sklearn. Also, we don't say things like ExponentiatedGradientEstimator, just ExponentiatedGradient.

one-hot encodings, and it maps strictly continuous-valued (possible 2d)
to itself.
a_transform : sklearn.base.TransformerMixin, default = fairlearn.adversarial.FloatTransformer("auto")
Member:

Since the argument name is sensitive_features, I think this should be called sf_transform.

Must be the same type as the
:code:`predictor_model`.
predictor_loss : str, callable, default = 'auto'
Member:

I don't love the current keyword choices for predictor_loss, adversary_loss, because they seem to refer to the type of the target, rather than the loss. If we go that route, we should consider using something similar to sklearn's target types.

Alternatively, we could use keywords that describe the loss, like "square_loss", "logistic_loss"... but let me think a bit more about this.

Contributor Author:

You raise an excellent point. Ideally, we'd also provide something like Y_distribution_type and A_distribution_type. I've thought about this before, but I can't remember why I let go of the idea. We should not get rid of the loss parameters though, and we would have to be sure that the inferred distribution type of the preprocessor agrees.

Member:

I had some further thoughts on this.

I think that the very basic question is whether we want to represent binary classification problems via networks with a single output or two outputs... so that should be decided upfront.

With that said, what do you think about the following tweaks of the current API:

  • base class (currently AdversarialFairness)

    • allow specifying y_transform, sf_transform, predictor_loss, adversary_loss, predictor_function
    • they can take values None (no transformation; this might still mean casting ints as floats?), 'auto' (default), a callable, and additionally (a hypothetical call using these keywords is sketched after this list):
      • for predictor/adversary_loss, we support 'logistic_loss', 'square_loss'
        • if needed, we could also distinguish 'multinomial_logistic_loss' (which acts on one-hot encoding), but this could be inferred from the number of outputs of the network
      • for y/sf_transform, we support 'one_hot_encoder'
      • for predictor_function, we support 'argmax' (for one-hot representation) and 'threshold' (for one-output representation of binary classifiers), with an additional argument threshold_value with the default value 0
  • AdversarialFairnessClassifier

    • allow specifying sf_transform, adversary_loss
    • fill in y_transform, predictor_loss, adversary_loss, predictor_function so as to support (multinomial) logistic regression; I wouldn't allow overriding these (if that's what you want to do, you can just call the base class)
    • besides predict and decision_function, we should consider supporting predict_proba and predict_log_proba
  • AdversarialFairnessRegressor

    • allow specifying sf_transform, predictor_loss, adversary_loss
    • fill in y_transform, predictor_function as None

And one more idea--not sure how much I like it, but something that occurred to me. We could change:

  • predictor_loss -> y_loss
  • adversary_loss -> sf_loss
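Purely as an illustration, a constructor call using these proposed keywords might look like the following (none of these names are final, and 'multinomial_logistic_loss' was itself only floated as a possible addition):

    mitigator = AdversarialFairness(
        predictor_model=[50, 20],
        y_transform=None,                    # y is already in {0, 1}
        predictor_loss="logistic_loss",      # single-output binary classifier
        predictor_function="threshold",      # threshold_value would default to 0
        sf_transform="one_hot_encoder",      # categorical sensitive feature
        adversary_loss="multinomial_logistic_loss",
    )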

Contributor Author:

I like your thoughts! I really appreciate you taking the time!

I just want to reiterate that we made the design choice (it was a suggestion from adrin or hilde, I think) that as much as possible is automatically inferred from the data, i.e., what kind of transform/loss/predictor function to use. Looking at this now, I am not so sure why I included y/sf_transform as parameters, because without this FloatTransformer class (FloatTransformer is the default y/sf_transform) we lose this automatic inference. The nice thing is that if the user has already applied a one-hot encoding, the FloatTransformer class will still infer and tell AdversarialFairness that the data is categorical, which is required to infer what loss function to use. I agree that the current API needs changing, but I do not see yet how we can resolve this nicely.

  • for y/sf_transform, we support 'one_hot_encoder'

I think I like this keyword style better than how it is currently done, but I'm still on the fence. Currently, you can achieve sf_transform='one_hot_encoder' with sf_transform=FloatTransformer('categorical'), so what you suggest is definitely cleaner. What do you think should happen if the user provides a custom transform? Should we then (1) require the user to also pass loss functions, because we can no longer infer them from FloatTransformer? Or (2) should we do the inferring outside of FloatTransformer? Or (3) should we not even let the user pass a custom transform? I find this a difficult choice because all options feel bad in some sense. I've actually switched implementations from (2) to (1) in the past, but I am now tending back towards (2) and using the keywords you suggested. I'd love to hear your thoughts on this. @adrinjalali you were quite involved in this design choice regarding preprocessing in the past, so perhaps you can help us here.

  • for predictor/adversary_loss, we support 'logistic_loss', 'square_loss'

Currently we accept 'binary', 'category', 'continuous'; I chose those because they are descriptive of the distribution that you assume, but I understand that you favor more precise names such as 'logistic_loss', 'square_loss', or 'argmax'? I might actually agree.

  • fill in y_transform, predictor_loss, adversary_loss, predictor_function so as to support (multinomial) logistic regression;

Then we'd still need to infer whether it is categorical or binary, I'd say, hence I'd still love to know the distribution type somehow (preferably by inferring it from the data rather than through a keyword parameter).

And one more idea--not sure how much I like it, but something that occurred to me.

I like this!

  • besides predict and decision_function, we should consider supporting predict_proba and predict_log_proba

Yeah sure!

Member:

Let me try to answer your questions--but also let me know if I left something unanswered!

What to do about 'auto' loss functions and predictor_function with custom y/sf_transform.

  • The behavior should be the same as what we do when the transform is None. I'm not sure what you do currently, but a sensible option would be an automatic inference among the following (a rough code sketch follows this list):
    1. univariate {0.0,1.0} (-> univariate logistic model with sign-based prediction function),
    2. one-hot-encoded categorical (-> multinomial logistic with argmax prediction function),
    3. continuous univariate or vectors (-> square loss/L2 norm with identity prediction function),
    4. otherwise: throw an exception
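In code, that inference might look roughly like this (a sketch only, with assumed helper names):

    import numpy as np
    import torch


    def auto_setup(y):
        """Infer (loss, prediction function) from the provided targets."""
        y = np.asarray(y)
        if y.ndim == 1 and set(np.unique(y)) <= {0.0, 1.0}:
            # univariate {0, 1}: logistic loss, threshold/sign-based prediction
            return torch.nn.BCEWithLogitsLoss(), lambda scores: (scores > 0).astype(float)
        if y.ndim == 2 and set(np.unique(y)) <= {0.0, 1.0} and np.all(y.sum(axis=1) == 1):
            # one-hot-encoded categorical: multinomial logistic loss, argmax prediction
            return torch.nn.CrossEntropyLoss(), lambda scores: scores.argmax(axis=1)
        if np.issubdtype(y.dtype, np.floating):
            # continuous univariate or vectors: square loss, identity prediction
            return torch.nn.MSELoss(), lambda scores: scores
        raise ValueError("Could not infer an appropriate loss; please specify it explicitly.")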

Loss versus link function ("distribution assumption").

  • I think by "distribution assumption" you mean the link function that is used in generalized linear models. I'd be in favor of keeping the idea of link (which is basically our prediction_function) separate from loss function. For example, it would be awkward to specify Huber loss (for robust regression) using the language of link functions.

Categorical vs. binary encoding in classification.

  • For this one, I'm not sure that we have any option other than inferring from the provided y. My idea would be to support binary vs. multiclass classification similarly to sklearn's LogisticRegression.

Contributor Author:

Thanks, this was really clarifying!

What to do about 'auto' loss functions and predictor_function with custom y/sf_transform.

Oh, so we do this inference if the transform is None, and otherwise just do the transform and infer afterwards. That sounds good. Basically decoupling this inference from the transform. Okay! Apart from that, I make the same inferences (points 1 through 4 above). (I initially chose to be a bit more general and accept mixed column types (so, for instance, binary and continuous columns), but we decided there was no use for this.)

Loss versus link function ("distribution assumption").

I must say I am learning a lot here; I was not familiar with these terms. I understand a reason for keeping these things separate now, thanks!

Contributor Author:

@MiroDudik Hey, I finally got around to this; it was a bit more work than I anticipated. What do you think? Internally, I still use "binary", "category", "continuous", because I feel this may be relevant to know later, but the API changed and accepts the better, more concrete terms now.

As for the predict_(log_)proba, I am not sure how to interpret this. In case of a single output neuron $\hat y$, should we unfold it into a vector with separate probabilities for $\hat y = 0$ and $\hat y = 1$? And what if there are more weird cases? For now, we do have decision_function, which might give enough flexibility?

Member:

See my comment above -> if we specialize to logistic models, then predict_proba and predict_log_proba are obvious, right? These would be only provided for AdversarialMitigationClassifier.
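Under that specialization, a minimal sketch of the two methods, assuming the network's decision_function returns logits (a single column for binary, one column per class for multiclass):

    import numpy as np
    from scipy.special import expit, softmax


    def predict_proba(decision):
        decision = np.asarray(decision)
        if decision.ndim == 1:
            # single output neuron: logit of P(Y=1|X); unfold into [P(Y=0), P(Y=1)]
            p1 = expit(decision)
            return np.column_stack([1 - p1, p1])
        # multiclass: softmax over the linear scores
        return softmax(decision, axis=1)


    def predict_log_proba(decision):
        return np.log(predict_proba(decision))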

self.warm_start = warm_start
self.random_state = random_state

def __setup(self, X, Y, A):
Member:

Suggested change
def __setup(self, X, Y, A):
def _setup(self, X, Y, A):

The canonical way to indicate that a method is private is a single leading underscore.

# Numbers
check_scalar(self.threshold_value, "threshold_value", (int, float))

# Non-negative numbers
Member:

I'm agnostic on whether these should be a separate method or not. Either way is fine with me.


class FloatTransformer(BaseEstimator, TransformerMixin):
"""
Transformer that maps dataframes to numpy arrays of floats.
Member:

What I meant was to have a ColumnTransformer that does exactly what this transformer does. The output would be an array of floats, and then nothing would be different. You'd apply your ColumnTransformer internally on your y.

@SeanMcCarren (Contributor Author) commented Apr 14, 2022:

@adrinjalali

What I meant was to have a ColumnTransformer that does exactly what this transformer does. The output would be an array of floats, and then nothing would be different. You'd apply your ColumnTransformer internally on your y.

I like it too; it's much simpler. I was aware of it, but there are some reasons why I am not using it:

  • I need to do all the bookkeeping about types and such to make sure we infer all the variables (loss, predictor, shape and function of last layer of neural networks) according to the same type assumptions
  • ColumnTransformer won't distinguish whether we've passed continuous values or numbers that need to be one hot encoded

Maybe you know of a better way to achieve this?
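For comparison, a ColumnTransformer along those lines might look as follows (the column names here are hypothetical); note that which columns are categorical vs. continuous still has to be listed by hand, which is exactly the bookkeeping concern above:

    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder

    # ColumnTransformer does not infer which columns need one-hot encoding;
    # the split into categorical and continuous columns must be given explicitly.
    preprocessor = ColumnTransformer(
        transformers=[
            ("categorical", OneHotEncoder(), ["race", "sex"]),
            ("continuous", "passthrough", ["age"]),
        ]
    )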

(choose :math:`\alpha` closer to zero) or increasing fairness
(choose larger :math:`\alpha`).
epochs : int, default = 1
Member:

What would be more common? epochs or num_epochs as here: https://aif360.readthedocs.io/en/latest/modules/generated/aif360.sklearn.inprocessing.AdversarialDebiasing.html

Also, I don't see documentation for max_iter.

epochs : int, default = 1
Number of epochs to train for.
batch_size : int, default = -1
Member:

I would be in favor of changing the default for batch_size. I think that any default in the range 1-32 will work (the paper above suggests 2-32). The AIF360 implementation uses num_epochs=50, batch_size=128.

from numpy import zeros, argmax, arange


class AdversarialFairness(BaseEstimator):
@MiroDudik (Member), Apr 16, 2022:

I thought about all of this some more, and I'm still finding parts of the constructor API unnecessarily complicated--and there are various dependencies spread across multiple parameters. In particular, I'd like to suggest an alternative to what's currently handled by: predictor_loss, adversary_loss, predictor_function, y_transform, sf_transform

AdversarialMitigationClassifier

  • can handle binary classification and multi-class classification
  • for encoding of y, we follow sklearn's target types, which are described here (a rough sketch of this handling follows below)
  • Binary classification
    • y has shape (n_samples,) and it contains two values typically {0,1}; if other values are provided they are transformed into {0,1} during fit and predict
    • the neural net is outputting a single scalar, corresponding to the logit of P(Y=1|X), and is trained by minimizing logistic loss
  • Multiclass classification
    • y has either shape (n_samples,n_classes) with 0/1 values implementing one-hot-encoding, or it has shape (n_samples,) and contains 3 or more distinct values; in the latter case, it is transformed into one-hot-encoding during fit and predict
    • the neural net is outputting a vector of n_classes scalars, corresponding to the linear scores of multinomial logistic model of P(Y|X), and is trained by minimizing log loss (this is called CrossEntropyLoss in PyTorch)
  • Adversarial model
    • sensible defaults around fitting sensitive features are a bit more tricky, but I think that we should begin by covering the common case of binary or categorical sensitive features:
      • sensitive_features is shape (n_samples,) or (n_samples,n_sensitive_features) where every column has two distinct values -> columns are transformed into {0,1}, and the training loss is the sum of logistic losses (i.e., treated as the sum of binary problems)
      • sensitive_features is shape (n_samples,) or (n_samples,n_sensitive_features) where at least one column has three or more values -> columns are transformed using one-hot-encoding, and the training loss is the sum of log losses (i.e., treated as the sum of multiclass problems)
  • This means that we could get rid of the parameters predictor_loss, adversary_loss, predictor_function, y_transform, sf_transform.
    • If we want to expose more generality (say continuous sensitive features), we could do that later.

AdversarialMitigationRegressor

  • y has shape (n_samples,) and contains floats, the loss is square loss
  • sensitive features and adversarial model are treated the same way as in AdversarialMitigationClassifier.
  • again, this means we could get rid of predictor_loss, adversary_loss, predictor_function, y_transform, sf_transform, and expose more generality later.

AdversarialMitigation

  • I suggest making this private; in that case I don't care too much about the API.
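A rough sketch of the proposed y handling for the classifier (illustrative only, under the encoding rules above):

    import numpy as np
    from sklearn.preprocessing import LabelBinarizer


    def encode_targets(y):
        """Sketch: encode y as described above for AdversarialMitigationClassifier."""
        y = np.asarray(y)
        if y.ndim == 2:
            # already one-hot encoded: n_classes outputs, trained with log loss
            return y.astype(float)
        classes = np.unique(y)
        if len(classes) == 2:
            # binary: single column in {0, 1}, single-output net trained with logistic loss
            return (y == classes[1]).astype(float).reshape(-1, 1)
        # three or more distinct values: transform into one-hot encoding during fit/predict
        return LabelBinarizer().fit_transform(y).astype(float)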


@adrinjalali (Member):

Meeting summary: have a separate PR from this one with the minimal implementation, as small as we're happy to have it released, and keep the main base class private, at least for now.

@SeanMcCarren (Contributor Author):

@adrinjalali @MiroDudik @romanlutz Moved to #1079 :)

@riedgar-ms (Member):

Is this still active, or should it be closed in favour of #1079 ?

@hildeweerts (Contributor):

Close in favor of #1079

riedgar-ms pushed a commit that referenced this pull request on Oct 27, 2022: "Add Adversarial Debiasing algorithm. Replaces PR #973"