
Salary automl fairness #1836

Merged
merged 8 commits on Sep 14, 2024

Conversation

moonlanderr
Collaborator

<insert pull request description here>


Checklist

Please go through each entry in the below checklist and mark an 'X' if that condition has been met. Every entry should be marked with an 'X' to get the Pull Request approved.

  • All imports are in the first cell?
    • First block of imports are standard libraries
    • Second block are 3rd party libraries
    • Third block are all arcgis imports? Note that in some cases, for samples, it is a good idea to keep the imports next to where they are used, particularly for uncommonly used features that we want to highlight.
  • All GIS object instantiations are one of the following?
    • gis = GIS()
    • gis = GIS('home') or gis = GIS('pro')
    • gis = GIS(profile="your_online_portal")
    • gis = GIS(profile="your_enterprise_portal")
  • If this notebook requires setup or teardown, did you add the appropriate code to ./misc/setup.py and/or ./misc/teardown.py?
  • If this notebook references any portal items that need to be staged on AGOL/Python API playground, did you coordinate with a Python API team member to stage the item the correct way with the api_data_owner user?
  • If the notebook requires working with local data (such as CSV, FGDB, SHP, Raster files), upload the files as items to the Geosaurus Online Org using api_data_owner account and change the notebook to first download and unpack the files.
  • Code simplified & split out across multiple cells, useful comments?
  • Consistent voice/tense/narrative style? Thoroughly checked for typos?
  • All images used like <img src="base64str_here"> instead of <img src="https://some.url">? All map widgets contain a static image preview? (Call mapview_inst.take_screenshot() to do so)
  • All file paths are constructed in an OS-agnostic fashion with os.path.join()? (Instead of r"\foo\bar", os.path.join(os.path.sep, "foo", "bar"), etc.)
  • Is your code formatted using Jupyter Black? (You can use Jupyter Black to format your code directly in the notebook.)
  • If this notebook showcases deep learning capabilities, please go through the following checklist:
    • Are the inputs required for Export Training Data Using Deep Learning tool published on geosaurus org (api data owner account) and added in the notebook using gis.content.get function?
    • Is training data zipped and published as an Image Collection? Note: the whole folder is zipped with the same name as the notebook.
    • Are the inputs required for model inferencing published on geosaurus org (api data owner account) and added in the notebook using gis.content.get function? Note: This includes providing test raster and trained model.
    • Are the inferenced results displayed using a webmap widget?
  • IF YOU WANT THIS SAMPLE TO BE DISPLAYED ON THE DEVELOPERS.ARCGIS.COM WEBSITE, ping @jyaistMap so he can add it to the list for the next deploy.


@moonlanderr moonlanderr self-assigned this May 24, 2024

review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:17Z
----------------------------------------------------------------

I think we should clearly call out that we are trying to see if there is any bias towards a specific gender either in this section or in the next before we proceed further. This will set the context and problem statement upfront



review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:17Z
----------------------------------------------------------------

We can remove unwanted/unused imports here


moonlanderr commented on 2024-07-17T04:47:02Z
----------------------------------------------------------------

removed unwanted


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:18Z
----------------------------------------------------------------

Incomplete sentence: "Using this dataset we will attempt to train a model to predict"

Also, any reason why we chose to keep only 104 records?


moonlanderr commented on 2024-07-17T04:46:48Z
----------------------------------------------------------------

corrected.

KarthikDutt commented on 2024-08-21T04:29:58Z
----------------------------------------------------------------

"The dataset consists of 32,561 records, with 21,790 males and 10,771 females, suggesting a bias favoring males." This statement is incorrect. The dataset is biased because males are more likely to be classified as earning more than 50K than women are, not because women are underrepresented in the dataset.

moonlanderr commented on 2024-09-06T05:42:17Z
----------------------------------------------------------------

corrected


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:19Z
----------------------------------------------------------------

Before building the model, we might also consider showing if the data is inherently biased. This can be done by applying some formulas to calculate DPR and EOR. By doing so the story will be:

  • Before training the model, the raw data was biased.
  • After training the model, this bias is amplified. (general problem with ML models)
  • We mitigate it by using fairness features in the API

moonlanderr commented on 2024-07-17T04:49:48Z
----------------------------------------------------------------

I have now shown this by breaking down the male vs. female counts, which indicates the male bias.

KarthikDutt commented on 2024-08-21T04:30:20Z
----------------------------------------------------------------

Please refer to previous comment.

moonlanderr commented on 2024-09-06T05:45:20Z
----------------------------------------------------------------

added a basic analysis of male vs female higher salary stats that indicate bias
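
(For reference, a minimal sketch of such a check is shown below. It assumes the usual Adult Census Income column names, 'sex' and an 'income' column holding '>50K' / '<=50K' values; the notebook's actual column names and file path may differ.)

```python
import pandas as pd

df = pd.read_csv("adult_income.csv")  # hypothetical path to the salary dataset

# Share of each gender that actually earns more than 50K in the raw data.
# A large gap between these two rates (rather than the raw head counts)
# is what indicates the inherent bias discussed above.
high_income_rate = (
    df.assign(high_income=df["income"].str.strip().eq(">50K"))
      .groupby("sex")["high_income"]
      .mean()
)
print(high_income_rate)
```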


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:20Z
----------------------------------------------------------------

Brief descriptions of what happens during prepare_data, fit, score, and report would be useful for users who are using the AutoML API for the first time.


moonlanderr commented on 2024-07-17T05:39:52Z
----------------------------------------------------------------

added descriptions
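
(For first-time users, a rough sketch of that workflow is shown below. It assumes the arcgis.learn tabular API, where the data preparation step is prepare_tabulardata, referred to as prepare_data in this thread; argument names, the income_df dataframe, and the column names are illustrative and should be checked against the official AutoML documentation.)

```python
from arcgis.learn import prepare_tabulardata, AutoML

# 1. Prepare the data: encode the explanatory variables and split train/validation.
data = prepare_tabulardata(
    input_features=income_df,       # the salary dataframe used in this notebook (assumed name)
    variable_predict="income",      # assumed label column: earns >50K or not
    explanatory_variables=[
        "age",
        ("education", True),        # categorical columns flagged as True
        ("occupation", True),       # (check the docs for the exact convention)
        ("sex", True),
    ],
)

# 2. Fit: AutoML searches over algorithms and hyperparameters and trains candidate models.
automl = AutoML(data=data)
automl.fit()

# 3. Score: accuracy of the best model on the validation split.
print(automl.score())

# 4. Report: detailed leaderboard and diagnostics for all trained models.
automl.report()
```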


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:20Z
----------------------------------------------------------------

vanilla-trained may not be the correct term.


moonlanderr commented on 2024-07-17T05:39:33Z
----------------------------------------------------------------

replaced vanilla


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:21Z
----------------------------------------------------------------

Brief explanation of the curves would be good to have


moonlanderr commented on 2024-07-17T06:26:00Z
----------------------------------------------------------------

added a brief explanation, along with a link to refer to for more detail


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:22Z
----------------------------------------------------------------

Since we are bringing in the terms EOR and DPR, we must explain here what these terms mean.


moonlanderr commented on 2024-07-17T06:34:15Z
----------------------------------------------------------------

added the "how AutoML and fairness work" reference, where users can check the details


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:23Z
----------------------------------------------------------------

Here, we should also explain, specific to this example/dataset, what EOR and DPR mean. For example:

  1. DPR means that we are solely concerned with how women in this dataset are represented in the high-income category.
  2. EOR means ..... .

moonlanderr commented on 2024-07-17T07:18:37Z
----------------------------------------------------------------

added brief explanation
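
(For readers who want concrete definitions, both ratios can also be computed with the open-source fairlearn package, used here purely as an illustration with toy labels; the notebook's own values come from the AutoML report. DPR is the smallest group selection rate divided by the largest; EOR applies the same min/max ratio to the true-positive and false-positive rates and reports the worse of the two.)

```python
from fairlearn.metrics import demographic_parity_ratio, equalized_odds_ratio

# Toy data: 1 = earns >50K, 0 = earns <=50K
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
sex    = ["M", "M", "M", "M", "F", "F", "F", "F"]

# DPR: lowest group selection rate divided by the highest (1.0 = perfect parity)
print(demographic_parity_ratio(y_true, y_pred, sensitive_features=sex))

# EOR: min/max ratio of TPR and of FPR across groups, reporting the smaller of the two
print(equalized_odds_ratio(y_true, y_pred, sensitive_features=sex))
```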


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:23Z
----------------------------------------------------------------

Here we will need to talk about sensitive_variable and other new fairness related parameters that we have added and explain what those parameters are and what values they can take.


moonlanderr commented on 2024-07-17T08:16:52Z
----------------------------------------------------------------

added a brief explanation of the sensitive variable
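
(A sketch of how those parameters might be passed is shown below. The names sensitive_variables, fairness_metric, and fairness_threshold are assumptions inferred from this thread and the fairness guide, not a verified signature, and data is the object returned by prepare_tabulardata earlier.)

```python
from arcgis.learn import AutoML

# Fairness-aware run (sketch): parameter names are assumptions from this thread;
# verify them against the AutoML fairness guide before use.
fair_automl = AutoML(
    data=data,                                    # prepared tabular data from earlier
    sensitive_variables=["sex"],                  # column(s) the model must not discriminate on
    fairness_metric="demographic_parity_ratio",   # or "equalized_odds_ratio"
    fairness_threshold=0.8,                       # minimum acceptable ratio; 1.0 means perfect parity
)
fair_automl.fit()
fair_automl.report()                              # leaderboard now includes the fairness metric
```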


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:24Z
----------------------------------------------------------------

We will have to explain how we mitigated the bias. What strategy we are using to mitigate the bias.


moonlanderr commented on 2024-07-17T08:35:20Z
----------------------------------------------------------------

So here I thought to mention the grid search done by AutoML; do you mean something else by strategy? I am not sure what is happening internally. Could you add this section here, referring to the strategy it is using?

KarthikDutt commented on 2024-08-21T04:31:53Z
----------------------------------------------------------------

Internally we are using an approach called Reweighing.

Reweighing is a preprocessing technique that weights the examples in each (group, label) combination differently to ensure fairness before classification.

moonlanderr commented on 2024-09-06T06:51:39Z
----------------------------------------------------------------

added
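
(A minimal sketch of reweighing as described above: each (group, label) cell is weighted by P(group) * P(label) / P(group, label), so that the sensitive attribute and the label look statistically independent under the weights. Column names are illustrative.)

```python
import pandas as pd

def reweighing_weights(df: pd.DataFrame, group_col: str, label_col: str) -> pd.Series:
    """Kamiran-Calders reweighing: weight each (group, label) combination so the
    sensitive attribute and the label appear independent under the weights."""
    n = len(df)
    p_group = df[group_col].value_counts(normalize=True)      # P(group)
    p_label = df[label_col].value_counts(normalize=True)      # P(label)
    p_joint = df.groupby([group_col, label_col]).size() / n   # P(group, label)
    return df.apply(
        lambda row: p_group[row[group_col]] * p_label[row[label_col]]
        / p_joint[(row[group_col], row[label_col])],
        axis=1,
    )

# Toy usage: cells that are underrepresented relative to independence
# (for example, females earning >50K) receive weights above 1.
toy = pd.DataFrame({
    "sex":    ["M", "M", "M", "F", "F", "F"],
    "income": [">50K", ">50K", "<=50K", ">50K", "<=50K", "<=50K"],
})
print(toy.assign(weight=reweighing_weights(toy, "sex", "income")))
```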


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:25Z
----------------------------------------------------------------

When we say 'Hence we can consider the model is now mitigated' , we will have to tell why we consider it mitigated


moonlanderr commented on 2024-07-17T08:38:34Z
----------------------------------------------------------------

I explained that here: "The model report shows that 2_Default_LightGBM_SampleWeigthing_Update_2 is the best trained model and the respective demographic_parity_ratio is now 0.84, up from 0.29, which is also higher than the minimum threshold of 0.80."


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:25Z
----------------------------------------------------------------

Formatting is bad. If needed, we might use an image here so that it is more readable.


moonlanderr commented on 2024-07-17T08:40:40Z
----------------------------------------------------------------

It is showing the table properly in Firefox; is it due to the browser? I think it will be in table format when it gets published; otherwise, I will add it as an image.

KarthikDutt commented on 2024-08-21T04:33:15Z
----------------------------------------------------------------

I am using Chrome.

I feel it is better to use images. If you are sure that it will be fine after publishing, you can ignore this.

moonlanderr commented on 2024-09-06T07:01:23Z
----------------------------------------------------------------

Yes, it will be formatted further by the publishing team, which will give it the proper display.


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:26Z
----------------------------------------------------------------

Bad formatting


moonlanderr commented on 2024-07-17T08:41:22Z
----------------------------------------------------------------

Same as above; it shows OK in the notebook on my machine.


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:27Z
----------------------------------------------------------------

We will need to explain what selection rate means (Definition)


moonlanderr commented on 2024-07-17T08:48:05Z
----------------------------------------------------------------

added
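
(A small illustration of the definition added here: the selection rate of a group is simply the share of that group the model predicts as the positive class, earning more than 50K. Toy values below.)

```python
import pandas as pd

# Selection rate = share of each group predicted to be in the positive class (>50K)
preds = pd.DataFrame({
    "sex": ["M", "M", "M", "M", "F", "F", "F", "F"],   # toy values
    "predicted_high_income": [1, 0, 1, 1, 0, 0, 1, 0],
})
print(preds.groupby("sex")["predicted_high_income"].mean())
```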


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:28Z
----------------------------------------------------------------

This indicates that females were more likely to be incorrectly classified as negative cases.

The above statement does not convey the problem that females might face because of this. Explicitly stating that the model is likely to classify a female as earning less than 50K would describe the problem better.


moonlanderr commented on 2024-07-17T08:51:14Z
----------------------------------------------------------------

added


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:28Z
----------------------------------------------------------------

males are being incorrectly classified as negative cases after mitigation.

Can we use a better term instead of "negative cases"? Users might wonder what negative cases mean in this context.


moonlanderr commented on 2024-07-17T08:53:18Z
----------------------------------------------------------------

added


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:29Z
----------------------------------------------------------------

We might have to add a couple more sentences to describe why we are reducing the threshold from 0.8 to 0.7.

We can probably start by acknowledging that with EOR and a threshold of 0.8, AutoML was not able to find a fair model, but it was able to mitigate the bias to the extent that EOR improved to 0.7, which is still a substantial improvement over 0.17.

Then we can explain that we can formalize this by reducing the threshold to 0.7 in the API as well.


moonlanderr commented on 2024-07-17T08:56:25Z
----------------------------------------------------------------

added
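
(Concretely, this corresponds to re-running the fairness-aware search with the relaxed threshold, with the same caveat as before: the parameter names are assumptions based on this thread, not a verified signature.)

```python
from arcgis.learn import AutoML

# EOR could not reach the default 0.8 threshold but improved from 0.17 to about 0.7,
# so the threshold is relaxed to formalize that result.
eor_automl = AutoML(
    data=data,                                  # prepared tabular data from earlier
    sensitive_variables=["sex"],
    fairness_metric="equalized_odds_ratio",
    fairness_threshold=0.7,                     # relaxed from 0.8, as discussed in this thread
)
eor_automl.fit()
```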


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:30Z
----------------------------------------------------------------

Formatting is not readable.


moonlanderr commented on 2024-07-17T08:57:11Z
----------------------------------------------------------------

Same; I think it will get fixed once published.


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:30Z
----------------------------------------------------------------

Formatting needs to be fixed.


moonlanderr commented on 2024-07-17T08:57:55Z
----------------------------------------------------------------

It will get fixed in publishing, as with earlier notebooks.


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:31Z
----------------------------------------------------------------

The abbreviations FNR, FPR, and SR need to be expanded.

We might have to reword the drawbacks so that it does not appear as if everything we did to mitigate the bias came to nothing because new bias got introduced at the end.


moonlanderr commented on 2024-07-17T09:02:28Z
----------------------------------------------------------------

Removed part of the drawback; the rest is OK, slightly hinting that new biases have crept in, which can be worked on further.


review-notebook-app bot commented Jun 5, 2024

View / edit / reply to this conversation on ReviewNB

KarthikDutt commented on 2024-06-05T09:58:32Z
----------------------------------------------------------------

Conclusion must talk a little bit about how users can mitigate biases in their own datasets. What fairness parameters to choose and when.


moonlanderr commented on 2024-07-17T09:05:32Z
----------------------------------------------------------------

I think this will depend on the user's data and would be a bit open-ended to comment on. Besides, the AutoML fairness guide is there to help with that, and this notebook is already getting a bit heavy with technical detail and concepts.



review-notebook-app bot commented Sep 5, 2024

View / edit / reply to this conversation on ReviewNB

BP-Ent commented on 2024-09-05T21:53:08Z
----------------------------------------------------------------

In this study, we explored the application of fairness metrics in machine learning, particularly focusing on the limitations and benefits of Demographic Parity Ratio (DPR) and Equalized Odds Ratio (EOR) for fairness assessment.

First, we performed an initial fairness assessment of the model predicting salary by utilizing the demographic variable dataset and a vanilla automl workflow. The initial model showed discrepancies in fairness metrics, particularly with higher false positive rates for certain groups revealed by the Demographic Parity Ratio (DPR) and the Equalized Odds Ratio (EOR).

Subsequently, fairness mitigation was done first with DPR and then with EOR. While DPR addressed some aspects of fairness, it fell short in balancing false positive and false negative rates across groups, leading to suboptimal performance in fairness. Then mitigation using the Equalized Odds Ratio metric provided a more comprehensive fairness assessment by ensuring equal false positive and true positive rates across all groups, thereby addressing the limitations observed with DPR.

Finally, adjusting the threshold allowed automl to construct a fair model, which is useful for getting an Ensemble model. Otherwise, if the model is not able to construct a fair model, a model ensemble is not created.

Although some bias might still be present in the model, the mitigation workflow was able to reduce it significantly. Thus, continuous evaluation and refinement of the fairness workflow is crucial for achieving more equitable machine learning models and unbiased decision-making processes.


moonlanderr commented on 2024-09-06T08:51:52Z
----------------------------------------------------------------

added

BP-Ent
BP-Ent previously requested changes Sep 5, 2024
Collaborator

@BP-Ent BP-Ent left a comment

Suggested changes made on reviewNB


@moonlanderr
Collaborator Author

@BP-Ent, all suggestions have been added, please check.

@moonlanderr
Collaborator Author

@KarthikDutt, I have corrected the bias indication paragraph, please check.
