DOI: 10.1145/3531146.3533203

Adaptive Sampling Strategies to Construct Equitable Training Datasets

Published: 20 June 2022

Abstract

In domains ranging from computer vision to natural language processing, machine learning models have been shown to exhibit stark disparities, often performing worse for members of traditionally underserved groups. One factor contributing to these performance gaps is a lack of representation in the data the models are trained on. It is often unclear, however, how to operationalize representativeness in specific applications. Here we formalize the problem of creating equitable training datasets and propose a statistical framework for addressing it. We consider a setting where a model builder must decide how to allocate a fixed data collection budget to gather training data from different subgroups. We then frame dataset creation as a constrained optimization problem, in which one maximizes a function of group-specific performance metrics based on (estimated) group-specific learning rates and costs per sample. This flexible approach incorporates preferences of model builders and other stakeholders, as well as the statistical properties of the learning task. When data collection decisions are made sequentially, we show that under certain conditions this optimization problem can be efficiently solved even without prior knowledge of the learning rates. To illustrate our approach, we conduct a simulation study of polygenic risk scores on synthetic genomic data—an application domain that often suffers from non-representative data collection. When optimizing policies for overall or group-specific average health, we find that our adaptive approach outperforms heuristic strategies, including equal and representative sampling. In this sense, equal treatment with respect to sampling decisions does not guarantee equal or equitable outcomes.
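To make the abstract's framing concrete, the optimization can be written schematically as follows. This is an illustrative formalization in our own notation — the symbols U, f_g, n_g, c_g, and B are assumptions for exposition, not necessarily the paper's:

\max_{n_1,\ldots,n_G \,\ge\, 0} \; U\bigl(f_1(n_1),\ldots,f_G(n_G)\bigr)
\qquad \text{subject to} \qquad \sum_{g=1}^{G} c_g\, n_g \le B,

where n_g is the number of training samples drawn from group g, c_g the per-sample collection cost, f_g the (estimated) learning curve mapping sample size to group-g performance, and U a welfare function encoding stakeholder preferences (e.g., population-average performance, or the minimum across groups).

Below is a minimal sketch of the sequential version: repeatedly spend the next unit of budget on the group with the largest estimated marginal gain in U per unit cost. The inverse-power-law learning curves and all parameter values are hypothetical placeholders for illustration; the paper's setting estimates learning rates adaptively rather than assuming them known.

```python
import numpy as np

def adaptive_sampling(budget, costs, curve_params, utility, step=1):
    """Greedily allocate a fixed data-collection budget across groups.

    curve_params[g] = (a, b, alpha) parameterizes a hypothetical
    inverse-power-law learning curve f_g(n) = a - b * (n + 1)^(-alpha),
    i.e., group g's performance as a function of its sample size n.
    In a real deployment these parameters would be estimated online.
    """
    G = len(costs)
    n = np.zeros(G)   # samples collected from each group so far
    spent = 0.0

    def perf(g, n_g):
        a, b, alpha = curve_params[g]
        return a - b * (n_g + 1.0) ** (-alpha)

    while spent + min(costs) * step <= budget:
        current = np.array([perf(g, n[g]) for g in range(G)])
        best_g, best_gain = None, -np.inf
        for g in range(G):
            if spent + step * costs[g] > budget:
                continue  # next batch from group g is unaffordable
            trial = current.copy()
            trial[g] = perf(g, n[g] + step)
            # marginal welfare gain per unit cost of sampling group g
            gain = (utility(trial) - utility(current)) / (step * costs[g])
            if gain > best_gain:
                best_g, best_gain = g, gain
        if best_g is None:
            break
        n[best_g] += step
        spent += step * costs[best_g]
    return n

# Two hypothetical groups: group 1 is costlier to sample.
costs = [1.0, 2.0]
params = [(0.95, 0.50, 0.4), (0.90, 0.60, 0.4)]  # made-up (a, b, alpha)
egalitarian = lambda p: p.min()                  # min-over-groups welfare
print(adaptive_sampling(budget=500, costs=costs,
                        curve_params=params, utility=egalitarian))
```

With the egalitarian (min-over-groups) welfare above, the greedy rule channels budget toward the currently worst-off group, echoing the abstract's point that equal or representative sampling need not be the allocation that maximizes equitable outcomes.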




Published In

FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022, 2351 pages
ISBN: 9781450393522
DOI: 10.1145/3531146

Publisher

Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. Active learning
        2. artificial intelligence
        3. computer vision
        4. fairness
        5. machine learning
        6. polygenic risk scores
        7. representative data

        Qualifiers

        • Research-article
        • Research
        • Refereed limited



