DOI: 10.1145/3531146.3533203

Adaptive Sampling Strategies to Construct Equitable Training Datasets

Published: 20 June 2022

Abstract

In domains ranging from computer vision to natural language processing, machine learning models have been shown to exhibit stark disparities, often performing worse for members of traditionally underserved groups. One factor contributing to these performance gaps is a lack of representation in the data the models are trained on. It is often unclear, however, how to operationalize representativeness in specific applications. Here we formalize the problem of creating equitable training datasets and propose a statistical framework for addressing it. We consider a setting where a model builder must decide how to allocate a fixed data collection budget to gather training data from different subgroups. We then frame dataset creation as a constrained optimization problem, in which one maximizes a function of group-specific performance metrics based on (estimated) group-specific learning rates and costs per sample. This flexible approach incorporates preferences of model builders and other stakeholders, as well as the statistical properties of the learning task. When data collection decisions are made sequentially, we show that under certain conditions this optimization problem can be efficiently solved even without prior knowledge of the learning rates. To illustrate our approach, we conduct a simulation study of polygenic risk scores on synthetic genomic data—an application domain that often suffers from non-representative data collection. When optimizing policies for overall or group-specific average health, we find that our adaptive approach outperforms heuristic strategies, including equal and representative sampling. In this sense, equal treatment with respect to sampling decisions does not guarantee equal or equitable outcomes.
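To make the abstract's framing concrete, the optimization can be written schematically as follows. This is an illustrative formalization in our own notation — the symbols U, f_g, n_g, c_g, and B are assumptions for exposition, not necessarily the paper's:

\max_{n_1,\ldots,n_G \,\ge\, 0} \; U\bigl(f_1(n_1),\ldots,f_G(n_G)\bigr)
\qquad \text{subject to} \qquad \sum_{g=1}^{G} c_g\, n_g \le B,

where n_g is the number of training samples drawn from group g, c_g the per-sample collection cost, f_g the (estimated) learning curve mapping sample size to group-g performance, and U a welfare function encoding stakeholder preferences (e.g., population-average performance, or the minimum across groups).

Below is a minimal sketch of the sequential version: repeatedly spend the next unit of budget on the group with the largest estimated marginal gain in U per unit cost. The inverse-power-law learning curves and all parameter values are hypothetical placeholders for illustration; the paper's setting estimates learning rates adaptively rather than assuming them known.

```python
import numpy as np

def adaptive_sampling(budget, costs, curve_params, utility, step=1):
    """Greedily allocate a fixed data-collection budget across groups.

    curve_params[g] = (a, b, alpha) parameterizes a hypothetical
    inverse-power-law learning curve f_g(n) = a - b * (n + 1)^(-alpha),
    i.e., group g's performance as a function of its sample size n.
    In a real deployment these parameters would be estimated online.
    """
    G = len(costs)
    n = np.zeros(G)   # samples collected from each group so far
    spent = 0.0

    def perf(g, n_g):
        a, b, alpha = curve_params[g]
        return a - b * (n_g + 1.0) ** (-alpha)

    while spent + min(costs) * step <= budget:
        current = np.array([perf(g, n[g]) for g in range(G)])
        best_g, best_gain = None, -np.inf
        for g in range(G):
            if spent + step * costs[g] > budget:
                continue  # next batch from group g is unaffordable
            trial = current.copy()
            trial[g] = perf(g, n[g] + step)
            # marginal welfare gain per unit cost of sampling group g
            gain = (utility(trial) - utility(current)) / (step * costs[g])
            if gain > best_gain:
                best_g, best_gain = g, gain
        if best_g is None:
            break
        n[best_g] += step
        spent += step * costs[best_g]
    return n

# Two hypothetical groups: group 1 is costlier to sample.
costs = [1.0, 2.0]
params = [(0.95, 0.50, 0.4), (0.90, 0.60, 0.4)]  # made-up (a, b, alpha)
egalitarian = lambda p: p.min()                  # min-over-groups welfare
print(adaptive_sampling(budget=500, costs=costs,
                        curve_params=params, utility=egalitarian))
```

With the egalitarian (min-over-groups) welfare above, the greedy rule channels budget toward the currently worst-off group, echoing the abstract's point that equal or representative sampling need not be the allocation that maximizes equitable outcomes.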




Published In

FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022, 2351 pages
ISBN: 9781450393522
DOI: 10.1145/3531146

Publisher

Association for Computing Machinery, New York, NY, United States


        Author Tags

        1. Active learning
        2. artificial intelligence
        3. computer vision
        4. fairness
        5. machine learning
        6. polygenic risk scores
        7. representative data

        Qualifiers

        • Research-article
        • Research
        • Refereed limited



