How Platform-User Power Relations Shape Algorithmic Accountability: A Case Study of Instant Loan Platforms and Financially Stressed Users in India

Divya Ramesh, Vaishnav Kameswaran, Ding Wang and Nithya Sambasivan

Accountability, a requisite for responsible AI, can be facilitated through transparency mechanisms such as audits and explainability. However, prior work suggests that the success of these mechanisms may be limited to Global North contexts; understanding the limitations of current interventions in varied socio-political conditions is crucial to help policymakers facilitate wider accountability. To do so, we examined the mediation of accountability in the existing interactions between vulnerable users and a ‘high-risk’ AI system in a Global South setting. We report on a qualitative study with 29 financially-stressed users of instant loan platforms in India. We found that users experienced intense feelings of indebtedness for the ‘boon’ of instant loans, and perceived huge obligations towards loan platforms. Users fulfilled obligations by accepting harsh terms and conditions, over-sharing sensitive data, and paying high fees to unknown and unverified lenders. Users demonstrated a dependence on loan platforms by persisting with such behaviors despite risks of harms such as abuse, recurring debts, discrimination, privacy harms, and self-harm. Instead of being enraged with loan platforms, users assumed responsibility for their negative experiences, thus releasing the high-powered loan platforms from accountability obligations. We argue that accountability is shaped by platform-user power relations, and urge caution to policymakers in adopting a purely technical approach to fostering algorithmic accountability. Instead, we call for situated interventions that enhance agency of users, enable meaningful transparency, reconfigure designer-user relations, and prompt a critical reflection in practitioners towards wider accountability. We conclude with implications for responsibly deploying AI in FinTech applications in India and beyond.


"There Is Not Enough Information": On the Effects of Explanations on Perceptions of Informational Fairness and Trustworthiness in Automated Decision-Making

Jakob Schoeffer, Niklas Kuehl and Yvette Machowski

Automated decision systems (ADS) are increasingly used for consequential decision-making. These systems often rely on sophisticated yet opaque machine learning models, which do not allow for understanding how a given decision was arrived at. This is not only problematic from a legal perspective, but non-transparent systems are also prone to yield unfair outcomes because their sanity is challenging to assess and calibrate in the first place. In this work, we conduct a human subject study to assess people's perceptions of informational fairness (i.e., whether people think they are given adequate information on and explanation of the process and its outcomes) and trustworthiness of an underlying ADS when provided with varying types of information about the system. More specifically, we instantiate an ADS in the area of automated loan approval and generate different explanations that are commonly used in the literature. We randomize the amount of information that study participants get to see by providing certain groups of people with the same information as others plus additional explanations. From our quantitative analyses, we observe that different amounts of information as well as people's (self-assessed) AI literacy significantly influence the perceived informational fairness, which, in turn, positively relates to perceived trustworthiness of the ADS. A comprehensive analysis of qualitative feedback further points to several surprising and potentially inadvertent effects of providing explanations, which system designers should be mindful of.


#FuckTheAlgorithm: algorithmic imaginaries and political resistance

Garfield Benjamin

This paper applies and extends the concept of algorithmic imaginaries in the context of political resistance. Focusing on the 2020 UK OfQual protests, the role of the "fuck the algorithm" chant is examined as an imaginary of resistance to confront power in sociotechnical systems. The protest is analysed as a turning point in algorithmic imaginaries amidst evolving uses of #FuckTheAlgorithm on social media as part of everyday practices of resistance.


A Data-driven analysis of the interplay between Criminological theory and predictive policing algorithms

Adriane Chapman, Philip Grylls, Pamela Ugwudike, David Gammack and Jacqui Ayling

Previous studies have focused on the unfairness, biases and feedback loops that occur in predictive policing algorithms. These studies show how systemically and institutionally biased data leads to these feedback loops when predictive policing algorithms are applied in real life. We take a step back, and show that the choice in algorithm can be embedded in a specific criminological theory, and that the choice of a model on its own even without biased data can create biased feedback loops. By synthesizing “historic” data, in which we control the relationships between crimes, location and time, we show that the current predictive policing algorithms create biased feedback loops even with completely random data. We then review the process of creation and deployment of these predictive systems, and highlight when “good practices,” such as fitting a model to data, “go bad” within the context of larger system development and deployment. Using best practices from previous work on assessing and mitigating the impact of new technologies, processes, and infrastructure across the domains of environment, information privacy, data protection and human rights, we highlight where the design of these algorithms has broken down. The study also found that multidisciplinary analysis of such systems is vital for uncovering these issues and shows that any study of equitable AI should involve a systematic and holistic analysis of their design rationalities.


A Data-Driven Simulation of the New York State Foster Care System

Yuhao Du, Stefania Ionescu, Melanie Sage and Kenneth Joseph

Signed into law in 2018, the Family First Prevention Services Act (FFPSA) prioritizes keeping American youth out of foster care in the American child welfare system, while achieving racial equity is a well established long-term goal. However, agencies and local governments are still working toward the best way to achieve those goals. The present work introduces a pipeline which combines forensic social science analysis and data-driven simulation to help identify the pathway for achieving multiple goals and evaluate potential strategies under different assumptions about the existing foster care system. Specifically, we identify youth who might be diverted from foster care through forensic social science analysis, and using a system dynamics model informed by forensic data analysis, we build forecasts of youth entry and exits from foster care in New York state, and use it to evaluate an algorithmic intervention proposed in prior work. With respect to the latter contribution, we show that subtle changes within the already complex foster care system can have critical impacts on the performance and biases of the implemented algorithm with respect to the multiple goals. In light of our findings, we recommend machine learning practitioners in public sectors review policy rigorously to decide whether and how previous administrative datasets should be applied to train the automatic decision tools.


A Review of Taxonomies of Explainable Artificial Intelligence (XAI) Methods

Timo Speith

The recent surge in publications related to explainable artificial intelligence (XAI) has led to an almost insurmountable wall if one wants to get started or stay up to date with XAI. For this reason, articles and reviews that present taxonomies of XAI methods seem to be a welcome way to get an overview of the field. Building on this idea, there is currently a trend of producing such taxonomies, leading to several competing approaches to construct them. In this paper, we will review recent approaches to constructing taxonomies of XAI methods and discuss general challenges concerning them as well as their individual advantages and limitations. Our review is intended to help scholars be aware of challenges current taxonomies face. As we will argue, when charting the field of XAI, it may not be sufficient to rely on a single taxonomy. To address this problem, we will discuss ways to integrate the discussed approaches.


ABCinML: Anticipatory Bias Correction in Machine Learning Applications

Abdulaziz Almuzaini, Chidansh Bhatt, David Pennock and Vivek Singh

Static models (i.e., train once, deploy forever) of machine learning (ML) rarely work in practical settings. Besides fluctuations in accuracy over time, they are likely to suffer from varying biases based on past injustices coded in human judgements and mismatches between past and emerging settings. Thus, multiple researchers have begun to explore ways to maintain algorithmic fairness over time. One line of work focuses on "dynamic learning", i.e., retraining after each batch, and the other on "robust learning", which tries to make the algorithms robust across all possible future challenges. Robust learning often yields (overly) conservative models, and "dynamic learning" tries to reduce biases soon after they have occurred. We propose an anticipatory "dynamic learning" approach for correcting the algorithm to prevent bias before it occurs. Specifically, we make use of anticipations regarding the relative distributions of population subgroups (e.g., relative ratio of male and female applicants) in the next cycle to identify the right parameters for an importance weighting fairness approach. Results from experiments over multiple real-world datasets suggest that this approach shows promise for anticipatory bias correction.
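Editor's illustration: the idea of reweighting training samples toward an anticipated subgroup distribution can be sketched as plain importance weighting. This is a minimal sketch under stated assumptions, not the authors' method: the function name `anticipatory_weights`, the group labels, and the anticipated shares are all hypothetical, and the abstract does not say how the anticipations are obtained or how the weights enter training.

```python
from collections import Counter


def anticipatory_weights(groups, anticipated_share):
    """Return one weight per sample.

    Each weight is the anticipated share of the sample's group in the next
    cycle divided by the share observed in the current batch (hypothetical
    inputs; generic importance weighting, not the paper's exact procedure).
    """
    counts = Counter(groups)
    n = len(groups)
    observed_share = {g: c / n for g, c in counts.items()}
    return [anticipated_share[g] / observed_share[g] for g in groups]


# Current batch: one female and three male applicants (illustrative labels).
groups = ["f", "m", "m", "m"]
# Assumed anticipation: an even split in the next cycle.
weights = anticipatory_weights(groups, {"f": 0.5, "m": 0.5})
```

Under these assumptions the under-represented group is up-weighted and the over-represented group is down-weighted, while the weights still sum to the batch size.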


Accountability in an Algorithmic Society: Relationality, Responsibility, and Robustness in Machine Learning

A. Feder Cooper, Benjamin Laufer, Emanuel Moss, and Helen Nissenbaum

In 1996, philosopher Helen Nissenbaum issued a clarion call concerning the erosion of accountability in society due to the ubiquitous delegation of consequential functions to computerized systems. Using the conceptual framing of moral blame, Nissenbaum described four types of barriers to accountability that computerization presented: 1) "many hands," the problem of attributing moral responsibility for outcomes caused by many moral actors; 2) "bugs," a way software developers might shrug off responsibility by suggesting software errors are unavoidable; 3) "computer as scapegoat," shifting blame to computer systems as if they were moral actors; and 4) "ownership without liability," a free pass to the tech industry to deny responsibility for the software they produce. We revisit these four barriers in relation to the recent ascendance of data-driven algorithmic systems, technology often folded under the heading of machine learning (ML) or artificial intelligence (AI), to uncover the new challenges for accountability that these systems present. We then look ahead to how one might construct and justify a moral, relational framework for holding responsible parties accountable, and argue that the FAccT community is uniquely well-positioned to develop such a framework to weaken the four barriers.


Accountable Data: The Politics and Pragmatics of Disclosure Datasets

Lindsay Poirier

This paper attends specifically to what I call "disclosure datasets" - tabular datasets produced in accordance with laws requiring various kinds of disclosure. For the purposes of this paper, the most significant defining feature of disclosure datasets is that they aggregate information produced and reported by the same institutions they are meant to hold accountable. Through a series of case studies, I specifically draw attention to two concerns with disclosure datasets: First, for disclosure datasets, there is often political and social mobilization around the definitions that determine reporting thresholds, which in turn implicates what observations end up in the dataset. Changes in reporting thresholds can be traced along changes in political party power as the aims to promote accountability through mandated disclosure often get pitted against the aims to reduce regulatory burden. Second, for disclosure datasets, the observational unit – what is ultimately being counted in the data – is often not a person, institution, or action but instead a form that the reporting institution is required by law to fill out. Forms infrastructure the information that ends up in the dataset in notable ways. This work contributes to recent calls to promote the transparency and accountability of data science work through improved inquiry into and documentation of the social lineages of source datasets. The analysis of disclosure datasets presented in this paper poses important questions regarding what ultimately gets documented in the data, along with the representativeness and usefulness of these accountability mechanisms.


Achieving Fairness via Post-Processing in Web-Scale Recommender Systems

Preetam Nandy, Cyrus DiCiccio, Divya Venugopalan, Heloise Logan, Kinjal Basu and Noureddine El Karoui

Building fair recommender systems is a challenging and crucial area of study due to its immense impact on society. We extended the definitions of two commonly accepted notions of fairness to recommender systems, namely equality of opportunity and equalized odds. These fairness measures ensure that equally "qualified" (or "unqualified") candidates are treated equally regardless of their protected attribute status (such as gender or race). We propose scalable methods for achieving equality of opportunity and equalized odds in rankings in the presence of position bias, which commonly plagues data generated from recommender systems. Our algorithms are model agnostic in the sense that they depend only on the final scores provided by a model, making them easily applicable to virtually all web-scale recommender systems. We conduct extensive simulations as well as real-world experiments to show the efficacy of our approach.
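Editor's illustration: equality of opportunity is commonly checked by comparing true positive rates across groups. The sketch below does this for a plain score threshold. It is a simplification under stated assumptions: it ignores rankings and position bias, which the abstract treats as central, and the function name, scores, labels, and threshold are hypothetical.

```python
def true_positive_rates(scores, labels, groups, threshold):
    """Per-group share of qualified candidates (label 1) scored at or above threshold."""
    hits, positives = {}, {}
    for score, label, group in zip(scores, labels, groups):
        if label == 1:
            positives[group] = positives.get(group, 0) + 1
            hits[group] = hits.get(group, 0) + (score >= threshold)
    return {g: hits[g] / positives[g] for g in positives}


# Illustrative data: model scores, ground-truth qualification, protected group.
scores = [0.9, 0.4, 0.8, 0.7]
labels = [1, 1, 1, 0]
groups = ["a", "a", "b", "b"]

tpr = true_positive_rates(scores, labels, groups, threshold=0.5)
gap = max(tpr.values()) - min(tpr.values())
```

A gap of zero would indicate that qualified candidates from each group are selected at the same rate at this threshold. A post-processing method would adjust scores or thresholds per group to shrink the gap.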


Adaptive Sampling Strategies to Construct Equitable Training Datasets

William Cai, Ro Encarnacion, Bobbie Chern, Sam Corbett-Davies, Miranda Bogen, Stevie Bergman and Sharad Goel

In domains ranging from computer vision to natural language processing, machine learning models have been shown to exhibit stark disparities, often performing worse for members of traditionally underserved groups. One factor contributing to these performance gaps is a lack of representation in the data the models are trained on. It is often unclear, however, how to operationalize representativeness in specific applications. Here we formalize the problem of creating equitable training datasets, and propose a statistical framework for addressing this problem. We consider a setting where a model builder must decide how to allocate a fixed data collection budget to gather training data from different subgroups. We then frame dataset creation as a constrained optimization problem, in which one maximizes a function of group-specific performance metrics based on (estimated) group-specific learning rates and costs per sample. This flexible approach incorporates preferences of model-builders and other stakeholders, as well as the statistical properties of the learning task. When data collection decisions are made sequentially, we show that under certain conditions this optimization problem can be efficiently solved even without prior knowledge of the learning rates. To illustrate our approach, we conduct a simulation study of polygenic risk scores on synthetic genomic data—an application domain that often suffers from non-representative data collection. We find that our adaptive sampling strategy outperforms several common data collection heuristics, including equal and proportional sampling, demonstrating the value of strategic dataset design for building equitable models.
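Editor's illustration: allocating a fixed data collection budget across subgroups can be sketched as a greedy procedure. This is a minimal sketch under stated assumptions, not the authors' framework: it assumes each group's error follows a known power law a * n**(-b), takes the objective to be the sum of group errors, and spends the budget one sample at a time on the group with the largest error reduction per unit cost. All names and numbers are hypothetical.

```python
def greedy_allocation(budget, curves, costs):
    """Allocate samples to groups greedily.

    curves: {group: (a, b)} for an assumed error curve a * n**(-b).
    costs:  {group: positive cost per sample}.
    Returns {group: number of samples}.
    """
    n = {g: 1 for g in curves}  # start at one sample per group so n is never 0
    spent = sum(costs[g] for g in curves)

    def gain(g):
        # Error reduction from one more sample in group g, per unit cost.
        a, b = curves[g]
        return (a * n[g] ** (-b) - a * (n[g] + 1) ** (-b)) / costs[g]

    while True:
        affordable = [g for g in curves if spent + costs[g] <= budget]
        if not affordable:
            break
        best = max(affordable, key=gain)
        n[best] += 1
        spent += costs[best]
    return n


# Group B is assumed to start with twice the error of group A at equal cost.
alloc = greedy_allocation(
    budget=100,
    curves={"A": (1.0, 0.5), "B": (2.0, 0.5)},
    costs={"A": 1.0, "B": 1.0},
)
```

With these assumed curves the group with the higher error receives the larger share of the budget. The abstract's sequential setting, where learning rates are unknown in advance, would replace the fixed (a, b) pairs with running estimates.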


Adversarial Scrutiny of Evidentiary Statistical Software

Rediet Abebe, Moritz Hardt, Angela Jin, John Miller, Ludwig Schmidt and Rebecca Wexler

The US criminal legal system increasingly relies on software outputs to convict and incarcerate people. In a vast number of cases each year, the government makes these consequential decisions based on evidence from software tools that the defense counsel cannot fully cross-examine or scrutinize. This offends the commitments of the adversarial criminal legal system, which relies on the defense’s ability to probe and test the prosecution’s case to seek truth and safeguard individual rights. At present, there are no technical standards to adversarially scrutinize output from software used as evidence at trial. This gap has led a variety of government bodies, advocacy groups, and other stakeholders to call for standards and programs to systematically examine the reliability of software tools. In this paper, we propose robust adversarial scrutiny as a framework for questioning evidence output from statistical software, which range from probabilistic genotyping tools, to environment audio detection, and toolmark analysis. Drawing on a large body of recent work in robust machine learning and algorithmic fairness, we define and operationalize this notion of robust adversarial scrutiny for defense use. We demonstrate how this framework both standardizes the process for scrutinizing such tools and empowers defense lawyers to examine reliability of these tools for instances most relevant to the case at hand. We further discuss existing structural and institutional challenges within the US criminal legal system which may create barriers for implementing this framework and close with a discussion on policy changes that could help address these concerns.


Affirmative Algorithms: Relational Equality as Algorithmic Fairness

Marilyn Zhang

Many statistical fairness notions have been proposed for algorithmic decision-making systems, and especially public safety pretrial risk assessment (PSPRA) algorithms such as COMPAS. Most fairness notions equalize something between groups, whether it is false positive rates or accuracy. In fact, I demonstrate that most prominent notions have their basis in equalizing some form of accuracy. However, statistical fairness metrics often do not capture the substantive point of equality. I argue that equal accuracy is not only difficult to measure but also unsatisfactory for ensuring equal justice. In response, I introduce philosopher Elizabeth Anderson’s theory of relational equality as a fruitful alternative framework: to relate as equals, people need access to certain basic capabilities. I show that relational equality requires Affirmative PSPRA algorithms that lower risk scores for Black defendants. This is because fairness based on relational equality means considering the impact of PSPRA algorithms’ decisions on access to basic capabilities. This impact is racially asymmetric in an unjust society. I make three main contributions: (1) I illustrate the shortcomings of statistical fairness notions in their reliance on equalizing some form of accuracy; (2) I present the first comprehensive ethical defense of Affirmative PSPRA algorithms, based on fairness in terms of relational equality instead; and (3) I show that equalizing accuracy is neither sufficient nor necessary for fairness based on relational equality. Overall, this work serves narrowly as a reason to re-evaluate algorithmic fairness for PSPRA algorithms, and serves broadly as an example of how discussions of algorithmic fairness can benefit from egalitarian philosophical frameworks.


AI Ethics Statements - Analysis and Lessons Learnt from NeurIPS Broader Impact Statements

Carolyn Ashurst, Emmie Hine, Paul Sedille and Alexis Carlier

Ethics statements have been proposed as a mechanism to increase transparency and promote reflection on the societal impacts of published research. In 2020, the machine learning (ML) conference NeurIPS broke new ground by requiring that all papers include a broader impact statement. This requirement was removed in 2021, in favour of a checklist approach. The 2020 statements therefore provide a unique opportunity to learn from the broader impact experiment: to investigate the benefits and challenges of this and similar governance mechanisms, as well as providing an insight into how ML researchers think about the societal impacts of their own work. Such learning is needed as NeurIPS and other venues continue to question and adapt their policies. To enable this, we have created a dataset containing the impact statements from all NeurIPS 2020 papers, along with additional information such as affiliation type, location and subject area, and a simple visualisation tool for exploration. We also provide an initial quantitative analysis of the dataset, covering representation, engagement, common themes, and willingness to discuss potential harms alongside benefits. We investigate how these vary by geography, affiliation type and subject area. Drawing on these findings, we discuss the potential benefits and negative outcomes of ethics statement requirements, and their possible causes and associated challenges. These lead us to several lessons to be learnt from the 2020 requirement: (i) the importance of creating the right incentives, (ii) the need for clear expectations and guidance, and (iii) the importance of transparency and constructive deliberation. We encourage other researchers to use our dataset to provide additional analysis, to further our understanding of how researchers responded to this requirement, and to investigate the benefits and challenges of this and related mechanisms.


AI Opacity and Explainability in Tort Litigation

Henry Fraser, Rhyle Simcock and Aaron Snoswell

A spate of recent accidents and a lawsuit involving Tesla’s ‘self-driving’ cars highlight the growing need for meaningful accountability when harms are caused by AI systems. Tort (or civil liability) lawsuits are one important way for victims to redress harms. The prospect of tort liability may also prompt AI developers to take better precautions against safety risks. Tort claims of all kinds will be hindered by AI opacity: the difficulty of determining how and why complex AI systems make predictions. We address this problem by formulating and evaluating several options for mitigating AI opacity that combine expert evidence, legal argumentation, civil procedure, and Explainable AI approaches. We emphasise the need for explanations of AI systems in tort litigation to be attuned to the elements of legal ‘causes of action’ – the specific facts that must be proven to succeed in a lawsuit. We take a recent Australian case involving explainable AI evidence as a starting point from which to map contemporary Explainable AI approaches to elements of tortious causes of action, focusing on misleading conduct, negligence, and product liability for safety defects. Our work synthesizes law, legal procedure, and computer science literature to provide greater clarity on the opportunities and challenges for Explainable AI in civil litigation, and may prove helpful to potential litigants and courts and help illuminate key targets for regulatory intervention.


Algorithmic Fairness and Vertical Equity: Income Fairness with Tax Audit Models

Emily Black, Hadi Elzayn, Alexandra Chouldechova, Jacob Goldin and Daniel Ho

This study examines issues of algorithmic fairness in the context of systems that inform tax audit selection by the United States Internal Revenue Service (IRS). While the field of algorithmic fairness has developed primarily around notions of treating like individuals alike, we instead explore the concept of vertical equity (appropriately accounting for relevant differences across individuals), which is a central component of fairness in many public policy settings. Applied to the design of the U.S. individual income tax system, vertical equity relates to the fair allocation of tax and enforcement burdens across taxpayers of different income levels. Through a unique collaboration with the IRS, we use access to detailed, anonymized individual taxpayer microdata, risk-selected audits, and random audits from 2010-14 to study vertical equity in tax administration. In particular, we assess how the adoption of modern machine learning methods for selecting taxpayer audits may affect vertical equity. Our paper makes four contributions. First, we show how the adoption of more flexible machine learning (classification) methods, as opposed to simpler models, shapes vertical equity by shifting audit burdens from high to middle-income taxpayers. Second, given concerns about high audit rates of low-income taxpayers, we investigate how existing algorithmic fairness techniques would change the audit distribution. We find that such methods can mitigate some disparities across income buckets, but that these come at a steep cost to performance. Third, we show that the choice of whether to treat risk of underreporting as a classification or regression problem is highly consequential. Moving from a classification approach to a regression approach to predict the expected magnitude of underreporting shifts the audit burden substantially toward high income individuals, while increasing revenue. Last, we investigate the role of differential audit cost in shaping the distribution of audits. Audits of lower income taxpayers, for instance, are typically conducted by mail and hence pose much lower cost to the IRS. These results show that a narrow focus on return-on-investment can undermine vertical equity. Our results have implications for ongoing policy debates and the design of algorithmic tools across the public sector.


Algorithmic Tools in Public Employment Services: Towards a Jobseeker-Centric Perspective

Co-Winner: Distinguished Student Paper Award

Kristen Scott, Sonja Mei Wang, Milagros Miceli, Pieter Delobelle, Karolina Sztandar-Sztanderska and Bettina Berendt

Algorithmic and data-driven systems have been introduced to assist Public Employment Services (PES) in government agencies throughout the world. Their deployment has sparked public controversy and some of these systems have been removed from use or seen their roles significantly reduced as a consequence. Yet the implementation of such systems continues. In this paper, we use a participatory approach to determine a course forward for research and development in this area. Our investigation comprises two workshops: one fact-finding workshop with academics, system developers, the public sector, and civil-society organizations, the second a co-design workshop held with 13 unemployed migrants to Germany. Based on the discussion in the fact-finding workshop we identified challenges of existing PES (algorithmic) systems. From the co-design workshop we identified jobseekers' desiderata when contacting PES, namely the need for human contact, the expectation to receive genuine orientation, and the desire to be seen as a whole human being. We map these expectations to three design considerations for algorithmic systems for PES, i.e., the importance of interpersonal interaction, jobseeker assessment as direction, and the challenge of mitigating misrepresentation. Finally, we argue that the limitations and risks of current systems cannot be addressed through minor adjustments but they rather require a more fundamental change to the role of PES.


Algorithms Off-Limits?: If digital Trade Law Restricts Access to Source Code of Software then Accountability will Suffer

Kristina Irion

International trade agreements are increasingly used to introduce an additional layer of protection for source code of software. The European Union (EU) entered into commitments on software source code in its recent bilateral trade agreements with Japan, Mexico and the United Kingdom. Supported by the EU, a new provision on source code of software is part of the ambitious set of new rules on trade-related aspects of electronic commerce currently negotiated by 86 WTO members. While the exact language varies between trade agreements, such a provision typically prohibits a party’s measures that require access to, or transfer of, the source code of software, subject to certain exceptions. To date, our understanding of how such a provision inside trade law impacts on a party’s right to regulate software source code is limited. For determining the scope of a source code discipline, the exact meaning of the term source code of software is decisive. While the EU negotiators consider computer algorithms and artificial intelligence to be outside the scope of such a provision, several other negotiating parties seek to explicitly include “an algorithm expressed in that source code” in the wording of the provision. What worries pundits and rights advocates is that the source code provision could hamper future EU regulation in the field of artificial intelligence and algorithmic decision-making (ADM). This article will analyse how the software source code provision inside international trade law, in particular WTO law, would be interpreted and applied. After all, computer and machine learning algorithms are expressed in source code, and the interfaces of an algorithmic system are also protected as source code of software. A particular focus of the article will be on the implications of such a source code provision for current and future EU policy that aims to ensure transparency and accountability of ADM and machine learning algorithms.
The methodology used in this article is doctrinal legal, comparative and qualitative research. The article’s empirical basis is trade law agreements, academic literature, official documents, and grey literature from stakeholders. The article’s findings will be of particular relevance for the multidisciplinary participants of FAccT22. Considering that digitalization leads to more and more digital artefacts made of software source code, the source code provision may turn out too broad for EU domestic digital policies that need to build on interoperability, accountability, and verifiability of digital technologies.


An Algorithmic Framework for Bias Bounties

Ira Globus-Harris, Michael Kearns and Aaron Roth

Notions of fair machine learning that seek to control various kinds of error across protected groups generally are cast as constrained optimization problems over a fixed model class. For all such problems, tradeoffs arise: asking for various kinds of technical fairness requires compromising on overall error, and adding more protected groups increases error rates across all groups. Our goal is to ``break though'' such accuracy-fairness tradeoffs, also known as Pareto frontiers. We develop a simple algorithmic framework that allows us to deploy models and then revise them dynamically when groups are discovered on which the error rate is suboptimal. Protected groups do not need to be specified ahead of time: At any point, if it is discovered that there is some group on which our current model is performing substantially worse than optimally, then there is a simple update operation that improves the error on that group without increasing either overall error, or the error on any previously identified group. We do not restrict the complexity of the groups that can be identified, and they can intersect in arbitrary ways. The key insight that allows us to break through the tradeoff barrier is to dynamically expand the model class as new high error groups are identified. The result is provably fast convergence to a model that cannot be distinguished from the Bayes optimal predictor --- at least by the party tasked with finding high error groups. We explore two instantiations of this framework: as a ``bias bug bounty'' design in which external auditors are invited (and monetarily incentivized) to discover groups on which our current model's error is suboptimal, and as an algorithmic paradigm in which the discovery of groups on which the error is suboptimal is posed as an optimization problem. In the bias bounty case, when we say that a model cannot be distinguished from Bayes optimal, we mean by any participant in the bounty program. 
We provide both theoretical analysis and experimental validation.
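
The update operation described in this abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `error_on`, `patch_model`, and `accept_bounty`, the toy data, and the `min_gain` acceptance threshold are all assumptions made for illustration. Given a current model `f`, a group indicator `g`, and a candidate model `h` that does better on that group, the patched model defers to `h` inside the group and to `f` everywhere else, so overall error cannot go up.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[float], int]
Group = Callable[[float], bool]
Data = Sequence[Tuple[float, int]]


def error_on(model: Model, data: Data, group: Group = lambda x: True) -> float:
    """Misclassification rate of `model` on the points of `data` inside `group`."""
    members = [(x, y) for x, y in data if group(x)]
    if not members:
        return 0.0
    return sum(model(x) != y for x, y in members) / len(members)


def patch_model(f: Model, g: Group, h: Model) -> Model:
    """Return a model that uses h inside group g and f everywhere else."""
    return lambda x: h(x) if g(x) else f(x)


def accept_bounty(f: Model, g: Group, h: Model, data: Data, min_gain: float = 0.0) -> Model:
    """Hypothetical acceptance rule: patch only if h beats f on g by more than min_gain."""
    if error_on(f, data, g) - error_on(h, data, g) > min_gain:
        return patch_model(f, g, h)
    return f


# Toy data (assumed): the label is 1 when x > 0, except in the group x >= 5, where it is 0.
data = [(x, int(x > 0) if x < 5 else 0) for x in range(-5, 10)]
f = lambda x: int(x > 0)   # current model, wrong on every point with x >= 5
g = lambda x: x >= 5       # group submitted by an auditor
h = lambda x: 0            # auditor's model, which is better on g
f2 = accept_bounty(f, g, h, data)
```

The sketch shows a single update only. It does not reproduce the guarantee for previously identified, possibly intersecting groups, and it evaluates on the same data it patches with, which a real deployment would avoid.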


An Outcome Test of Discrimination for Ranked Lists

Jonathan Roth, Guillaume Saint-Jacques and Yinyin Yu

This paper extends Becker (1957)'s outcome test of discrimination to settings where a (human or algorithmic) decision-maker produces a ranked list of candidates. Ranked lists are particularly relevant in the context of online platforms that produce search results or feeds, and also arise when human decisionmakers express ordinal preferences over a list of candidates. We show that non-discrimination implies a system of moment inequalities, which intuitively impose that one cannot permute the position of a lower-ranked candidate from one group with a higher-ranked candidate from a second group and systematically improve the objective. Moreover, we show that these moment inequalities are the only testable implications of non-discrimination when the auditor observes only outcomes and group membership by rank. We show how to statistically test the implied inequalities, and validate our approach in an application using data from LinkedIn.


Are "Intersectionally Fair" AI Algorithms Really Fair to Women of Color? A Philosophical Analysis

Youjin Kong

A growing number of studies on fairness in artificial intelligence (AI) use the notion of intersectionality to measure AI fairness. Most of these studies take intersectional fairness to be a matter of statistical parity among intersectional subgroups: an AI algorithm is "intersectionally fair" if the probability of the outcome is roughly the same across all subgroups defined by different combinations of the protected attributes. This paper identifies and examines three fundamental problems with this dominant interpretation of intersectional fairness in AI. First, the dominant approach is so preoccupied with the intersection of attributes/categories (e.g., race, gender) that it fails to address the intersection of oppression (e.g., racism, sexism), which is more central to intersectionality as a critical framework. Second, the dominant approach faces a dilemma between infinite regress and fairness gerrymandering: it either keeps splitting groups into smaller subgroups or arbitrarily selects protected groups. Lastly, the dominant view fails to capture what it really means for AI algorithms to be fair, in terms of both distributive and non-distributive fairness. I distinguish a strong sense of AI fairness from a weak sense that is prevalent in the literature, and conclude by envisioning paths towards strong intersectional fairness in AI.
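
For readers unfamiliar with the statistical-parity reading of intersectional fairness that this abstract critiques, the following Python sketch computes outcome rates per subgroup. It illustrates only that dominant definition, not the paper's argument or any proposal in it; the function names, attribute names, and toy records are assumptions.

```python
from collections import defaultdict


def subgroup_rates(records, attrs, outcome="selected"):
    """Positive-outcome rate for every observed combination of the given attributes."""
    counts = defaultdict(lambda: [0, 0])  # key -> [positives, total]
    for r in records:
        key = tuple(r[a] for a in attrs)
        counts[key][0] += r[outcome]
        counts[key][1] += 1
    return {key: pos / n for key, (pos, n) in counts.items()}


def parity_gap(rates):
    """Largest difference in outcome rate between any two subgroups."""
    return max(rates.values()) - min(rates.values())


# Toy records (assumed): parity holds for each attribute alone
# but fails badly at the intersections.
rows = [("A", "F", 1), ("A", "F", 1), ("A", "M", 0), ("A", "M", 0),
        ("B", "F", 0), ("B", "F", 0), ("B", "M", 1), ("B", "M", 1)]
records = [{"race": r, "gender": g, "selected": y} for r, g, y in rows]

gap_race = parity_gap(subgroup_rates(records, ["race"]))
gap_gender = parity_gap(subgroup_rates(records, ["gender"]))
gap_both = parity_gap(subgroup_rates(records, ["race", "gender"]))
```

In this toy data the gap is zero for race alone and for gender alone, yet maximal across the four intersections. Which attributes to list in `attrs` is exactly the kind of choice the abstract describes as a dilemma between infinite regress and fairness gerrymandering.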


Assessing Annotator Identity Sensitivity via Item Response Theory: A Case Study in a Hate Speech Corpus

Pratik Sachdeva, Renata Barreto, Claudia von Vacano and Chris Kennedy

Annotators, by labeling data samples, play an essential role in the production of machine learning datasets. Their role is increasingly prevalent for more complex tasks such as hate speech or disinformation classification, where labels may be particularly subjective, as evidenced by low inter-annotator agreement statistics. When interpreting annotation instructions during labeling, annotators may exhibit systematic variation, or observable differences in labeling patterns when grouped by their self-reported demographic identities, such as race, gender, etc. As this labeling variation can play a role in the patterns machine learning algorithms learn from the data, quantifying and characterizing identity-related annotator bias is of paramount importance for fairness and accountability in machine learning. In this work, we utilize item response theory (IRT), a methodological approach developed for measurement theory, to quantify annotator bias. IRT models can be constructed to incorporate diverse factors that influence a label on a specific data sample, such as the data sample itself, the annotator, and the labeling instrument's wording and response options. An IRT model captures the contributions of these facets to the label via a latent-variable probabilistic model, thereby allowing the direct quantification of annotator bias. As a case study, we examine a hate speech corpus containing over 50,000 social media comments from Reddit, YouTube, and Twitter, rated by 10,000 annotators on 10 components of hate speech (e.g., sentiment, respect, violence, dehumanization). We leverage three different IRT techniques which are complementary in that they quantify bias from different perspectives: separated measurements, annotator-level interactions, and group-level interactions. We used these techniques to assess whether an annotator's racial identity is associated with their rating of comments that reference different racial identities. 
We find that, after controlling for the estimated hatefulness of social media comments, annotators tend to be more sensitive when rating comments targeting a group they identify with. Specifically, annotators were more likely to rate comments targeting their own racial identity as possessing elements of hate speech. For example, we find that Black annotators are about 1.4 times more likely to rate comments targeting Black identity as possessing elements of hate speech, relative to comments targeting white identity. This contrasts with white annotators, who we found to be 0.97 times as likely. Our results identify a correspondence between annotator identity and the target identity of hate speech comments, and provide a set of tools that can assess annotator identity bias in machine learning datasets at large.


At The Tensions of South and North: Critical Roles of Global South Stakeholders in AI Governance

Marie-Therese Png

This paper examines critical and cross-geographic perspectives of harm reduction strategies in AI governance. It calls for those working in AI governance, as well as relevant areas of international trade law, intellectual property, technical standards and certification, and human rights to substantively engage with elements of the Global South discourse that are in tension with the dominant discourse. It aims to present a landscape of AI governance for and from the Global South - advanced by critical and decolonial-informed practitioners and scholars - and contrast this with the dominant AI governance discourse led out of Global North institutions. By doing so, it identifies gaps in the dominant AI governance discourse around interpretations of justice, rights and geopolitical representation, bridging these gaps with separate but relevant discourses of technology and power, localisation, and reparation led by Global South aligned thinkers and actors. By contrasting these two discourses, this paper discusses what key differences and tensions might mean substantively for the building of AI governance processes. This paper opens with the growing popularity of Inclusive AI governance, introducing the paradox of participation - wherein inclusion can exist while structural harms persist. It then presents a brief digest of AI “for and from the Global South”, enumerating several critical concerns expressed by the Global South discourse, but neglected by the dominant AI discourse led by Global North institutions. These critical concerns include digital sovereignty as relevant to low and middle income countries, infrastructural and regulatory monopolies, harms associated with the labour and material supply chains of AI technologies, beta testing, and commercial exploitation. The following section argues that Global South actors play a key role in restructuring AI governance, proposing three roles of Global South actors: (1) as challenging functions to exclusionary governance mechanisms; (2) providing legitimate expertise in the interpretation and localisation of risks, which includes a whole-systems and historic view; and (3) providing a source of alternative governance mechanisms, e.g., South-South solidarity, co-governance, democratic accountability, and a political economy of resistance. The final section of this paper proposes that key differences between the Global South and dominant Global North discourses can be explained in part by historic power dynamics. Here, this paper describes the coloniality of power in AI governance, and recasts popular AI Governance frameworks, such as the Fourth Industrial Revolution, in a historic light.


Attribute Privacy: Framework and Mechanisms

Wanrong Zhang, Olga Ohrimenko and Rachel Cummings

Ensuring the privacy of training data is a growing concern since many machine learning models are trained on confidential and potentially sensitive data. Much attention has been devoted to methods for protecting individual privacy during analyses of large datasets. However, in many settings, global properties of the dataset may also be sensitive (e.g., mortality rate in a hospital rather than presence of a particular patient in the dataset). In this work, we depart from individual privacy to initiate the study of attribute privacy, where a data owner is concerned about revealing sensitive properties of a whole dataset during analysis. We propose definitions to capture attribute privacy in two relevant cases where global attributes may need to be protected: (1) properties of a specific dataset and (2) parameters of the underlying distribution from which the dataset is sampled. We also provide two efficient mechanisms for specific data distributions and one general but inefficient mechanism that satisfy attribute privacy for these settings. We base our results on a novel use of the Pufferfish framework to account for correlations across attributes in the data, thus addressing ``the challenging problem of developing Pufferfish instantiations and algorithms for general aggregate secrets'' that was left open by \cite{kifer2014pufferfish}.


Auditing for Gerrymandering by Identifying Disenfranchised Individuals

Jerry Lin, Carolyn Chen, Marc Chmielewski, Samia Zaman and Brandon Fain

Gerrymandering is the practice of drawing congressional districts to advantage or disadvantage particular electoral outcomes or population groups. We study the problem of computationally auditing a districting for evidence of gerrymandering. Our approach is novel in its emphasis on identifying individual voters disenfranchised by packing and cracking in local fine-grained geographic regions. We define a local score based on comparison with a representative sample of alternative districtings and use simulated annealing to algorithmically generate a witness districting to show that the score can be substantially reduced by simple local alterations. Unlike commonly studied metrics for gerrymandering such as proportionality and compactness, our framework is inspired by the legal context for voting rights in the United States. That is, to demonstrate illegal gerrymandering, an audit must demonstrate that individuals were harmed rather than political parties. We demonstrate the use of our framework to analyze the congressional districting of the state of North Carolina in 2016, identifying a substantial number of geographically localized disenfranchised individuals. The individuals tend disproportionately to be Democrats in the central and north-eastern parts of the state. Our simulated annealing algorithm is able to generate a witness districting with a roughly 50% reduction in the number of disenfranchised individuals, suggesting that the 2016 districting was not predetermined by North Carolina's spatial structure.


Automating Care: Online Food Delivery Work During the CoVID-19 Crisis in India

Anubha Singh and Tina Park

On March 23, 2020, the Government of India (GoI) announced one of the strictest nationwide lockdowns in the world to curb the spread of novel SARS-CoV-2. The country came to a standstill overnight and the service industry, including small businesses and restaurants, took a massive financial hit. The GoI, motivated by economic concerns and informed by the latest available scientific knowledge about the spread of the virus, legally declared certain jobs and services as “essential.” Included among these essential jobs were online food delivery workers. However, the unknown nature of the virus and its spread also deepened anxiety among the general public, quickly turning to distrust towards any “outside” contact with goods and people. In the hopes of (re)building consumer trust, food delivery platforms Zomato and Swiggy began “innovating” and providing digital solutions to exhibit care towards their customers. Referencing popular knowledge available at the time about CoVID’s symptoms and spread, some of these innovations included: (1) sharing delivery workers’ live temperatures alongside the workers’ profile inside the app; (2) mandating the use of the controversial contact tracing app Aarogya Setu for the workers; (3) monitoring workers’ usage of masks through random selfie requests; and (4) sharing specific worker vaccination details on the app for customers to view, including vaccination date and the vaccine’s serial number. Such invasive data gathering infrastructures to address public health threats have long focused on the surveillance of laborers, migrants, and the bodies of other marginalized communities. Framed as public health management, such biometric and health data gathering is treated as a necessary feature of caring for the well-being of the general public.
However, such datafication practices - ones which primarily focus on the extraction of data from one specific community in order to mollify the concerns of another - normalize the false perception that disease is transmitted unidirectionally: from worker to the consumer. By centering food delivery workers’ experiences during the pandemic, this paper looks at the delivery platforms’ practices that employ a combination of scientific knowledge and technology in a way that justifies harmful and unethical datafication practices. By examining the normalization of such surveillance in the name of care and recovery, this paper aims to examine how new regimes of care are manufactured and legitimized through an operationalization of scientific knowledge extracted from politically and economically marginalized bodies.


BCIs and human rights: Brave new rights for a brave new world

Marietjie Botes

Digital health applications include a wide range of wearable, implantable, injectable and ingestible digital medical devices. Many of these devices use machine learning algorithms to assist medical prognosis and decision-making. One of the most compelling digital medical device developments is brain-computer interfaces (BCIs), which entail connecting a person’s brain to a computer, or to another device outside the human body. BCIs allow bidirectional communication and control between the human brain and the outside world by exporting brain data or altering brain activity. Although marveled at for its clinical promise, this technological advancement also raises novel ethical, legal, social and technical implications (ELSTI). Debates in this regard center around patient autonomy, equity, trustworthiness in healthcare, data protection and security, risks of dehumanization, the limitations of machine learning-based decision-making, and the influence that BCIs have on what it means to be human and human rights. Since the adoption of the Universal Declaration of Human Rights (UDHR) after World War II, the landscape that gave rise to these human rights has evolved enormously. Human life and humans’ role in society are being transformed and threatened by technologies that were never imagined at the time the UDHR was adopted. BCIs, in particular, harbor the greatest possibility of social and individual disruption through their capability to record, interpret, manipulate, or alter brain activity, which may potentially alter what it means to be human and how we control humans in future. Cutting edge technological innovations that increasingly blur the lines between human and computer beg the rethinking and extension of existing human rights to remain relevant in a digitized world.
In this paper, sui generis human rights such as mental privacy, the right to identity or self, agency or free will, fair access to cognitive augmentation, and protection against algorithmic bias and discrimination are discussed, along with how regulatory frameworks must be established to act as technology enablers whilst ensuring fairness, accountability, and transparency in sociotechnical systems.


Behavioral Use Licensing for Responsible AI

Danish Contractor, Daniel McDuff, Julia Haines, Jenny Lee, Christopher Hines, Brent Hecht, Nicholas Vincent and Hanlin Li

With the growing reliance on artificial intelligence (AI) for many different applications, the sharing of code, data, and models is important to ensure the replicability and democratization of scientific knowledge. Many high-profile venues expect code and models to be submitted and released with papers. Furthermore, developers often want to release these assets to encourage development of technology that leverages their frameworks and services. A number of organizations have expressed concerns about inappropriate or irresponsible use of AI and have proposed ethical guidelines for the use of AI. While such guidelines can help set norms and shape policy, they are not easily enforceable. In this paper we advocate the use of licensing to enable legally enforceable behavioral use conditions on software and data and provide several case studies that demonstrate the feasibility of behavioral use licensing. We envision how licensing may be implemented in accordance with existing responsible AI guidelines. Furthermore, by using such licenses, developers provide a signal to the AI community and governmental bodies that they are taking responsibility for their technologies.


Best vs. All: Equity and Accuracy of Standardized Test Score Reporting

Mingzi Niu, Sampath Kannan, Aaron Roth and Rakesh Vohra

We study a game theoretic model of standardized testing for college admissions. Students are of two types: High and Low. There is a college that would like to admit the High type students. Students take a potentially costly standardized exam which provides a noisy signal of their type. The students come from two populations, which are identical in talent (i.e. the type distribution is the same), but differ in their access to resources: the higher resourced population can at their option take the exam multiple times, whereas the lower resourced population can only take the exam once. We study two models of score reporting, which capture existing policies used by colleges. The first policy (sometimes known as ``super-scoring'') allows students to report the max of the scores they achieve. The other policy requires that all scores be reported. We find in our model that requiring that all scores be reported results in superior outcomes in equilibrium, both from the perspective of the college (the admissions rule is more accurate), and from the perspective of equity across populations: a student's probability of admission is independent of their population, conditional on their type. In particular, the false positive rates and false negative rates are identical in this setting, across the highly and poorly resourced student populations. This is the case despite the fact that the more highly resourced students can---at their option---either report a more accurate signal of their type, or pool with the lower resourced population under this policy.


Beyond Fairness: Reparative Algorithms to Address Historical Injustices of Housing Discrimination in the US

Wonyoung So, Pranay Lohia, Rakesh Pimplikar, A.E. Hosoi and Catherine D'Ignazio

Fairness in Machine Learning (ML) has mostly focused on interrogating the fairness of a particular decision point with assumptions made that the people represented in the data have been fairly treated throughout history. However, fairness cannot be ultimately achieved if such assumptions are not valid. This is the case for mortgage lending discrimination in the US, which should be critically understood as the result of historically accumulated injustices that were enacted through public policies and private practices including redlining, racial covenants, exclusionary zoning, and predatory inclusion, among others. With the erroneous assumptions of historical fairness in ML, Black borrowers with low income and low wealth are considered as a given condition in a lending algorithm, thus rejecting loans to them would be considered a “fair” decision even though Black borrowers were historically excluded from homeownership and wealth creation. To emphasize such issues, we introduce case studies using contemporary mortgage lending data as well as historical census data in the US. First, we show that historical housing discrimination has differentiated each racial group’s baseline wealth which is a critical input for algorithmically determining mortgage loans. The second case study estimates the cost of housing reparations in the algorithmic lending context to redress historical harms because of such discriminatory housing policies. Through these case studies, we envision what reparative algorithms would look like in the context of housing discrimination in the US. This work connects to emerging scholarship on how algorithmic systems can contribute to redressing past harms through engaging with reparations policies and programs.


Bias in Automated Speaker Recognition

Wiebke Toussaint and Aaron Yi Ding

Automated speaker recognition uses data processing to identify speakers by their voices. Today, automated speaker recognition technologies are deployed on billions of smart devices for supporting services such as in call centres. Despite their wide-scale deployment and the known sources of bias as found in face recognition and natural language processing, bias in automated speaker recognition has not been systematically studied. Focusing on speaker verification, a core task in automated speaker recognition, this paper presents an in-depth empirical and analytical study of bias in the speaker verification machine learning development workflow. Drawing on an established framework for understanding sources of harm in machine learning, we show that bias exists at every development stage in the well-known VoxCeleb Speaker Recognition Challenge, including model building, implementation, and data generation. Most affected are female speakers and non-US nationalities, who experience significant performance degradation. Leveraging the insights from our findings, we make practical recommendations for mitigating bias in automated speaker recognition, and outline future research directions.


Can Machines Help Us Answering Question 16 in Datasheets, and in turn Reflecting on Inappropriate Content?

Patrick Schramowski, Christopher Tauchmann and Kristian Kersting

Large datasets underlying much of current machine learning raise serious issues concerning inappropriate content that is offensive, insulting, or threatening, or that might otherwise cause anxiety. This calls for increased dataset documentation, e.g., using datasheets. Among other topics, datasheets encourage reflection on the composition of the datasets. So far, however, this documentation is done manually and can therefore be tedious and error-prone, especially for large image datasets. Here we ask the arguably "circular" question of whether a machine can help us reflect on inappropriate content, answering Question 16 in Datasheets. To this end, we propose to use the information stored in pre-trained transformer models to assist us in the documentation process. Specifically, prompt-tuning based on a dataset of socio-moral values steers CLIP to identify potentially inappropriate content, therefore reducing human labor. We then document the inappropriate images found using word clouds, based on captions generated using a vision-language model. The documentations of two popular, large-scale computer vision datasets---ImageNet and OpenImages---produced this way suggest that machines can indeed help dataset creators to answer Question 16 on inappropriate image content.


A Framework for Deprecating Datasets: Standardizing Documentation, Identification, and Communication

Alexandra Sasha Luccioni, Frances Corry, Hamsini Sridharan, Mike Ananny, Jason Schultz and Kate Crawford

Datasets are central to training machine learning (ML) models. The ML community has recently made significant improvements to data stewardship and documentation practices across the model development life cycle. However, the act of deprecating, or deleting, datasets has been largely overlooked, and there are currently no standardized approaches for structuring this stage of the dataset life cycle. In this paper, we study the practice of dataset deprecation in ML, identify several cases of datasets that continued to circulate despite having been deprecated, and describe the different technical, legal, ethical, and organizational issues raised by such continuations. We then propose a Dataset Deprecation Framework that includes considerations of risk, mitigation of impact, appeal mechanisms, timeline, post-deprecation protocols, and publication checks that can be adapted and implemented by the ML community. Finally, we propose creating a centralized, sustainable repository system for archiving datasets, tracking dataset modifications or deprecations, and facilitating practices of care and stewardship that can be integrated into research and publication processes.


Causal Inference Struggles with Agency on Online Platforms

Smitha Milli, Luca Belli and Moritz Hardt

Online platforms regularly conduct randomized experiments to understand how changes to the platform causally affect various outcomes of interest. However, experimentation on online platforms has been criticized for having, among other issues, a lack of meaningful oversight and user consent. As platforms give users greater agency, it becomes possible to conduct observational studies in which users self-select into the treatment of interest as an alternative to experiments in which the platform controls whether the user receives treatment or not. In this paper, we conduct four large-scale within-study comparisons on Twitter aimed at assessing the effectiveness of observational studies derived from user self-selection on online platforms. In a within-study comparison, treatment effects from an observational study are assessed based on how effectively they replicate results from a randomized experiment with the same target population. We test the naive difference in group means estimator, exact matching, regression adjustment, and inverse probability of treatment weighting while controlling for plausible confounding variables. In all cases, all observational estimates perform poorly at recovering the ground-truth estimate from the analogous randomized experiments. In all cases except one, the observational estimates have the opposite sign of the randomized estimate. Our results suggest that observational studies derived from user self-selection are a poor alternative to randomized experimentation on online platforms. In discussing our results, we postulate "Catch-22"s that suggest that the success of causal inference in these settings may be at odds with the original motivations for providing users with greater agency.
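
Two of the estimators named in this abstract, the naive difference in group means and inverse probability of treatment weighting (IPW), can be sketched in a few lines of Python. The data below are a hand-built toy with a single, fully observed binary confounder `z`, and all names are assumptions. The toy therefore shows how the estimators are computed, not the paper's finding: in the Twitter comparisons, adjustment did not recover the experimental estimates.

```python
def naive_difference(rows):
    """Difference in mean outcome between treated and untreated units."""
    treated = [y for t, _, y in rows if t == 1]
    control = [y for t, _, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def ipw_estimate(rows):
    """IPW estimate of the average treatment effect.

    The propensity score is estimated as the treated share within each
    stratum of the confounder z, and must lie strictly between 0 and 1.
    """
    n = len(rows)
    propensity = {}
    for z in {z for _, z, _ in rows}:
        stratum = [t for t, zz, _ in rows if zz == z]
        propensity[z] = sum(stratum) / len(stratum)
    treated_term = sum(t * y / propensity[z] for t, z, y in rows)
    control_term = sum((1 - t) * y / (1 - propensity[z]) for t, z, y in rows)
    return (treated_term - control_term) / n


# Toy rows (treatment t, confounder z, outcome y) with y = 1*t + 2*z,
# so the true effect is 1. Units with z = 1 self-select into treatment more often.
rows = ([(1, 1, 3)] * 3 + [(0, 1, 2)] * 1 +
        [(1, 0, 1)] * 1 + [(0, 0, 0)] * 3)

naive = naive_difference(rows)
adjusted = ipw_estimate(rows)
```

Here the naive estimate is inflated by self-selection, while IPW recovers the true effect because `z` is the only confounder and is observed. When the relevant confounders are unobserved, as the abstract's results suggest is common with user self-selection, no such correction is available.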


Characterizing Properties and Trade-offs of Centralized Delegation Mechanisms in Liquid Democracy

Brian Brubach, Audrey Ballarin and Heeba Nazeer

Liquid democracy is a form of transitive delegative democracy that has received a flurry of scholarly attention from the computer science community in recent years. In its simplest form, every agent starts with one vote and may have other votes assigned to them via delegation from other agents. They can choose to delegate all votes assigned to them to another agent or vote directly with all votes assigned to them. However, many proposed realizations of liquid democracy allow for agents to express their delegation/voting preferences in more complex ways (e.g., a ranked list of potential delegates) and employ a centralized delegation mechanism to compute the final vote tally. In doing so, centralized delegation mechanisms can make decisions that affect the outcome of a vote and where/whether agents are able to delegate their votes. Much of the analysis thus far has focused on the ability of these mechanisms to make a correct choice. We extend this analysis by introducing and formalizing other important properties of a centralized delegation mechanism in liquid democracy with respect to crucial features such as accountability, transparency, explainability, fairness, and user agency. In addition, we evaluate existing methods in terms of these properties, show how some prior work can be augmented to achieve desirable properties, prove impossibility results for achieving certain sets of properties simultaneously, and highlight directions for future work.
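
The simplest form of liquid democracy described in this abstract, in which each agent either votes directly or delegates all of their votes transitively, can be sketched as follows. This is an illustrative Python sketch, not a mechanism from the paper; the function name, agent names, and the rule that votes caught in a delegation cycle or dead end are discarded are assumptions, and that rule is precisely the kind of decision a centralized mechanism has to make.

```python
from collections import Counter


def tally(votes, delegations):
    """Resolve transitive delegations and count votes.

    votes:        agent -> choice, for agents who vote directly
    delegations:  agent -> delegate, for agents who delegate
    Returns (weight per direct voter, total per choice).
    """
    def resolve(agent):
        seen = set()
        while agent in delegations:
            if agent in seen:      # delegation cycle: vote is discarded (assumption)
                return None
            seen.add(agent)
            agent = delegations[agent]
        return agent if agent in votes else None  # dead end: also discarded

    weights = Counter()
    for agent in set(votes) | set(delegations):
        final = resolve(agent)
        if final is not None:
            weights[final] += 1

    totals = Counter()
    for voter, weight in weights.items():
        totals[votes[voter]] += weight
    return dict(weights), dict(totals)


# Hypothetical electorate: dan -> cat -> ann, eve -> bob, and fay <-> gus form a cycle.
votes = {"ann": "yes", "bob": "no"}
delegations = {"cat": "ann", "dan": "cat", "eve": "bob", "fay": "gus", "gus": "fay"}
weights, totals = tally(votes, delegations)
```

With these inputs, `ann` carries three votes and `bob` two, and the two cyclic votes are lost. A mechanism that accepts ranked lists of delegates could instead reroute such votes, which changes both the outcome and the agency properties the abstract discusses.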


CounterFAccTual: How FAccT Undermines Its Organizing Principles

Ben Gansky and Sean McDonald

This essay joins recent scholarship in arguing that FAccT’s fundamental framing of the potential to achieve the normative conditions for justice through bettering the design of algorithmic systems is counterproductive to achieving said justice in practice. Insofar as the FAccT community’s research tends to prioritize design-stage interventions, it ignores the fact that the majority of the contextual factors that practically determine FAccT outcomes happen in the implementation and impact stages of AI/ML lifecycles. We analyze an emergent and widely-cited movement within the FAccT community for attempting to honor the centrality of contextual factors in shaping social outcomes, a set of strategies we term ‘metadata maximalism’. Symptomatic of design-centered approaches, metadata maximalism abstracts away its reliance on institutions and structures of justice that are, by every observable metric, already struggling (where not failing) to provide accessible, enforceable rights. These justice infrastructures, moreover, are currently wildly under-equipped to manage the disputes arising from digital transformation and machine learning. The political economy of AI/ML implementation provides further obstructions to realizing rights. Data and software supply chains, in tandem with intellectual property protections, introduce structural sources of opacity. Where duties of care to vulnerable persons should reign, profit incentives are given legal and regulatory primacy. Errors are inevitable and inextricable from the development of machine learning systems. In the face of these realities, FAccT programs, including metadata maximalism, tend to project their efforts in a fundamentally counter-factual universe: one in which functioning institutions and processes for due diligence in implementation and for redress of harms are working and ready to interoperate with. 
Unfortunately, in our world, these institutions and processes have been captured by the interests they are meant to hold accountable, intentionally hollowed-out, and/or were never designed to function in today’s sociotechnical landscape. Continuing to produce (fair! accountable! transparent!) data-enabled systems that operate in high-impact areas, irrespective of this landscape’s radically insufficient paths to justice, given the unavoidability of errors and/or intentional misuse in implementation, and the exhaustively-demonstrated disproportionate distribution of resulting harms onto already-marginalized communities, is a choice - a choice to be CounterFAccTual.


Counterfactual Shapley Additive Explanations

Emanuele Albini, Jason Long, Danial Dervovic and Daniele Magazzeni

Feature attributions are a common paradigm for model explanations due to their simplicity in assigning a single numeric score for each input feature to a model. In the actionable recourse setting, wherein the goal of the explanations is to improve outcomes for model consumers, it is often unclear how feature attributions should be correctly used. With this work, we aim to strengthen and clarify the link between actionable recourse and feature attributions. Concretely, we propose a variant of SHAP, Counterfactual SHAP (CF-SHAP), that incorporates counterfactual information to produce a background dataset for use within the marginal (a.k.a. interventional) Shapley value framework. We motivate the need within the actionable recourse setting for careful consideration of background datasets when using Shapley values for feature attributions with numerous synthetic examples. Moreover, we demonstrate the efficacy of CF-SHAP by proposing and justifying a quantitative score for feature attributions, counterfactual-ability, showing that as measured by this metric, CF-SHAP is superior to existing methods when evaluated on public datasets using tree ensembles.
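
The dependence of marginal (interventional) Shapley values on the background dataset, which motivates this work, can be seen in a small exact computation. The Python sketch below is not CF-SHAP itself: it enumerates all feature orderings for a toy linear model, and the function name, the model, and the hand-picked background points (including the single "counterfactual" point) are assumptions for illustration.

```python
from itertools import permutations
from math import factorial


def marginal_shapley(f, x, background):
    """Exact marginal Shapley values of f at x, by enumerating feature orderings.

    Features outside the coalition are replaced by values from each background
    point, and the model output is averaged over the background.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for b in background:
            z = [x[i] if i in coalition else b[i] for i in range(n)]
            total += f(z)
        return total / len(background)

    phi = [0.0] * n
    for order in permutations(range(n)):
        present = set()
        for i in order:
            before = value(present)
            present.add(i)
            phi[i] += value(present) - before
    return [p / factorial(n) for p in phi]


f = lambda z: 2 * z[0] + 3 * z[1]   # toy model (assumed)
x = [1.0, 1.0]                      # the input being explained

# Background 1: a generic sample whose feature means equal x.
phi_generic = marginal_shapley(f, x, [[0.0, 0.0], [2.0, 2.0]])
# Background 2: one hand-picked point that differs from x only in feature 0.
phi_counterfactual = marginal_shapley(f, x, [[0.0, 1.0]])
```

With the generic background both attributions vanish, whereas the hand-picked point puts all attribution on the one feature that separates it from `x`. Exhaustive enumeration is only feasible for a handful of features; practical tools approximate it or exploit model structure.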


Evaluation Gaps in Machine Learning Practice

Ben Hutchinson, Negar Rostamzadeh, Christina Greer, Katherine Heller and Vinodkumar Prabhakaran

Forming a reliable judgement of a machine learning (ML) model’s appropriateness for an application ecosystem is critical for its responsible use, and requires considering a broad range of factors including harms, benefits, and responsibilities. In practice, however, evaluations of ML models frequently focus on only a narrow range of decontextualized predictive behaviours. We examine the evaluation gaps between the idealized breadth of evaluation concerns and the observed narrow focus of actual evaluations. Through an empirical study of papers from recent high-profile conferences in the Computer Vision and Natural Language Processing communities, we demonstrate a general focus on a handful of evaluation methods. By considering the metrics and test data distributions used in these methods, we draw attention to which properties of models are centered in the field, revealing the properties that are frequently neglected or sidelined during evaluation. By studying these properties, we demonstrate the machine learning discipline’s implicit assumption of a range of commitments which have normative impacts; these include commitments to consequentialism, abstractability from context, the quantifiability of impacts, the limited role of model inputs in evaluation, and the equivalence of different failure modes. Shedding light on these assumptions enables us to question their appropriateness for ML system contexts, pointing the way towards more contextualized evaluation methodologies for robustly examining the trustworthiness of ML models.


Critical Tools for Machine Learning: Working with Intersectional Critical Concepts in Machine Learning Systems Design

Goda Klumbyte, Claude Draude and Alex Taylor

This paper investigates how intersectional critical theoretical concepts from social sciences and humanities research can be worked with in machine learning systems design. It does so by presenting a case study of a series of speculative design workshops, conducted in 2021, that were driven by such critical theoretical concepts. These workshops drew on critical intersectional feminist methodologies to construct interdisciplinary interventions in the design of machine learning systems, towards more inclusive, accountable, and contextualized systems design. The concepts of “situating/situated knowledges”, “figuration”, “diffraction” and “critical fabulation/speculation” were taken up as theoretical and methodological tools for concept-led design workshops. This paper presents the design and results of the workshops and highlights tensions and possibilities with regard to interdisciplinary machine learning systems design towards more inclusive, contextualized, and accountable systems. It discusses the role that critical theoretical concepts can play in a design process and shows how such concepts can work as methodological tools that nonetheless require an open-ended experimental space to function. It presents insights and discussion points regarding what it means to work with critical intersectional knowledge that is inextricably connected to its historical and socio-political roots, and how this re-figures what it might mean to design fair and accountable systems.


CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation

Razvan Amironesei, Dylan Baker, Emily Denton, Mark Díaz, Ian Kivlichan, Vinodkumar Prabhakaran and Rachel Rosen

Human annotations play a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into building ML datasets have not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights, and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms, and what that relationship affords them. Finally, we introduce a novel framework, CrowdWorkSheets, for dataset developers to facilitate transparent documentation of key decision points at various stages of the ML data pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset release and maintenance.


Data augmentation for fairness-aware machine learning: Preventing algorithmic bias in law enforcement systems

Ioannis Pastaltzidis, Nikolaos Dimitriou, Katherine Quezada-Tavárez, Stergios Aidinlis, Thomas Marquenie, Agata Gurzawska and Dimitrios Tzovaras

Researchers and practitioners in the fairness community have highlighted the ethical and legal challenges of using biased datasets in data-driven systems, with algorithmic bias being a major concern. Despite the rapidly growing body of literature on fairness in algorithmic decision-making, there remains a paucity of fairness scholarship on machine learning algorithms for the real-time detection of crime. This contribution presents an approach for fairness-aware machine learning to mitigate the algorithmic bias / discrimination issues posed by the reliance on biased data when building law enforcement technology. Our analysis is based on RWF-2000, which has served as the basis for violent activity recognition tasks in data-driven law enforcement projects. We reveal issues of overrepresentation of minority subjects in violence situations that limit the external validity of the dataset for real-time crime detection systems and propose data augmentation techniques to rebalance the dataset. The experiments on real world data show the potential to create more balanced datasets by synthetically generated samples, thus mitigating bias and discrimination concerns in law enforcement applications.


Data Cards: Purposeful and Transparent Documentation for Responsible AI

Mahima Pushkarna, Andrew Zaldivar and Oddur Kjartansson

As we move towards large-scale models capable of numerous downstream tasks, the complexity of understanding multi-modal datasets that give nuance to models rapidly increases. A clear and thorough understanding of a dataset's origins, development, intent, ethical considerations and evolution becomes a necessary step for the responsible and informed deployment of models, especially those in people-facing contexts and high-risk domains. However, the burden of this understanding often falls on the intelligibility, conciseness, and comprehensiveness of its often inadequate documentation. Moreover, consistency and comparability across documentation of all datasets involved suggests that it should be treated as a user-centric product in and of itself. In this paper, we propose Data Cards for encouraging transparent, purposeful and human-centered documentation of datasets within the practical contexts of industry and research. Data Cards are structured summaries of essential facts about various aspects of machine learning datasets needed by stakeholders across a dataset's lifecycle for responsible artificial intelligence development. These summaries provide explanations of processes and rationales that shape the data and consequently the models—such as upstream sources, data collection and annotation methods; training and evaluation methods, intended use, or decisions affecting model performance. We also present frameworks that ground Data Cards in real-world utility and human-centricity. Using two case studies, we report on desirable characteristics that support adoption across domains, organizational structures, and audience groups. Finally, we present lessons learned from deploying over 20 Data Cards.


Data Governance in the Age of Large-Scale Data-Driven Language Technology

Yacine Jernite, Huu Nguyen, Stella Biderman, Anna Rogers, Maraim Masoud, Valentin Danchev, Samson Tan, Alexandra Sasha Luccioni, Nishant Subramani, Isaac Johnson, Gérard Dupont, Jesse Dodge, Kyle Lo, Zeerak Talat, Dragomir Radev, Aaron Gokaslan, Somaieh Nikpoor, Peter Henderson, Rishi Bommasani and Margaret Mitchell

The recent emergence and adoption of machine learning technology, and specifically of Large Language Models, has drawn attention to the need for systematic and transparent management of language data. This work proposes an approach to global language data governance that attempts to organize data management amongst stakeholders, values, and rights. Our proposal is informed by prior work on distributed governance that accounts for human values and grounded by an international research collaboration that brings together researchers and practitioners from 60 countries. The framework we present is a multi-party international governance structure focused on language data, and incorporating technical and organizational tools needed to support its work.


De-biasing "bias" measurement

Kristian Lum, Yunfeng Zhang and Amanda Bower

When a model's performance differs across socially or culturally relevant groups--like race, gender, or the intersections of many such groups--it is often called "biased." While much of the work in algorithmic fairness over the last several years has focused on developing various definitions of model fairness (the absence of model "bias") and eliminating such "bias," much less work has gone into rigorously measuring it. In practice, it is important to have high-quality, human-digestible measures of model performance disparities and associated uncertainty quantification about them that can serve as inputs into multi-faceted decision-making processes. In this paper, we show both mathematically and through simulation that many of the metrics used to measure "bias" are themselves statistically biased estimators of the underlying quantities they purport to represent. We argue that this can cause misleading conclusions about the relative "bias" along different sensitive variables, especially in cases where some sensitive variables consist of categories with few members. We propose the "double-corrected" variance estimator, which provides unbiased estimates and uncertainty quantification of variance of model performance using the bootstrap. It is conceptually simple and easily implementable without a statistical software package or numerical optimization. We demonstrate the utility of this approach through simulation and show on a real dataset that while biased estimators of model "bias" indicate statistically significant between-group model performance disparities, when appropriately accounting for statistical bias in the estimator, the estimated model bias is no longer statistically significant.
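The claim that a disparity metric can itself be a statistically biased estimator can be illustrated with a small simulation. The sketch below is not the paper's double-corrected bootstrap estimator. It assumes a simpler setting in which every group has the same true accuracy, so the true between-group variance is zero, and it applies a single analytic correction that subtracts the average estimated sampling variance of the group accuracies. The function name `simulate` and all parameter values are hypothetical.

```python
import random
import statistics


def simulate(trials=2000, groups=5, n=10, p=0.8, seed=0):
    """Average naive and corrected between-group variance over many trials.

    Every group has the same true accuracy `p`, so any variance the naive
    estimator reports comes from sampling noise alone.
    """
    rng = random.Random(seed)
    naive, corrected = [], []
    for _ in range(trials):
        # Observed accuracy of each group on n test examples
        acc = [sum(rng.random() < p for _ in range(n)) / n for _ in range(groups)]
        # Naive estimate: sample variance of the observed group accuracies
        s2 = statistics.variance(acc)
        # a * (1 - a) / (n - 1) is an unbiased estimate of the sampling
        # variance p * (1 - p) / n of one group's observed accuracy
        noise = statistics.fmean([a * (1 - a) / (n - 1) for a in acc])
        naive.append(s2)
        corrected.append(s2 - noise)
    return statistics.fmean(naive), statistics.fmean(corrected)


naive_mean, corrected_mean = simulate()
print(f"naive: {naive_mean:.4f}  corrected: {corrected_mean:.4f}")
```

In this setting the naive estimate averages well above zero even though no group differs from any other, and the inflation grows as groups get smaller, which matches the abstract's concern about categories with few members.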


Decision Time: Normative Dimensions of Algorithmic Speed

Daniel Susser

Existing discussions about automated decision-making focus primarily on its inputs and outputs, raising questions about data collection and privacy on one hand and accuracy and fairness on the other. Less attention has been devoted to critically examining the temporality of decision-making processes—the speed at which automated decisions are reached. In this paper, I identify four dimensions of algorithmic speed that merit closer analysis. Duration (how much time it takes to reach a judgment), timing (when automated systems intervene in the activity being evaluated), frequency (how often evaluations are performed), and lived time (the human experience of algorithmic speed) are interrelated, but distinct, features of automated decision-making. Choices about the temporal structure of automated decision-making systems have normative implications, which I describe in terms of "disruption," "displacement," "re-calibration," and "temporal fairness." Values like accuracy, fairness, accountability, and legitimacy hang in the balance. As computational tools are increasingly tasked with making judgments about human activities and practices, the designers of decision-making systems will have to reckon, I argue, with when—and how fast—judgments ought to be rendered. Though computers are capable of reaching decisions at incredible speeds, failing to account for the temporality of automated decision-making risks misapprehending the costs and benefits automation promises.


Demographic-Reliant Algorithmic Fairness: Characterizing the Risks of Demographic Data Collection in the Pursuit of Fairness

Mckane Andrus and Sarah Villeneuve

Most proposed algorithmic fairness techniques require access to data on a “sensitive attribute” or “protected category” (such as race, ethnicity, gender, or sexuality) in order to make performance comparisons and standardizations across groups; however, this data is largely unavailable in practice, hindering the widespread adoption of algorithmic fairness. Prior work has highlighted a number of organizational, legal, and privacy risks associated with the collection and use of demographic data which constrain organizations' algorithmic fairness efforts. To date most organizations have sought to mitigate the legal and privacy risks behind collecting demographic data and conducting fairness assessments through methods such as privacy preserving data processing. However, efforts to enable algorithmic fairness are done without consideration of broader social impacts. Through this paper, we challenge the core assumption of demographic-based algorithmic fairness techniques which posits that discrimination can be overcome with smart enough technical methods and sufficient data. As a result, these techniques largely ignore broader questions of data governance and systemic oppression when categorizing individuals for the purpose of fairer algorithmic processing. In this work, we explore under what conditions demographic data should be collected and used to enable algorithmic fairness methods by characterizing a range of social risks to individuals and communities. For the risks to individuals we consider the unique privacy risks associated with the sharing of sensitive attributes likely to be the target of fairness analysis, the possible harms stemming from miscategorizing and misrepresenting individuals in the data collection process, and the use of sensitive data beyond data subjects' expectations. 
Looking more broadly, the risks to entire groups and communities include the expansion of surveillance infrastructure in the name of fairness, misrepresenting and mischaracterizing what it means to be part of a demographic group or to hold a certain identity, and ceding the ability to define for themselves what constitutes biased or unfair treatment. We argue that, by confronting these questions before and during the collection of demographic data, algorithmic fairness methods are more likely to actually mitigate harmful treatment disparities without reinforcing systems of oppression.


Designing Up with Value-Sensitive Design: Building a Field Guide for Ethical Machine Learning Development

Karen Boyd

If "studying up," or researching powerful actors in a social system, can offer insight into the workings and effects of power in social systems, this paper argues that "designing up" will give researchers and designers a tool to intervene. This paper offers a conception of "designing up," applies the structure of Value Sensitive Design (VSD) to accomplish it, and submits an example of a tool designed to support ethical sensitivity, especially particularization and judgment. The designed artifact is a field guide for ethical mitigation strategies that uses tool profiles and filters to aid machine learning (ML) engineers as they build understanding of an ethical issue they have recognized and as they match the particulars of their problem to a technical ethical mitigation. This guide may broaden its users' awareness of potential ethical issues, important features of ethical issues and their mitigations, and the breadth of available mitigations. Additionally, it may encourage ethical sensitivity in future ML projects. Feedback from ML engineers and technology ethics researchers rendered several usability improvements and ideas for future development.


Disclosure by Design: Designing information disclosures to support meaningful transparency and accountability

Chris Norval, Kristin Cornelius, Jennifer Cobbe and Jatinder Singh

There is a strong push for organisations to become more transparent and accountable for their undertakings. Towards this, various transparency regimes oblige organisations to disclose certain information to relevant stakeholders (individuals, regulators, etc). This information intends to empower and support the monitoring, oversight, scrutiny and challenge of organisational practices. Importantly, however, these disclosures are of limited benefit if they are not meaningful for their recipients. Yet, in practice, the disclosures of tech/data-driven organisations are often highly technical, fragmented, and therefore of limited utility to all but experts. This undermines their effectiveness, works to disempower, and ultimately hinders broader transparency aims. This paper argues for a paradigm shift towards reconceptualising disclosures as 'interfaces', designed for the needs, expectations and requirements of the recipients they serve to inform. In making this case, and to provide a practical way forward, we demonstrate one potential methodology for specifying, designing, and deploying more effective information disclosures. Focusing on data protection disclosures, we illustrate and explore how designing disclosures as interfaces can better support greater oversight of organisational data and practices, and thus better align with broader transparency and accountability aims.


Disentangling Research Ethics in Machine Learning

Carolyn Ashurst, Solon Barocas, Rosie Campbell and Inioluwa Deborah Raji

Machine learning (ML) has rapidly evolved from a niche academic field to a ubiquitous technology with significant impacts on individuals, groups, wider society and the environment. In growing recognition of the catalogue of harms that can result from both deployed systems and research outputs, the ML research community has had to re-evaluate what it means to do ML research ethically and responsibly. However, there remains much confusion around what ethics means in this research context. For some, it is about duty of care to research subjects. For others, it is about research best practice (such as reproducibility), or anticipation of potential harms from future deployed systems. In this piece, we aim to disentangle the different components of ML research ethics. By doing so, we hope to enable more effective discussion of how the research community should respond to its new responsibilities, and to enable those developing governance mechanisms to more clearly articulate which components of research ethics are targeted by the mechanism.


Don't let Ricci v. DeStefano Hold You Back: A Bias-Aware Legal Solution to the Hiring Paradox

Jad Salem, Deven Desai and Swati Gupta

Companies that try to address inequality in employment face a hiring paradox. Failing to address workforce imbalance can result in legal sanctions and scrutiny, but proactive measures to address these issues might result in the same legal conflict. Recent run-ins of Microsoft and Wells Fargo with the Labor Department's Office of Federal Contract Compliance Programs (OFCCP) are not isolated and are likely to persist. To add to the confusion, existing scholarship on Ricci v. DeStefano often deems solutions to this paradox impossible. Circumventive practices such as the 4/5ths rule further illustrate tensions between too little action and too much action. In this work, we give a powerful way to solve this hiring paradox that tracks both legal and algorithmic challenges. We unpack the nuances of Ricci v. DeStefano and extend the legal literature arguing that certain algorithmic approaches to employment are allowed by introducing the legal practice of banding to evaluate candidates. We thus show that a bias-aware technique can be used to diagnose and mitigate "built-in" headwinds in the employment pipeline. We use the machinery of partially ordered sets to handle the presence of uncertainty in evaluations data. This approach allows us to move away from treating "people as numbers" to treating people as individuals, a property that is sought after by Title VII in the context of employment.
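For readers unfamiliar with the 4/5ths rule mentioned above, the following is a minimal sketch of the usual calculation: each group's selection rate is compared with the highest group's rate, and a ratio below 0.8 is treated as an indication of adverse impact. The function name `four_fifths_check`, the group labels, and the applicant counts are hypothetical, and the sketch does not represent the banding approach the paper proposes.

```python
def four_fifths_check(selection_rates, threshold=0.8):
    """Compare each group's selection rate with the highest group's rate.

    Returns {group: (impact_ratio, passes)} where `passes` is True when the
    ratio is at least `threshold` (0.8 for the four-fifths rule).
    """
    best = max(selection_rates.values())
    if best == 0:
        raise ValueError("no group has a positive selection rate")
    return {
        group: (rate / best, rate / best >= threshold)
        for group, rate in selection_rates.items()
    }


# Hypothetical hiring data: selected / applicants per group
rates = {
    "group_a": 48 / 80,  # selection rate 0.6
    "group_b": 12 / 40,  # selection rate 0.3
}

for group, (ratio, passes) in four_fifths_check(rates).items():
    print(group, round(ratio, 2), "ok" if passes else "flagged")
```

Here group_b's rate is half of group_a's, so it falls below the four-fifths threshold and is flagged.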


Don’t Throw it Away! The Utility of Unlabeled Data in Fair Decision Making

Miriam Rateike, Ayan Majumdar, Olga Mineeva, Krishna P. Gummadi and Isabel Valera

Decision making algorithms, in practice, are often trained on data that exhibits a variety of biases. Decision-makers often aim to take decisions based on some ground-truth target that is assumed or expected to be unbiased, i.e., equally distributed across socially salient groups. In many practical settings, the ground-truth cannot be directly observed, and instead, we have to rely on a biased proxy measure of the ground-truth, i.e., biased labels, in the data. In addition, data is often selectively labeled, i.e., even the biased labels are only observed for a small fraction of the data that received a positive decision. To overcome label and selection biases, recent work proposes to learn stochastic, exploring decision policies via i) online training of new policies at each time-step and ii) enforcing fairness as a constraint on performance. However, the existing approach uses only labeled data, disregarding a large amount of unlabeled data, and thereby suffers from high instability and variance in the learned decision policies at different times. In this paper, we propose a novel method based on a variational autoencoder for practical fair decision-making. Our method learns an unbiased data representation leveraging both labeled and unlabeled data and uses the representations to learn a policy in an online process. Using synthetic data, we empirically validate that our method converges to the optimal (fair) policy according to the ground-truth with low variance. In real-world experiments, we further show that our training approach not only offers a more stable learning process but also yields policies with higher fairness as well as utility than previous approaches.


DualCF: Efficient Model Extraction Attack from Counterfactual Explanations

Yongjie Wang, Hangwei Qian and Chunyan Miao

Cloud service providers have launched Machine-Learning-as-a-Service (MLaaS) platforms to allow users to access large-scale cloud-based models via APIs. In addition to prediction outputs, these APIs can also provide other information in a more human-understandable way, such as counterfactual explanations (CF). However, such extra information inevitably causes the cloud models to be more vulnerable to extraction attacks that aim to steal the internal functionality of models in the cloud. Due to the black-box nature of cloud models, however, a vast number of queries are inevitably required by existing attack strategies before the substitute model achieves high fidelity. In this paper, we propose a novel, simple yet efficient querying strategy to greatly enhance the querying efficiency. This is motivated by our observation that current querying strategies suffer from a decision boundary shift issue induced by taking far-distant queries and close-to-boundary CFs into substitute model training. We then propose the DualCF strategy to circumvent the above issues, which is achieved by taking not only the CF but also the counterfactual explanation of the CF (CCF) as pairs of training samples for the substitute model. Extensive and comprehensive experimental evaluations are conducted on both synthetic and real-world datasets. The experimental results show that DualCF can produce a high-fidelity model with fewer queries.


Dynamic Privacy Budget Allocation Improves Data Efficiency of Differentially Private Gradient Descent

Junyuan Hong, Zhangyang Wang and Jiayu Zhou

Protecting privacy in learning while maintaining the model performance has become increasingly critical in many applications that involve sensitive data. A popular private learning framework is differentially private learning, composed of many gradient iterations privatized by noising and clipping. Under the privacy constraint, it has been shown that dynamic policies can improve the final iterate loss, namely the quality of published models. In this talk, we will introduce these dynamic techniques for learning rate, batch size, noise magnitude and gradient clipping. We will also discuss how a dynamic policy can change the convergence bounds, which provides further insight into the impact of dynamic methods.
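The "noising and clipping" step described above can be sketched as follows. This is a generic illustration of one privatized gradient computation, not the authors' method: the function names, the geometric schedule in `decayed_noise_std`, and all numeric values are assumptions, and no privacy accounting is performed.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale a per-example gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, clip_norm, noise_std, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average.

    The noise standard deviation is noise_std * clip_norm, so it is
    expressed relative to the clipping bound.
    """
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[k] for g in clipped) for k in range(dim)]
    noisy = [s + rng.gauss(0.0, noise_std * clip_norm) for s in summed]
    return [v / len(clipped) for v in noisy]


def decayed_noise_std(step, initial_std, decay):
    """Hypothetical dynamic policy: noise scale changes geometrically per step."""
    return initial_std * decay ** step


rng = random.Random(0)
grads = [[3.0, 4.0], [0.6, 0.8]]  # toy per-example gradients
for step in range(3):
    std = decayed_noise_std(step, initial_std=1.0, decay=0.5)
    print(step, std, private_mean_gradient(grads, clip_norm=1.0, noise_std=std, rng=rng))
```

A real dynamic policy would have to spend the privacy budget consistently across iterations; the schedule here only shows where such a policy plugs into the update.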


Enforcing Group Fairness in Algorithmic Decision Making: Utility Maximization Under Sufficiency

Joachim Baumann, Anikó Hannák and Christoph Heitz

Binary decision making classifiers are not fair by default. Fairness requirements are an additional element to the decision making rationale, which is typically driven by maximizing some utility function. In that sense, algorithmic fairness can be formulated as a constrained optimization problem. This paper contributes to the discussion on how to implement fairness, focusing on the fairness concepts of positive predictive value (PPV) parity, false omission rate (FOR) parity, and sufficiency (which combines the former two). We show that group-specific threshold rules are optimal for PPV parity and FOR parity, similar to well-known results for other group fairness criteria. However, depending on the underlying population distributions and the utility function, we find that sometimes an upper-bound threshold rule for one group is optimal: utility maximization under PPV parity (or FOR parity) might thus lead to selecting the individuals with the smallest utility for one group, instead of selecting the most promising individuals. This result is counter-intuitive and in contrast to the analogous solutions for statistical parity and equality of opportunity. We also provide a solution for the optimal decision rules satisfying the fairness constraint sufficiency. We show that more complex decision rules are required and that this leads to within-group unfairness for all but one of the groups. We illustrate our findings based on simulated and real data.
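The two quantities behind the sufficiency criterion can be computed directly from a confusion matrix per group. The sketch below uses the standard definitions, PPV = TP / (TP + FP) and FOR = FN / (FN + TN); the function names, group labels, and toy data are hypothetical, and the sketch only measures the rates rather than deriving the optimal decision rules studied in the paper.

```python
def ppv_and_for(y_true, y_pred):
    """Positive predictive value and false omission rate for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    false_omission = fn / (fn + tn) if fn + tn else float("nan")
    return ppv, false_omission


def rates_by_group(y_true, y_pred, groups):
    """Compute (PPV, FOR) separately for every group label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, label in enumerate(groups) if label == g]
        result[g] = ppv_and_for([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return result


# Toy data: four individuals in each of two groups
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

for g, (ppv, f_o_r) in rates_by_group(y_true, y_pred, groups).items():
    print(g, round(ppv, 3), round(f_o_r, 3))
```

PPV parity holds when the first number matches across groups, FOR parity when the second does, and sufficiency requires both at once. In this toy data neither holds.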


Equi-explanation Maps: Concise and Informative Global Summary Explanations

Tanya Chowdhury, Razieh Rahimi and James Allan

In this work, we propose to summarize the model logic of a blackbox in order to generate concise and informative global explanations. We propose equi-explanation maps, a new explanation data-structure that presents the region of interest as a union of equi-explanation subspaces along with their explanation vectors. We then propose E-Map, a method to generate equi-explanation maps. We demonstrate the broad utility of our approach by generating equi-explanation maps for various binary classification models (Logistic Regression, SVM, MLP, and XGBoost) on the UCI Heart disease dataset and the Pima Indians diabetes dataset. Each subspace in our generated map is the union of $d$-dimensional hyper-cuboids which can be compactly represented for the sake of interpretability. For each of these subspaces we present linear explanations assigning weights to each explanation feature. We justify the use of equi-explanation maps in comparison to other global explanation methods by evaluating in terms of interpretability, fidelity, and informativeness. A user study further corroborates the use of equi-explanation maps to generate compact and informative global explanations.


Equitable Public Bus Network Optimization for Social Good: A Case Study of Singapore

David Tedjopurnomo, Zhifeng Bao, Farhana Choudhury, Hui Luo and A. K. Qin

Public bus transport is a major backbone of many cities' socioeconomic activities. Thus, the topic of public bus network optimization has received substantial attention. We find that most of the literature focused on improving only the efficiency and service satisfaction of the bus network, neglecting the important equity factors. Focusing solely on efficiency and service satisfaction may cause public transport resources to be shifted away from underrepresented areas with disadvantaged demographics, compounding the equity problem. In this work, we perform a case study using Singapore's public bus network as an example. We describe the difficulties in designing an equitable public bus network, propose several choices of equity metrics, discuss their advantages and disadvantages, and perform some experiments to assess each metric's real-life impact. For our experiments, we have curated and combined Singapore's public transit data, road network data, census area boundaries data, and demographics data into a unified dataset. Our objective is not only to explore this important yet relatively unexplored problem but also to inspire more discussion and research on equitable public transport.


Ethical Concerns and Perceptions of Consumer Neurotechnology from Lived Experiences of Mental Workload Tracking

Serena Midha, Max Wilson and Sarah Sharples

With rapid growth in the development of consumer neurotechnology, it is imperative to consider the ethical implications that this might have in order to minimise consumer harm. Whilst ethical and legal guidelines for commercialisation have previously been suggested, we aimed to further this discussion by investigating the ethical concerns held by potential end users of consumer neurotechnology. Nineteen participants who had previously experienced mental workload tracking in their daily lives were interviewed about their ethical concerns and perceptions of this type of future neurotechnology. An Interpretive Phenomenological Analysis (IPA) approach identified three superordinate themes. These related to concerns surrounding privacy, data validity and misinterpretation, and personal identity. The findings provide further validation for previous research and highlight further ethical considerations that should be factored into the commercialisation of neurotechnology.


Evidence for Hypodescent in Visual Semantic AI

Robert Wolfe, Mahzarin Banaji and Aylin Caliskan

We test a state-of-the-art multimodal "visual semantic" model, OpenAI's CLIP ("Contrastive Language Image Pretraining"), for the rule of hypodescent, or one-drop rule, whereby multiracial people are more likely to be assigned a racial or ethnic label corresponding to a minority or disadvantaged racial or ethnic group than to the equivalent majority or advantaged group. A face morphing experiment grounded in psychological research demonstrating hypodescent indicates that, at the midway point of 1,000 series of morphed images, CLIP associates 69.7% of Black-White female images with a "Black" text label over a "White" text label, and similarly prefers "Latina" (75.8%) and "Asian" (89.1%) text labels at the midway point for Latina-White female and Asian-White female morphs, reflecting hypodescent. Additionally, assessment of the underlying cosine similarities in the model reveals that association with "White" is strongly correlated with association with "person," with Pearson's rho as high as .82, p < 10^-90 over a 21,000-image morph series, indicating that a "White" person corresponds to the default representation of a person in CLIP. Finally, we show that the valence association of an image strongly correlates with association with the "Black" text label in CLIP, with Pearson's rho = .48, p < 10^-90 for 21,000 Black-White multiracial male images, and rho = .41, p < 10^-90 for Black-White multiracial female images. CLIP is trained by an American company on English-language text gathered using data collected from an American website (Wikipedia), and our findings demonstrate that CLIP is not neutral. Rather, it embeds the values of American racial hierarchy, reflecting the implicit and explicit beliefs about hierarchy that are present in human minds. To support this discovery, we contextualize our findings within the history of and psychology of hypodescent. 
Overall, the data demonstrate the threats to fairness, accountability, and transparency inherent in visual semantic AI, and suggest that AI supervised using natural language will, unless checked, learn biases that reify and even enhance racial hierarchies.
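
The label-association measurement described in this abstract can be illustrated with a minimal sketch: an image embedding is compared with two text-label embeddings by cosine similarity, and the share of images that prefer a given label is counted. The vectors used here are made-up toy values, not CLIP outputs, and the function names (`cosine_similarity`, `preferred_label`, `label_share`) are hypothetical; this is not the authors' code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def preferred_label(image_embedding, label_embeddings):
    """Return the label whose embedding is most similar to the image."""
    return max(
        label_embeddings,
        key=lambda name: cosine_similarity(image_embedding, label_embeddings[name]),
    )


def label_share(image_embeddings, label_embeddings, label):
    """Fraction of images whose preferred label equals `label`."""
    chosen = [preferred_label(e, label_embeddings) for e in image_embeddings]
    return chosen.count(label) / len(chosen)


# Toy data (assumption): two label directions and three "image" vectors.
labels = {"label_a": [1.0, 0.0], "label_b": [0.0, 1.0]}
images = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
share_a = label_share(images, labels, "label_a")
```

With real embeddings, the same counting step applied to images at the midpoint of each morph series would yield percentages of the kind reported in the abstract.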


Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits

Wesley Hanwen Deng, Manish Nagireddy, Michelle Seng Ah Lee, Jatinder Singh, Zhiwei Steven Wu, Kenneth Holstein and Haiyi Zhu

Recent years have seen the development of many open-source ML fairness toolkits aimed at helping ML practitioners assess and address unfairness in their systems. However, there has been little research investigating how ML practitioners actually use these toolkits in practice. In this paper, we conducted the first in-depth empirical exploration of how industry practitioners (try to) work with existing fairness toolkits. In particular, we conducted think-aloud interviews to understand how participants learn about and use fairness toolkits, and explored the generality of our findings through an anonymous online survey. We identified several opportunities for fairness toolkits to better address practitioner needs and scaffold them in using toolkits effectively and responsibly. Based on these findings, we highlight implications for the design of future open-source fairness toolkits that can support practitioners in better contextualizing, communicating, and collaborating around ML fairness efforts.


Exploring the Role of Grammar and Word Choice in Bias Toward African American English (AAE) in Hate Speech Classification

Camille Harris, Matan Halevy, Ayanna Howard, Amy Bruckman and Diyi Yang

Language usage on social media varies widely even within the context of American English. Despite this, the majority of natural language processing systems are trained only on “Standard American English,” or SAE, the dialect most prominent among white Americans. For hate speech classification, prior work has shown that African American English (AAE) is more likely to be misclassified as hate speech. This has harmful implications for Black social media users as it reinforces and exacerbates existing notions of anti-Black racism. While past work has highlighted the relationship between AAE and hate speech classification, no work has explored the linguistic characteristics of AAE that lead to misclassification. Our work uses Twitter datasets for AAE dialect and hate speech classifiers to explore the fine-grained relationship between specific characteristics of AAE, such as word choice and grammatical features, and hate speech predictions. We further investigate these biases by removing profanity and examining the influence of four aspects of AAE grammar that are distinct from SAE. Results show that removing profanity accounts for a roughly 20 to 30 percent reduction in the percentage of samples classified as 'hate,' 'abusive,' or 'offensive,' and that similar classification patterns are observed regardless of grammar categories.


FAccT-Check on AI regulation: Systematic Evaluation of AI Regulation on the Example of the Proposed Legislation on the Use of AI in the Public Sector in the German Federal State of Schleswig-Holstein

Katharina Simbeck

Amid current discussions about regulating Artificial Intelligence (AI) and machine learning (ML), the small Federal State of Schleswig-Holstein in Northern Germany is moving ahead and proposing legislation on the use of AI in the public sector. The legislation aims on the one hand to enable the use of AI in the public sector by creating a legal framework, and on the other hand to limit its potential discriminatory effects. In contrast to the European AI Act (AIA), which applies to all companies and organizations in Europe, the Schleswig-Holstein “IT Deployment Law” (ITDL) would only apply to public administrations and agencies in the federal state. The proposed legislation addresses several AI risks, including fairness and transparency, and mitigates them with approaches quite different from those of the proposed AIA. In this paper, the proposed legislation is systematically reviewed and discussed with regard to its definition of AI, risk handling, fairness, accountability, and transparency.


FADE: FAir Double Ensemble Learning for Observable and Counterfactual Outcomes

Alan Mishler and Edward H. Kennedy

Methods for building fair predictors often involve tradeoffs between fairness and accuracy and between different fairness criteria, but the nature of these tradeoffs varies. Recent work seeks to characterize these tradeoffs in specific problem settings, but these methods often do not accommodate users who wish to improve the fairness of an existing benchmark model without sacrificing accuracy, or vice versa. These results are also typically restricted to observable accuracy and fairness criteria. We develop a flexible framework for fair ensemble learning that allows users to efficiently explore the fairness-accuracy space or to improve the fairness or accuracy of a benchmark model. Our framework can simultaneously target multiple observable or counterfactual fairness criteria, and it enables users to combine a large number of previously trained and newly trained predictors. We provide theoretical guarantees that our estimators converge at fast rates. We apply our method on both simulated and real data, with respect to both observable and counterfactual accuracy and fairness criteria. We show that, surprisingly, multiple unfairness measures can sometimes be minimized simultaneously with little impact on accuracy, relative to unconstrained predictors or existing benchmark models.


Pareto-Improving Data-Sharing

Ronen Gradwohl and Moshe Tennenholtz

We study the effects of data sharing between firms on prices, profits, and consumer welfare. Although indiscriminate sharing of consumer data decreases firm profits due to the subsequent increase in competition, selective sharing can be beneficial. We focus on data-sharing mechanisms that are fair, simultaneously increasing firm profits and every consumer's welfare. We show that such mechanisms exist, and identify one that maximizes firm profits and one that maximizes consumer welfare.


Fair ranking: a critical review, challenges, and future directions

Gourab K Patro, Lorenzo Porcaro, Laura Mitchell, Qiuyue Zhang, Meike Zehlike and Nikhil Garg

Ranking, recommendation, and retrieval systems are widely used in online platforms and other societal systems, including e-commerce, media-streaming, admissions, gig platforms, and hiring. In the recent past, a large “fair ranking” research literature has been developed around making these systems fair to the individuals, providers, or content that are being ranked. Most of this literature defines fairness for a single instance of retrieval, or as a simple additive notion for multiple instances of retrievals over time. This work provides a critical overview of this literature, detailing the often context-specific concerns that such an approach misses: the gap between high ranking placements and true provider utility, spillovers and compounding effects over time, induced strategic incentives, and the effect of statistical uncertainty. We then provide a path forward for a more holistic and impact-oriented fair ranking research agenda, including methodological lessons from other fields and the role of the broader stakeholder community in overcoming data bottlenecks and designing effective regulatory environments.


Fair Representation Clustering with Several Protected Classes

Zhen Dai, Yury Makarychev and Ali Vakilian

We study the problem of fair $k$-median where each cluster is required to have a fair representation of individuals from different groups. In the fair representation $k$-median problem, we are given a set of points $X$ in a metric space. Each point $x\in X$ belongs to one of $\ell$ groups. Further, we are given fair representation parameters $\alpha_j$ and $\beta_j$ for each group $j\in [\ell]$. We say that a $k$-clustering $C_1, \ldots, C_k$ fairly represents all groups if the number of points from group $j$ in cluster $C_i$ is between $\alpha_j |C_i|$ and $\beta_j |C_i|$ for every $j\in[\ell]$ and $i\in [k]$. The goal is to find a set $\mathcal{C}$ of $k$ centers and an assignment $\phi: X\rightarrow \mathcal{C}$ such that the clustering defined by $(\mathcal{C}, \phi)$ fairly represents all groups and minimizes the $\ell_1$-objective $\sum_{x\in X} d(x, \phi(x))$. We present an $O(\log k)$-approximation algorithm that runs in time $n^{O(\ell)}$. Note that the known algorithms for the problem either (i) violate the fairness constraints by an additive term or (ii) run in time that is exponential in both $k$ and $\ell$. We also consider an important special case of the problem where $\alpha_j = \beta_j = \frac{f_j}{f}$ and $f_j, f \in \mathbb{N}$ for all $j\in [\ell]$. For this special case, we present an $O(\log k)$-approximation algorithm that runs in $(kf)^{O(\ell)}\log n + \operatorname{poly}(n)$ time.
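
The fair representation constraint and the $\ell_1$-objective defined above can be checked for a given clustering with a short sketch. It only verifies feasibility and evaluates cost; it is not the approximation algorithm from the paper. The function names, the use of points on the real line, and the absolute difference as the metric $d$ are all assumptions made for illustration.

```python
from collections import Counter


def fairly_represents(clusters, alpha, beta):
    """Check alpha_j * |C_i| <= (#points of group j in C_i) <= beta_j * |C_i|.

    clusters: list of clusters, each a list of group labels (one per point).
    alpha, beta: dicts mapping each group to its lower/upper fraction.
    """
    for members in clusters:
        size = len(members)
        counts = Counter(members)
        for group in alpha:
            n = counts.get(group, 0)
            if not (alpha[group] * size <= n <= beta[group] * size):
                return False
    return True


def k_median_cost(points, assignment):
    """l1-objective: sum of d(x, phi(x)), here with d(a, b) = |a - b|."""
    return sum(abs(x - center) for x, center in zip(points, assignment))


# Toy instance (assumption): two groups, each needing 25%-75% of every cluster.
alpha = {"red": 0.25, "blue": 0.25}
beta = {"red": 0.75, "blue": 0.75}
balanced = [["red", "blue", "blue", "blue"], ["red", "red", "blue", "blue"]]
segregated = [["red", "red", "red", "red"], ["blue", "blue"]]

points = [0.0, 1.0, 2.0, 10.0]
assignment = [1.0, 1.0, 1.0, 10.0]  # phi(x) for each point, in order
cost = k_median_cost(points, assignment)
```

Here `balanced` satisfies the constraint while `segregated` does not, which shows why the fairness requirement can force an assignment that is more costly than the unconstrained optimum.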


Fairness for AUC via Feature Augmentation

Hortense Fong, Vineet Kumar, Anay Mehrotra and Nisheeth K. Vishnoi

We study fairness in the context of classification where the performance is measured by the area under the curve (AUC) of the receiver operating characteristic. AUC is commonly used when both Type I (false positive) and Type II (false negative) errors are important. However, the same classifier can have significantly varying AUCs for different protected groups and, in real-world applications, it is often desirable to reduce such cross-group differences. We address the problem of how to select additional features to most greatly improve AUC for the disadvantaged group. Our results establish that the unconditional variance of features does not inform us about AUC fairness but class-conditional variance does. Using this connection, we develop a novel approach, fairAUC, based on feature augmentation (adding features) to mitigate bias between identifiable groups. We evaluate fairAUC on synthetic and real-world (COMPAS) datasets and find that it significantly improves AUC for the disadvantaged group relative to benchmarks maximizing overall AUC and minimizing bias between groups.
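
The cross-group AUC gap that motivates this work can be measured with a minimal sketch. AUC is computed here through its pairwise-ranking interpretation (the probability that a random positive is scored above a random negative, counting ties as one half). This illustrates the quantity being compared, not the fairAUC procedure itself; the function names and the toy data are assumptions.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_group(scores, labels, groups):
    """Compute AUC separately for each protected group."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc([scores[i] for i in idx], [labels[i] for i in idx])
    return result


# Toy data (assumption): the classifier separates group "a" well, group "b" poorly.
scores = [0.9, 0.8, 0.3, 0.1, 0.6, 0.4, 0.7, 0.2]
labels = [1, 1, 0, 0, 1, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
per_group = auc_by_group(scores, labels, groups)
```

The difference between the per-group values is the disparity that feature augmentation aims to reduce for the disadvantaged group.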


Fairness Indicators for Systematic Assessments of Visual Feature Extractors

Priya Goyal, Adriana Romero Soriano, Caner Hazirbas, Levent Sagun and Nicolas Usunier

Does everyone equally benefit from computer vision systems? Answers to this question become more and more important as computer vision systems are deployed at large scale, and can spark major concerns when they exhibit vast performance discrepancies between people from various demographic and social backgrounds. Systematic diagnosis of fairness, harms, and biases of computer vision systems is an important step towards building socially responsible systems. To initiate an effort towards standardized fairness audits, we propose three fairness indicators, which aim at quantifying harms and biases of visual systems. Our indicators use existing publicly available datasets collected for fairness evaluations, and focus on three main types of harms and bias identified in the literature, namely harmful label associations, disparity in learned representations of social and demographic traits, and biased performance on geographically diverse images from across the world. We define precise experimental protocols applicable to a wide range of computer vision models. These indicators are part of an ever-evolving suite of fairness probes and are not intended to be a substitute for a thorough analysis of the broader impact of the new computer vision technologies. Yet, we believe they are a necessary first step towards (1) facilitating the widespread adoption and mandate of fairness assessments in computer vision research, and (2) tracking progress towards building socially responsible models. To study the practical effectiveness and broad applicability of our proposed indicators to any visual system, we apply them to “off-the-shelf” models built using widely adopted model training paradigms, which vary in whether they can predict labels for a given image or only produce embeddings. We also systematically study the effect of data domain and model size. The results of our fairness indicators on these systems suggest that blatant disparities still exist, which highlights the importance of the relationship between the context of the task and the contents of a dataset. The code will be released to encourage the use of the indicators.


Fairness-aware Model-agnostic Positive and Unlabeled Learning

Co-Winner: Distinguished Paper Award

Ziwei Wu and Jingrui He

With the increasing application of machine learning in high-stakes decision-making problems, potential algorithmic bias towards people from certain social groups poses negative impacts on individuals and our society at large. In real-world scenarios, many such problems involve positive and unlabeled data, such as medical diagnosis, criminal risk assessment, and recommender systems. For instance, in medical diagnosis, only the diagnosed diseases will be recorded (positive) while others will not (unlabeled). Despite the large amount of existing work on fairness-aware machine learning in the (semi-)supervised and unsupervised settings, the fairness issue is largely under-explored in the aforementioned Positive and Unlabeled Learning (PUL) context, where it is usually more severe. In this paper, to alleviate this tension, we propose a fairness-aware PUL method named FAIRPUL. In particular, for binary classification over individuals from two populations, we aim to achieve similar true positive rates and false positive rates in both populations as our fairness metric. Based on the analysis of the optimal fair classifier for PUL, we design a model-agnostic post-processing framework, leveraging both the positive examples and unlabeled ones. Our framework is proven to be statistically consistent in terms of both the classification error and the fairness metric. Experiments on synthetic and real-world data sets demonstrate that our framework outperforms the state of the art in both PUL and fair classification.
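
The fairness metric named above, similar true positive and false positive rates across two populations, can be measured with a minimal sketch. It assumes fully labeled evaluation data, which a PUL setting does not provide during training, and it does not implement FAIRPUL's post-processing. The function names and toy data are assumptions.

```python
def rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def rate_gaps(y_true, y_pred, group):
    """Absolute TPR and FPR differences between populations 0 and 1."""
    per_group = {}
    for g in (0, 1):
        idx = [i for i, x in enumerate(group) if x == g]
        per_group[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    tpr_gap = abs(per_group[0][0] - per_group[1][0])
    fpr_gap = abs(per_group[0][1] - per_group[1][1])
    return tpr_gap, fpr_gap


# Toy data (assumption): four individuals from each population.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
tpr_gap, fpr_gap = rate_gaps(y_true, y_pred, group)
```

Both gaps equal zero for a classifier that meets the metric exactly; a post-processing method would adjust decisions to shrink them.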


Fast online ranking with fairness of exposure

Nicolas Usunier, Virginie Do and Elvis Dohmatob

As recommender systems become essential for sorting and prioritizing the content available online, recommendation policies have a growing impact on the opportunities or revenue of their item producers. For instance, they decide which recruiter a resume is recommended to, or to whom and how much a music track, video or news article is being exposed. This calls for recommendation policies that not only maximize (a proxy of) user satisfaction, but also guarantee some notion of fairness in the exposure of items or groups of items. Formally, such policies are usually obtained by maximizing a concave objective function in the space of randomized rankings. When the total exposure of an item is defined as the sum of its exposure over users, the optimal rankings of all users become coupled, which makes the optimization process challenging. Existing approaches for finding these rankings either solve the global optimization problem in a batch setting, i.e., for all users at once, which makes them inapplicable at scale, or are based on heuristics that have weak theoretical guarantees. In this paper, we propose the first efficient online algorithm to optimize concave objective functions in the space of rankings which applies to every concave and smooth objective function, such as the ones found for fairness of exposure. Based on online variants of the Frank-Wolfe algorithm, we show that our algorithm is computationally fast, generating rankings on-the-fly with computation cost dominated by the sort operation, memory efficient, and has strong theoretical guarantees, with a regret that decreases as $O(1/\sqrt{t})$ where $t$ is the number of time steps. In other words, compared to baseline policies that only maximize user-side performance, our algorithm makes it possible to incorporate complex fairness of exposure criteria in the recommendation policies with negligible computational overhead.
We present experiments on artificial music and movie recommendation tasks using and MovieLens datasets which suggest that in practice, the algorithm rapidly reaches good performances on three different objectives representing different fairness of exposure criteria.
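
The idea that each online step reduces to a sort can be illustrated with a rough Frank-Wolfe-style sketch. It is a simplification under stated assumptions, not the authors' algorithm: the objective is assumed to be user utility plus `beta` times the sum of square roots of average item exposures, position weights are assumed to be non-increasing, and the names `rank_online`, `beta`, and `eps` are hypothetical. For each arriving user, items are sorted by the gradient of this objective at the current average exposure, which maximizes the linearized objective over rankings.

```python
import math


def rank_online(user_scores, position_weights, beta, eps=1e-6):
    """Rank items for users arriving one at a time.

    user_scores: list of per-user relevance lists (one value per item).
    position_weights: exposure given to each rank, non-increasing.
    beta: weight of the assumed concave item-exposure term.
    Returns (rankings, average exposure per item).
    """
    n_items = len(user_scores[0])
    avg_exposure = [0.0] * n_items
    rankings = []
    for t, mu in enumerate(user_scores, start=1):
        # Gradient of mu_i * e_i + beta * sqrt(E_i) with respect to exposure.
        grad = [
            mu[i] + beta / (2.0 * math.sqrt(avg_exposure[i] + eps))
            for i in range(n_items)
        ]
        # Linear maximization over rankings: sort by gradient, descending.
        order = sorted(range(n_items), key=lambda i: -grad[i])
        rankings.append(order)
        exposure_t = [0.0] * n_items
        for pos, item in enumerate(order):
            if pos < len(position_weights):
                exposure_t[item] = position_weights[pos]
        # Running average of exposure over the users seen so far.
        avg_exposure = [
            e + (x - e) / t for e, x in zip(avg_exposure, exposure_t)
        ]
    return rankings, avg_exposure


# Toy data (assumption): two items, two identical users, only rank 1 is seen.
scores = [[1.0, 0.9], [1.0, 0.9]]
utility_only = rank_online(scores, [1.0, 0.0], beta=0.0)
with_fairness = rank_online(scores, [1.0, 0.0], beta=1.0)
```

With `beta=0.0` the slightly more relevant item takes the top position for every user; with `beta=1.0` the second user sees the other item first, so exposure is shared. The per-user cost is dominated by the sort, in line with the abstract's description.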


Female, white, 27? Bias Evaluation on Data and Algorithms for Affect Recognition in Faces

Jaspar Pahl, Ines Rieger, Anna Möller, Thomas Wittenberg and Ute Schmid

Nowadays, Artificial Intelligence (AI) algorithms show a strong performance for many use cases, making them desirable for real-world scenarios where the algorithms provide high-impact decisions. However, one major drawback of AI algorithms is their susceptibility to bias and resulting unfairness. This has a huge influence on their application, as they have a higher failure rate for certain subgroups. In this paper, we focus on the field of affective computing and particularly on the detection of bias for facial expressions. Depending on the deployment scenario, bias in facial expression models can have a disadvantageous impact and it is therefore essential to evaluate the bias and limitations of the model. In order to analyze the metadata distribution in affective computing datasets, we annotate several benchmark training datasets, containing both Action Units and categorical emotions, with age, gender, ethnicity, glasses, and beards. We show that there is a significantly skewed distribution, particularly for ethnicity and age. Based on this metadata annotation, we evaluate two trained state-of-the-art affective computing algorithms. Our evaluation shows that the strongest bias is in age, with the best performance for persons under 34 and a sharp decrease for older persons. Furthermore, we see an ethnicity bias with varying direction depending on the algorithm, a slight gender bias and worse performance for facial parts occluded by glasses.


Flipping the Script on Criminal Justice Risk Assessment: An actuarial model for assessing the risk the federal sentencing system poses to defendants

Mikaela Meyer, Aaron Horowitz, Erica Marshall and Kristian Lum

In the criminal justice system, algorithmic risk assessment instruments are used to predict the risk a defendant poses to society; examples include the risk of recidivating or the risk of failing to appear at future court dates. However, defendants are also at risk of harm from the criminal justice system. To date, there exists no risk assessment instrument that considers the risk the system poses to the individual. We develop a risk assessment instrument that “flips the script.” Using data about U.S. federal sentencing decisions, we build a risk assessment instrument that predicts the likelihood an individual will receive an especially lengthy sentence given factors that should be legally irrelevant to the sentencing decision. To do this, we develop a two-stage modeling approach. Our first-stage model is used to determine which sentences were “especially lengthy.” We then use a second-stage model to predict the defendant’s risk of receiving a sentence that is flagged as especially lengthy given factors that should be legally irrelevant. The factors that should be legally irrelevant include, for example, race, court location, and other socio-demographic information about the defendant. Our instrument achieves comparable predictive accuracy to risk assessment instruments used in pretrial and parole contexts. We discuss the limitations of our modeling approach and use the opportunity to highlight how traditional risk assessment instruments in various criminal justice settings also suffer from many of the same limitations and embedded value systems of their creators.


Four Years of FAccT: A Reflexive, Mixed-Methods Analysis of Research Contributions, Shortcomings, and Future Prospects

Benjamin Laufer, Sameer Jain, A. Feder Cooper, Jon Kleinberg and Hoda Heidari

Fairness, Accountability, and Transparency (FAccT) for socio-technical systems has been a thriving area of research in recent years. An ACM conference bearing the same name has been the central venue for various related lines of work to come together, provide peer feedback to one another, and publish their contributions. This reflexive study aims to shed light on FAccT’s activities to date and identify major gaps and opportunities for translating contributions to broader positive impact. To this end, we utilize a mixed-methods research design. On the qualitative front, we develop a protocol for reviewing and coding prior FAccT papers, tracing the distribution of topics, methods, datasets, and disciplinary roots of FAccT papers. We also design and administer a questionnaire to reflect the voices of FAccT community members and affiliates on a wide range of topics. On the quantitative front, we use the data associated with prior FAccT publications (e.g., their text and citation network) to provide further suggestive evidence about topics and values represented in FAccT. We organize the findings from our analysis into four main dimensions: the themes present in FAccT scholarship, the values that underpin the work, the impact of the contributions both in academic circles and on the broader society, and the practices and informal norms of the community that has formed around FAccT. Finally, our work identifies several suggestions on directions for change, as voiced by community members.


From Demo to Design in Teaching Machine Learning

Karl-Emil Kjær Bilstrup, Magnus Kaspersen, Ira Assent, Simon Enni and Marianne Graves Petersen

The prevalence of AI and machine learning (ML) technologies in digital ecosystems has led to a push for AI literacy, giving everybody, including K-12 students, the necessary knowledge and abilities to engage critically with these new technologies. While there is an increasing focus on designing tools and activities for teaching machine learning, most tools sidestep engaging with the complexity and trade-offs inherent in the design of ML models in favor of demonstrating the power and functionality of the technology. In this paper, we investigate how a design perspective can inform the design of educational tools and activities for teaching machine learning. Through a literature review, we identify 34 tools and activities for teaching ML, and using a design perspective on ML system development, we examine strengths and limitations in how they engage students in the complex design considerations linked to the different components of machine learners. Based on this work, we suggest directions for furthering AI literacy through adopting a design approach in teaching ML.


Gender and Racial Bias in Visual Question Answering Datasets

Yusuke Hirota, Yuta Nakashima and Noa Garcia

Vision-and-language tasks have increasingly drawn more attention as a means to evaluate human-like reasoning in machine learning models. A popular task in the field is Visual Question Answering (VQA), which aims to answer questions about an image's visual content. In the context of fairness, VQA models have been shown to exploit language bias, which makes them learn the superficial correlation between questions and answers. This problem may be causing VQA models to learn harmful stereotypes, if societal bias is present in the training data. For this reason, we investigate gender and racial bias in five VQA datasets. Through a thoughtful analysis, we find that the distribution of answers differs substantially between questions about women and men. We also find various detrimental gender-stereotypical question and answer correlations. Likewise, in the case of race, we identify that specific attributes in terms of race are underrepresented, and discriminatory samples appear in the datasets. Our findings suggest that there are dangers associated with using VQA datasets without considering and dealing with the harmful stereotypes, and we propose potential solutions to alleviate the problem before, during, and after the dataset collection process.


German AI Start-Ups and “Ethical AI”: Using Social Practice as Basis for Assessing and Implementing Socio-Technical Innovation

Mona Sloane and Janina Zakrzewski

The current AI ethics discourse focuses on developing computational interpretations of ethical concerns, normative frameworks, and concepts for socio-technical innovation. There is less emphasis on understanding how AI practitioners socially organize to operationalize ethical concerns. This is particularly true for AI start-ups, despite their significance as a conduit for the cultural production of innovation and progress, especially in the US American and European context. This gap in empirical research intensifies the risk of a disconnect between scholarly research, innovation and application. This risk materializes acutely as mounting pressures to identify and mitigate the potential harms of AI systems have created an urgent need to rapidly assess and implement AI socio-technical innovation focused on fairness, accountability, and transparency. In this paper, we address this need. Building on social practice theory, we propose a framework that allows AI researchers, practitioners, and regulators to systematically analyze existing social practices of “ethical AI” to define appropriate strategies for effectively implementing socio-technical innovations. We argue that this approach is needed, because socio-technical innovation “sticks” better if it sustains the cultural meaning of socially shared (ethical) AI practices, rather than breaking it. By doing so, it creates pathways for technical and socio-technical innovations to be integrated into already existing routines. 
Against that backdrop, our contributions are threefold: (1) we introduce a practice-based approach for understanding “ethical AI”; (2) we present empirical findings from our study on the operationalization of “ethics” in German AI start-ups to underline that AI ethics and social practices must be understood in their specific cultural and historical contexts; and (3) based on our empirical findings, we suggest that “ethical AI” practices can be broken down into principles, needs, narratives, materializations, and cultural genealogies to form a useful backdrop for considering socio-technical innovations. We conclude with critical reflections and practical implications of our work, as well as recommendations for future research.


GetFair: Generalized Fairness Tuning of Classification Models

Sandipan Sikdar, Florian Lemmerich and Markus Strohmaier

We present GetFair, a novel framework for tuning fairness of classification models. The fair classification problem deals with training models for a given classification task where data points have sensitive attributes. The goal of fair classification models is to not only generate accurate classification results but also to prevent discrimination against subpopulations (i.e., individuals with a specific value for the sensitive attribute). Existing methods for enhancing fairness of classification models, however, are often specifically designed for a particular fairness metric or a classifier model. They may also not be suitable for scenarios with incomplete training data or where optimizing for multiple fairness metrics is important. GetFair represents a general solution to this problem. The GetFair approach works in the following way: First, a given classifier is trained on training data without any fairness objective. This is followed by a reinforcement learning inspired tuning procedure which updates the parameters of the learnt model on a given fairness objective. This disentangles classifier training from fairness tuning, which makes our framework more general and allows for adopting any parameterized classifier model. Because fairness metrics are designed as reward functions during tuning, GetFair generalizes across any fairness metric. We demonstrate the generalizability of GetFair via evaluation over a benchmark suite of datasets, classification models and fairness metrics. In addition, GetFair can also be deployed in settings where the training data is incomplete or the classifier needs to be tuned on multiple fairness metrics. GetFair not only contributes a flexible method to the repertoire of tools available for enhancing fairness of classification models, it also seamlessly adapts to settings where existing fair classification methods may not be suitable or applicable.


Goodbye Tracking? Impact of iOS App Tracking Transparency and Privacy Labels

Konrad Kollnig, Anastasia Shuba, Max Van Kleek, Reuben Binns and Nigel Shadbolt

Tracking is a highly privacy-invasive data collection practice that has been ubiquitous in mobile apps for many years due to its role in supporting advertising-based revenue models. In defence of user privacy, Apple introduced two significant changes with iOS 14: App Tracking Transparency (ATT), a mandatory opt-in system for enabling tracking on iOS, and Privacy Nutrition Labels, which disclose what kinds of data each app processes. So far, the impact of these changes on individual privacy and control has not been well understood. This paper addresses this gap by analysing two versions of 1,759 iOS apps from the UK App Store: one version from before iOS 14 and one that has been updated to comply with the new rules. We find that Apple’s new policies, as promised, prevent the collection of the Identifier for Advertisers (IDFA), an identifier used to facilitate cross-app user tracking. Smaller data brokers, who used to engage in some of the most invasive data practices, will now face higher challenges in tracking users – a positive development for privacy. However, the number of tracking libraries has – on average – roughly stayed the same in the studied apps. Many apps still collect device information that can be used to track users at a group level (cohort tracking) or identify individuals probabilistically (fingerprinting). We find real-world evidence of apps computing and agreeing on a fingerprinting-derived identifier through the use of server-side code, thereby violating Apple’s policies and exposing the limits of what ATT can do against tracking on iOS. This is especially concerning because we explicitly refused opt-in to tracking in our study, and consent is a legal requirement for tracking under EU and UK data protection law. We find that Apple itself engages in some forms of tracking and exempts invasive data practices like first-party tracking and credit scoring from its new rules, and that the new Privacy Nutrition Labels were often inaccurate. 
This is in conflict with the company’s marketing claims and the resulting expectations of many iOS users. Overall, our observations suggest that, while Apple’s changes make tracking individual users more difficult, they motivate a counter-movement, and reinforce existing market power of gatekeeper companies with access to large troves of first-party data. Making the privacy properties of apps transparent through large-scale analysis remains a difficult target for independent researchers, and a key obstacle to meaningful, accountable and verifiable privacy protections.


Healthsheet: development of a transparency artifact for health datasets

Negar Rostamzadeh, Diana Mincu, Subhrajit Roy, Andrew Smart, Lauren Wilcox, Mahima Pushkarna, Jessica Schrouff, Razvan Amironesei, Nyalleng Moorosi and Katherine Heller

Machine learning (ML) approaches have demonstrated promising results in a wide range of healthcare applications. Data plays a crucial role in developing ML-based healthcare systems that directly affect people's lives. Many of the ethical issues surrounding the use of ML in healthcare stem from structural inequalities underlying the way we collect, use, and handle data. Developing guidelines to improve documentation practices regarding the creation, use, and maintenance of ML healthcare datasets is therefore of critical importance. In this work, we introduce Healthsheet, a contextualized adaptation of the original datasheet questionnaire (Gebru et al., 2018) for health-specific applications. Through a series of semi-structured interviews, we adapt the datasheets for healthcare data documentation. As part of the Healthsheet development process and to understand the obstacles researchers face in creating datasheets, we worked with three publicly-available healthcare datasets as our case studies, each with different types of structured data: Electronic Health Records (EHR), clinical trial study data, and smartphone-based performance outcome measures. Our findings from the interviewee study and case studies show 1) that datasheets should be contextualized for healthcare, 2) that despite incentives to adopt accountability practices such as datasheets, there is a lack of consistency in the broader use of these practices, 3) how the ML for health community views datasheets, and particularly Healthsheets, as a diagnostic tool to surface the limitations and strengths of datasets, and 4) the relative importance of different fields in the datasheet to healthcare concerns.


How are ML-Based Online Content Moderation Systems Actually Used? Studying Community Size, Local Activity, and Disparate Treatment

Leijie Wang and Haiyi Zhu

Machine learning-based predictive systems are increasingly used to assist online groups and communities in various content moderation tasks. However, there are limited quantitative understandings of whether and how different groups and communities use such predictive systems differently according to their community characteristics. In this research, we conducted a field evaluation of how content moderation systems are used in 17 Wikipedia language communities. We found that 1) larger communities tend to use predictive systems to identify the most damaging edits, while smaller communities tend to use them to identify any edit that could be damaging; 2) predictive systems are used less in content areas where there are more local editing activities; 3) predictive systems have mixed effects on reducing disparate treatment between anonymous and registered editors across communities of different characteristics. Finally, we discuss the theoretical and practical implications for future human-centered moderation algorithms.


How Different Groups Prioritize Ethical Values for Responsible AI

Maurice Jakesch, Zana Bucinca, Saleema Amershi and Alexandra Olteanu

Private companies, public sector organizations, and academic groups have outlined ethical values they consider important for responsible artificial intelligence (RAI) technologies. While their recommendations converge on a set of core values, little is known about the values a more representative public considers important for the AI technologies they interact with and might be affected by. We conducted a survey examining how individuals perceive and prioritize RAI values across three groups: a representative sample of the US population (N=743), a convenience sample of crowdworkers (N=755), and a sample of AI practitioners (N=165). Our results show that participants in the representative population sample, particularly women and black respondents, find RAI values more important than AI practitioners do. However, while AI practitioners often prioritize fairness, representative respondents deem systems safety, performance, and privacy most important. Surprisingly, liberals, rather than participants from minoritized groups or participants reporting experiences with discrimination, were more likely to prioritize fairness over performance than other groups.


How Explainability Contributes to Trust in AI

Andrea Ferrario and Michele Loi

We provide a philosophical explanation of the relation between artificial intelligence (AI) explainability and trust in AI, providing a case for expressions, such as “explainability fosters trust in AI,” that commonly appear in the literature. This explanation relates the justification of the trustworthiness of an AI to the need to monitor it during its use. We discuss the latter by referencing an account of trust, called “trust as anti-monitoring,” that different authors have contributed to developing. We focus our analysis on the case of medical AI systems, noting that our proposal is compatible with internalist and externalist justifications of trustworthiness of medical AI and recent accounts of warranted contractual trust. We propose that “explainability fosters trust in AI” if and only if it fosters justified and warranted paradigmatic trust in AI, i.e., trust in the presence of the justified belief that the AI is trustworthy, which, in turn, causally contributes to reliance on the AI in the absence of monitoring. We argue that our proposed approach can intercept the complexity of the interactions between physicians and medical AI systems in clinical practice, as it can distinguish between cases where humans hold different beliefs on the trustworthiness of the medical AI and exercise varying degrees of monitoring of it. Finally, we apply our account to users’ trust in AI, where, we argue, explainability does not contribute to trust. By contrast, when considering public trust in AI as used by a human, we argue, it is possible for explainability to contribute to trust. Our account can explain the apparent paradox that in order to trust AI, we must trust AI users not to trust AI completely. Summing up, we can explain how explainability contributes to justified trust in AI, without leaving a reliabilist framework, but only by redefining the trusted entity as an AI-user dyad.


Human Interpretation of Saliency-based Explanation Over Text

Hendrik Schuff, Alon Jacovi, Heike Adel, Yoav Goldberg and Ngoc Thang Vu

While a lot of research in explainable AI focuses on producing effective explanations, less work is devoted to the question of how people understand and interpret the explanation. In this work, we focus on this question through a study of saliency-based explanations over textual data. Feature-attribution explanations of text models aim to communicate which parts of the input text were more influential than others towards the model decision. Many current explanation methods, such as gradient-based or Shapley value-based methods, provide measures of importance which are well-understood mathematically. But how does a person receiving the explanation (the explainee) comprehend it? And does their understanding match what the explanation attempted to communicate? We empirically investigate the effect of various factors of the input, the feature-attribution explanation, and visualization procedure, on laypeople's interpretation of the explanation. We query crowdworkers for their interpretation on tasks in English and German, and fit a generalized additive mixed model (GAMM) to their responses considering the factors of interest. We find that people often misinterpret the explanations: superficial and unrelated factors, such as word length, influence the explainees' importance assignment despite the explanation communicating importance directly. We then show that some of this distortion can be attenuated: we propose a method to adjust saliencies based on model estimates of over- and under-perception, and explore bar charts as an alternative to heatmap saliency visualization. We find that both approaches can attenuate the distorting effect of specific factors, leading to better-calibrated understanding of the explanation.


Human-Algorithm Collaboration: Achieving Complementarity and Avoiding Unfairness

Kate Donahue, Alexandra Chouldechova and Krishnaram Kenthapadi

Most of machine learning research focuses on predictive accuracy: given a task, create a machine learning model (or algorithm) that maximizes accuracy. In many settings, however, the final prediction or decision of a system is under the control of a human, who uses an algorithm's output along with their own personal expertise in order to produce a combined prediction. The ultimate goal of such collaborative systems is complementarity: that is, to produce higher accuracy than either the human or algorithm alone. However, experimental results have shown that even in carefully-designed systems, complementary performance can be elusive. Our work provides three key contributions. First, we provide a theoretical framework for modeling simple human-algorithm systems and demonstrate that multiple prior analyses can be expressed within it. Next, we use this model to prove conditions where complementarity is impossible, and give constructive examples of where complementarity is achievable. Finally, we discuss the implications of our findings, especially with respect to the fairness of a classifier. In sum, these results deepen our understanding of key factors influencing the combined performance of human-algorithm systems, giving insight into how algorithmic tools can best be designed for collaborative environments.


Imagining new futures beyond predictive systems in child welfare: A qualitative study with impacted stakeholders

Logan Stapleton, Min Hun Lee, Diana Qing, Marya Wright, Alexandra Chouldechova, Ken Holstein, Zhiwei Steven Wu and Haiyi Zhu

Child welfare agencies across the United States are turning to data-driven technologies in an effort to improve decision making through more systematic use of their administrative data. While some prior work has explored impacted stakeholders’ concerns surrounding existing uses of data-driven predictive risk models (PRMs), less work has explored stakeholder perspectives regarding whether such tools ought to be used in the first place and, if so, how they ought to be designed. In this work, we conducted a set of seven design workshops with 35 stakeholders who have been impacted by child welfare or who work in it to understand their beliefs and concerns around PRMs and to engage them in imagining new uses of data and technologies in the child welfare system. We found that participants worried current PRMs may perpetuate or exacerbate existing problems in child welfare. Participants ideated new ways of using data and data-driven tools to better support impacted communities and suggested paths to mitigate possible harms of these tools. Participants also suggested low-tech or no-tech alternatives to PRMs to address problems in child welfare. Our study sheds light on how researchers and designers can work to develop data-driven tools in child welfare systems in solidarity with impacted families and communities in the future.


Imperfect Inferences: A Practical Assessment

Aaron Rieke, Vincent Southerland, Dan Svirsky and Mingwei Hsu

Measuring racial disparities is challenging, especially when demographic labels are unavailable. Recently, some researchers and advocates have argued that companies should infer race and other demographic factors to help them understand and address discrimination. Others have been more skeptical about such an approach, emphasizing the inaccuracy of racial inferences, critiquing the conceptualization of demographic categories themselves, and expressing concern that working with demographic data will encourage algorithmic tweaks that fail to fully address complex social problems. We conduct a novel empirical analysis that informs this debate, using a dataset of self-reported demographic information provided by users of the ride-hailing service Uber who consented to share this information for research purposes. As a threshold matter, we show how this data reflects the enduring power of racism in society. We find differences by race across nearly every outcome we tracked. For example, among self-reported African-American riders, we see racial differences on factors from iOS use to local pollution levels. We then turn to a practical assessment of racial inference methodologies, with two key findings. First, every inference method we tested has significant errors, miscategorizing people relative to their self-reports (even as the self-reports themselves suffer from selection bias). But second, and most importantly, the inference methods worked, in that they reliably confirmed directional racial disparities. Our analysis also suggests that inference methods informed by a particular factor provide a more accurate measurement of racial disparities related to that factor. Disparities that are geographic in nature might be best captured by inferences that rely on geography; discrimination based on a person’s name might be best detected by inferences that rely on names. 
Hence, when measuring disparities, the choice of inference or racial proxy is an important part of the analysis design. In conclusion, our analysis shows that common racial inference methods have real and practical utility in shedding light on aggregate, directional disparities. While the recent literature has identified notable challenges regarding the collection and use of this data, these challenges should not be seen as dispositive.


Interactive Model Cards: A Human-Centered Approach to Documentation for Large Language Models

Anamaria Crisan, Margaret Drouhard, Jesse Vig and Nazneen Rajani

Deep learning models for natural language processing (NLP) are increasingly adopted and deployed by analysts without formal training in NLP or machine learning (ML). However, the documentation intended to convey the model’s details and appropriate use is tailored primarily to individuals with ML or NLP expertise. To address this gap, we conduct a design inquiry into interactive model cards, which augment traditionally static model cards with affordances for exploring model documentation and interacting with the models themselves. Our investigation consists of an initial conceptual study with experts in ML, NLP, and AI Ethics, followed by a separate evaluative study with non-expert analysts who use ML models in their work. Using a semi-structured interview format coupled with a think-aloud protocol, we collected feedback from a total of 30 participants who engaged with different versions of standard and interactive model cards. Through a thematic analysis of the collected data, we identified several conceptual dimensions that summarize the strengths and limitations of standard and interactive model cards, including: stakeholders; design; guidance; understandability & interpretability; sensemaking & skepticism; and trust & safety. Our findings demonstrate the importance of carefully considered design and interactivity for orienting and supporting non-expert analysts using deep learning models, along with a need for consideration of broader sociotechnical contexts and organizational dynamics. We have also identified design elements, such as language, visual cues, and warnings, among others, that support interactivity and make non-interactive content accessible. We summarize our findings as design guidelines and discuss their implications for a human-centered approach towards AI/ML documentation.


Interdisciplinarity, Gender Diversity, and Network Structure Predict the Centrality of AI Organizations

Madalina Vlasceanu, Miroslav Dudik and Ida Momennejad

Artificial intelligence (AI) research plays an increasingly important role in society, impacting key aspects of human life. From face recognition algorithms aiding national security in airports, to software that advises judges in criminal cases, and medical staff in healthcare, AI research is shaping critical facets of our experience in the world. But who are the people and institutional bodies behind this influential research? What are the predictors of influence of AI researchers and research organizations? We study this question using social network analysis, in an exploration of the structural characteristics, i.e., network topology, of research organizations that shape modern AI. In a sample of 149 organizations with 9,987 affiliated authors of published papers in a major AI conference (NeurIPS) and two major conferences that specifically focus on societal impacts of AI (FAccT and AIES), we find that both industry and academic research organizations with influential authors are more interdisciplinary, more gender diverse, more hierarchical, and less clustered, even when controlling for the size of the organizations. Here, authors’ betweenness centrality in co-authorship networks is used as a measure of their influence. We also find that gender minorities (e.g., women) have less influence in the AI community, determined as lower betweenness centrality in co-authorship networks. These results suggest that while diversity adds significant value to AI research organizations, the individuals contributing to the increased diversity are marginalized in the AI field. We discuss these results in the context of current events with important societal implications.


Is calibration a fairness requirement? An argument from the point of view of moral philosophy and decision theory.

Michele Loi and Christoph Heitz

In this paper, we provide a moral analysis of two criteria of statistical fairness debated in the machine learning literature: 1) calibration between groups and 2) equality of false positive and false negative rates between groups. In our paper, we focus on moral arguments in support of either measure. The conflict between group calibration vs. false positive and false negative rate equality is one of the core issues in the debate about group fairness definitions among practitioners. For any thorough moral analysis, the meaning of the term “fairness” has to be made explicit and defined properly. For our paper, we equate fairness with (non-)discrimination, which is a legitimate understanding in the discussion about group fairness. More specifically, we equate it with “prima facie wrongful discrimination” in the sense this is used in Prof. Lippert-Rasmussen’s treatment of this definition. In this paper, we argue that a violation of group calibration may be unfair in some cases, but not unfair in others. Our argument analyzes in great detail two specific hypothetical examples of usage of predictions in decision making. The most important practical implication is that between-group calibration is defensible as a bias standard in some cases but not others; we show this by referring to examples in which the violation of between-group calibration is discriminatory, and others in which it is not. This is in line with claims already advanced in the literature, that algorithmic fairness should be defined in a way that is sensitive to context. A further practical implication is that arguments based on examples in which fairness requires between-group calibration, or equality in the false-positive/false-negative rates, do not generalize. For it may be that group calibration is a fairness requirement in one case, but not in another.
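The two statistical criteria named in this abstract can be illustrated with a small numerical sketch. It is not taken from the paper: the helper names (`calibration_by_score`, `error_rates`, `make_group`), the toy groups, and the 0.5 decision threshold are all assumptions chosen for illustration. The example builds two groups whose scores are equally calibrated but whose false positive and false negative rates differ, which is the kind of conflict the abstract refers to.

```python
from collections import defaultdict

def calibration_by_score(scores, labels):
    """Map each distinct score to the observed fraction of positive labels."""
    totals = defaultdict(lambda: [0, 0])  # score -> [positives, count]
    for s, y in zip(scores, labels):
        totals[s][0] += y
        totals[s][1] += 1
    return {s: pos / n for s, (pos, n) in totals.items()}

def error_rates(scores, labels, threshold=0.5):
    """Return (false positive rate, false negative rate) at the threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp / labels.count(0), fn / labels.count(1)

def make_group(n_high, pos_high, n_low, pos_low):
    """Hypothetical group: n_high people scored 0.8, n_low people scored 0.2."""
    scores = [0.8] * n_high + [0.2] * n_low
    labels = ([1] * pos_high + [0] * (n_high - pos_high)
              + [1] * pos_low + [0] * (n_low - pos_low))
    return scores, labels

# Both groups are calibrated: 80% of the 0.8-scored and 20% of the
# 0.2-scored people are positive. Only the mix of scores differs.
group_a = make_group(10, 8, 10, 2)
group_b = make_group(5, 4, 15, 3)

for name, (scores, labels) in (("A", group_a), ("B", group_b)):
    fpr, fnr = error_rates(scores, labels)
    print(name, calibration_by_score(scores, labels),
          f"FPR={fpr:.3f} FNR={fnr:.3f}")
```

In this toy setting the error rates diverge only because the groups have different base rates; whether such a divergence is unfair is the moral question the paper addresses, and the sketch takes no position on it.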


It’s Just Not That Simple: An Empirical Study of the Accuracy-Explainability Trade-off in Machine Learning for Public Policy

Andrew Bell, Ian Solano-Kamaiko, Oded Nov and Julia Stoyanovich

To achieve high accuracy in machine learning (ML) systems, practitioners often use complex "black-box" models that are not easily understood by humans. The opacity of such models has resulted in public concerns about their use in high-stakes contexts and given rise to two conflicting arguments about the nature, and even the existence, of the accuracy-explainability trade-off. One side postulates that model accuracy and explainability are inversely related, leading practitioners to use black-box models when high accuracy is important. The other side of this argument holds that the accuracy-explainability trade-off is rarely observed in practice and consequently, that simpler interpretable models should always be preferred. Both sides of the argument operate under the assumption that some types of models, such as low-depth decision trees and linear regression, are more explainable, while others, such as neural networks and random forests, are inherently opaque. Our main contribution is an empirical quantification of the trade-off between model accuracy and explainability in two real-world policy contexts. We quantify explainability in terms of how well a model is understood by a human-in-the-loop (HITL) using a combination of objectively measurable criteria, such as a human's ability to anticipate a model's output or identify the most important feature of a model, and subjective measures, such as a human's perceived understanding of the model. Our key finding is that explainability is not directly related to whether a model is a black-box or interpretable and is more nuanced than previously thought. We find that black-box models may be as explainable to a HITL as interpretable models and identify two possible reasons: (1) that there are weaknesses in the intrinsic explainability of interpretable models and (2) that more information about a model may confuse users, leading them to perform worse on objectively measurable explainability tasks.
In summary, contrary to both positions in the literature, we neither observed a direct trade-off between accuracy and explainability nor found interpretable models to be superior in terms of explainability. It's just not that simple!


Justice in Misinformation Detection Systems: An Analysis of Algorithms, Stakeholders, and Potential Harms

Terrence Neumann, Maria De-Arteaga and Sina Fazelpour

Faced with the scale and surge of misinformation on social media, many platforms and fact-checking organizations have turned to algorithms for automating key parts of misinformation detection pipelines. While offering a promising solution to the challenge of scale, the ethical and societal risks associated with algorithmic misinformation detection are not well-understood. In this paper, we employ and extend the notion of informational justice to develop a framework for explicating issues of justice relating to representation, participation, distribution of benefits and burdens, and credibility in the misinformation detection pipeline. Drawing on the framework: (1) we show how injustices materialize for stakeholders across three algorithmic stages in the pipeline; (2) we suggest empirical measures for assessing these injustices; and (3) we identify potential sources of these harms. This framework should help researchers, policymakers, and practitioners reason about potential harms or risks associated with these algorithms and provide conceptual guidance for the design of algorithmic fairness audits in this domain.


Keep your friends close and your counterfactuals closer: Improved learning from closest rather than plausible counterfactual explanations in an abstract setting

Ulrike Kuhl, André Artelt and Barbara Hammer

Counterfactual explanations (CFEs) highlight what changes to a model’s input would have changed its prediction in a particular way. CFEs have gained considerable traction as a psychologically grounded solution for explainable artificial intelligence (XAI). Recent innovations introduce the notion of computational plausibility for automatically generated CFEs, enhancing their robustness by exclusively creating plausible explanations. However, the practical benefits of such a constraint for user experience and behavior are as yet unclear. In this study, we evaluate objective and subjective usability of computationally plausible CFEs in an iterative learning design targeting novice users. We rely on a novel, game-like experimental design, revolving around an abstract scenario. Our results show that novice users actually benefit less from receiving computationally plausible rather than closest CFEs that produce minimal changes leading to the desired outcome. Responses in a post-game survey reveal no differences in terms of subjective user experience between both groups. Following the view of psychological plausibility as comparative similarity, this may be explained by the fact that users in the closest condition experience their CFEs as more psychologically plausible than the computationally plausible counterpart. In sum, our work highlights a little-considered divergence of definitions of computational plausibility and psychological plausibility, critically confirming the need to incorporate human behavior, preferences and mental models already at the design stages of XAI approaches.
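To make the idea of a "closest" CFE concrete, here is a minimal sketch for the simplest possible case: a linear classifier, where the smallest Euclidean change that flips the prediction is a step straight toward the decision boundary. This is an illustrative assumption, not the model or the game scenario used in the study; the function names, the weights, and the `overshoot` parameter are hypothetical. A computationally plausible CFE would additionally be constrained to stay near the data distribution, which this sketch does not attempt.

```python
import math

def predict(x, w, b):
    """Linear classifier: True when w.x + b is non-negative."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b >= 0

def closest_counterfactual(x, w, b, overshoot=1e-6):
    """Smallest L2 change to x that flips predict(x, w, b).

    Assumes w is not the zero vector. The point is projected onto the
    decision boundary and pushed a tiny distance (overshoot) past it.
    """
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    sign = 1.0 if score >= 0 else -1.0
    distance = abs(score) / norm + overshoot
    # Move against the current side of the boundary, along w.
    return [xi - sign * distance * wi / norm for xi, wi in zip(x, w)]

# Hypothetical example: the input is rejected (score = 1 + 2 - 4 = -1).
x, w, b = [1.0, 1.0], [1.0, 2.0], -4.0
cf = closest_counterfactual(x, w, b)
change = [c - xi for c, xi in zip(cf, x)]
print(predict(x, w, b), predict(cf, w, b), change)
```

The resulting change moves each feature in proportion to its weight, which is why closest CFEs tend to be small and easy to compare with the original input.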


Language variation and algorithmic bias: understanding algorithmic bias in British English automatic speech recognition

Nina Markl

All language is characterised by variation which language users employ to construct complex social identities and express social meaning. Speech and language technologies (SLTs) (re)produce structural oppression in the form of algorithmic bias in similar ways to other machine learning technologies when they perform worse for some language communities. So far, less attention has been paid to this predictive bias in the context of automatic speech recognition, especially outside the US context. I present quantitative and qualitative results highlighting performance gaps of British English commercial automatic speech recognition systems for first and second language speakers of English, and speakers of different varieties of British English. Using knowledge and theories from sociolinguistics, I explore why speech and language technologies perform significantly worse for already marginalised populations, such as second-language speakers and speakers of stigmatised varieties of English in the British Isles. I also consider the allocative and representational harms SLTs can cause and argue that harms can arise even in systems which do not exhibit predictive bias, narrowly defined as differential performance between groups. This raises the question whether addressing or "fixing" this predictive bias is actually always equivalent to mitigating the harms algorithmic systems can cause, in particular to marginalised communities.


Learning Resource Allocation Policies from Observational Data with an Application to Homeless Services Delivery

Aida Rahmattalabi, Phebe Vayanos, Kathryn Dullerud and Eric Rice

We study the problem of learning, from observational data, fair and interpretable policies that effectively match heterogeneous individuals to scarce resources of different types. We model this problem as a multi-class multi-server queuing system where both individuals and resources arrive stochastically over time. Each individual, upon arrival, is assigned to a queue where they wait to be matched to a resource. The resources are assigned in a first come first served (FCFS) fashion according to an eligibility structure that encodes the resource types that serve each queue. We propose a methodology based on techniques in modern causal inference to construct the individual queues as well as learn the matching outcomes and provide a mixed-integer optimization (MIO) formulation to optimize the eligibility structure. The MIO problem maximizes policy outcome subject to wait time and fairness constraints. It is very flexible, allowing for additional linear domain constraints. We conduct extensive analyses using synthetic and real-world data. In particular, we evaluate our framework using data from the U.S. Homeless Management Information System (HMIS). We obtain wait times as low as an FCFS policy while improving the rate of exit from homelessness for underserved or vulnerable groups (7% higher for Black individuals and 15% higher for those below 17 years old) and overall.
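The queuing model described in this abstract can be sketched as a small event-driven simulation. This is a simplified illustration under stated assumptions, not the authors' implementation: the function `fcfs_match`, the queue and resource names, and the rule that a resource is discarded when no eligible person is waiting are all hypothetical choices made for brevity. The sketch shows only the FCFS matching step for a fixed eligibility structure, not the causal-inference or MIO components that learn and optimize that structure.

```python
from collections import deque

def fcfs_match(events, eligibility):
    """Match arriving resources to waiting individuals, first come first served.

    events: iterable of (time, kind, name, key), where kind is "person"
            (key = queue name) or "resource" (key = resource type).
    eligibility: dict mapping a resource type to the queues it may serve.
    Returns a list of (person, resource, wait_time) tuples.
    """
    queues = {}
    matches = []
    for time, kind, name, key in sorted(events):
        if kind == "person":
            queues.setdefault(key, deque()).append((time, name))
            continue
        # Eligible queues that currently have someone waiting.
        candidates = [q for q in eligibility.get(key, ()) if queues.get(q)]
        if not candidates:
            continue  # assumption: an unmatched resource is simply dropped
        # Serve the earliest arrival across all eligible queues.
        queue = min(candidates, key=lambda q: queues[q][0][0])
        arrived, person = queues[queue].popleft()
        matches.append((person, name, time - arrived))
    return matches

# Hypothetical eligibility structure and arrival sequence.
eligibility = {"housing": ["high"], "voucher": ["high", "low"]}
events = [
    (0, "person", "p1", "high"),
    (1, "person", "p2", "low"),
    (2, "resource", "r1", "housing"),
    (3, "resource", "r2", "voucher"),
    (4, "person", "p3", "high"),
    (5, "resource", "r3", "housing"),
]
matches = fcfs_match(events, eligibility)
print(matches)
```

Changing the `eligibility` dictionary changes who is served by which resource and how long each queue waits, which is the lever the abstract describes optimizing.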


Learning to Break Deep Perceptual Hashing: The Use Case NeuralHash

Lukas Struppek, Dominik Hintersdorf, Daniel Neider and Kristian Kersting

Apple recently revealed its deep perceptual hashing system NeuralHash to detect child sexual abuse material (CSAM) on user devices before files are uploaded to its iCloud service. Public criticism quickly arose regarding the protection of user privacy and the system's reliability. In this paper, we present the first comprehensive empirical analysis of deep perceptual hashing based on NeuralHash. Specifically, we show that current deep perceptual hashing may not be robust. An adversary can manipulate the hash values by applying slight changes in images, either induced by gradient-based approaches or simply by performing standard image transformations, forcing or preventing hash collisions. Such attacks permit malicious actors to easily exploit the detection system: from hiding abusive material to framing innocent users, everything is possible. Moreover, using the hash values, inferences can still be made about the data stored on user devices. In our view, based on our results, deep perceptual hashing in its current form is generally not ready for robust client-side scanning and should not be used from a privacy perspective.


Learning to Limit Data Collection via Scaling Laws: An Interpretation of GDPR's Data Minimization

Divya Shanmugam, Fernando Diaz, Samira Shabanian, Michele Finck and Asia Biega

Modern machine learning systems are increasingly characterized by extensive personal data collection, despite the diminishing returns and increasing societal costs of such practices. In response, the European Union's General Data Protection Regulation (GDPR) instated the legal obligation of data minimization, or the responsibility to process an adequate, relevant, and limited amount of personal data in relation to a processing purpose. However, the principle has seen limited adoption due to the lack of technical interpretation. In this work, we build on literature in machine learning and law to propose FIDO, a Framework for Inhibiting Data Overcollection. FIDO learns to limit data collection based on an interpretation of data minimization tied to system performance. Concretely, FIDO provides a data collection stopping criterion by iteratively updating an estimate of the performance curve, or relationship between dataset size and performance, as data is acquired. FIDO estimates the performance curve via a piecewise power law technique that models distinct phases of an algorithm's performance throughout data collection separately. Empirical experiments show that the framework produces accurate performance curves and data collection stopping criteria across datasets and feature acquisition algorithms. We further demonstrate that many other families of curves systematically overestimate the return on additional data. Results and analysis from our investigation offer deeper insights into the relevant considerations when designing a data minimization framework, including the impacts of active feature acquisition on individual users and the feasibility of user-specific data minimization. We conclude with practical recommendations for the implementation of data minimization.
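The notion of a stopping criterion derived from a performance curve can be sketched as follows. This is not FIDO itself: the abstract describes a piecewise power law, whereas this simplified sketch fits a single power law, error(n) = a * n^(-b), by least squares in log-log space. The function names, the `batch` and `min_gain` parameters, and the synthetic error values are assumptions made for illustration.

```python
import math

def fit_power_law(sizes, errors):
    """Fit error = a * n**(-b) by linear regression on (log n, log error)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

def predicted_gain(sizes, errors, batch):
    """Predicted error reduction from collecting `batch` more data points."""
    a, b = fit_power_law(sizes, errors)
    n = sizes[-1]
    return a * n ** (-b) - a * (n + batch) ** (-b)

def should_stop(sizes, errors, batch, min_gain):
    """Stop collecting when the predicted gain falls below min_gain."""
    return predicted_gain(sizes, errors, batch) < min_gain

# Synthetic observations that follow error = 2 * n**(-0.5) exactly.
sizes = [100, 200, 400, 800]
errors = [2 * n ** -0.5 for n in sizes]
print(fit_power_law(sizes, errors))
print(predicted_gain(sizes, errors, batch=100))
print(should_stop(sizes, errors, batch=100, min_gain=0.01))
```

In practice the curve would be refit each time new data arrives, and the choice of `min_gain` encodes how much performance improvement is considered enough to justify further collection.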


Limits and Possibilities of "Ethical AI" in Open Source: A Case Study of Deepfakes

David Widder, Dawn Nafus, Laura Dabbish and James Herbsleb

Open source software communities are a significant site of AI development, but “Ethical AI” discourses largely focus on the problems that arise in software produced by private companies. Design, policy and tooling interventions to encourage “Ethical AI” based on studies in private companies risk being ill-suited for an open source context, which operates under radically different organizational structures, cultural norms, and incentives. In this paper, we show that significant and understudied harms and possibilities originate from differing practices of transparency and accountability in the open source community. We conducted an interview-based ethnographic case study of an AI-enabled open source Deepfake project to understand how members of that community reason about the ethics of their work. We found that notions of the “Freedom 0” to use code without any restriction, alongside beliefs about technology neutrality and technological inevitability, were central to how community members framed their responsibilities, and the actions they believed were and were not available to them. We also show how commitments to radical transparency in open source allow great ethical scrutiny for harms wrought by implementation bugs, but allow harms through (mis)use to proliferate, requiring a deeper toolbox for disincentivizing harmful use. We discuss how an assumption of control over downstream uses is often implicit in discourses of “Ethical AI”, but outline alternative possibilities for action in cases such as open source where this assumption may not hold.


Limits of individual consent and models of distributed consent in online social networks

Juniper Lovato, Antoine Allard, Randall Harp, Jeremiah Onaolapo and Laurent Hébert-Dufresne

Personal data is not discrete in socially-networked digital environments. A user who consents to allow access to their profile can expose the personal data of their network connections to non-consented access. Therefore, the traditional consent model (informed and individual) is not appropriate in social networks where informed consent may not be possible for all users affected by data processing and where information is distributed across users. Here, we outline the adequacy of consent for data transactions. Informed by the shortcomings of individual consent, we introduce both a platform-specific model of "distributed consent" and a cross-platform model of a "consent passport." In both models, individuals and groups can coordinate by giving consent conditional on that of their network connections. We simulate the impact of these distributed consent models on the observability of social networks and find that low adoption would allow macroscopic subsets of networks to preserve their connectivity and privacy.


Locality of Technical Objects and the Role of Structural Interventions for Systemic Change

Efrén Cruz Cortés, Sarah Rajtmajer and Debashis Ghosh

Technical objects, like algorithms, exhibit causal capacities both in terms of their internal makeup and the position they occupy in relation to other objects and processes within a system. At the same time, systems encompassing technical objects interact with other systems themselves, producing a multi-scale structural composition. In the framework of fair artificial intelligence, typical causal inference interventions focus on the internal workings of technical objects (fairness constraints), and often forsake structural properties of the system. However, these interventions are often not sufficient to capture forms of discrimination and harm at a systemic level. To complement this approach we introduce the notion of locality and define structural interventions. We compare the effect of structural interventions on a system with that of local, structure-preserving interventions on technical objects. We focus on comparing interventions on generating mechanisms (representing social dynamics giving rise to discrimination) with constraining algorithms to satisfy some measure of fairness. This framework allows us to identify bias outside the algorithmic stage and propose joint interventions on social dynamics and algorithm design. We show how, for a model of financial lending, structural interventions can drive the system towards equality even when algorithmic interventions are unable to do so. This suggests that the responsibility of decision makers extends beyond ensuring that local fairness metrics are satisfied to an ecosystem that fosters equity for all.


Making the Unaccountable Internet: The Changing Meaning of Accounting in the Early ARPANET

A. Feder Cooper and Gili Vidan

Contemporary concerns over the governance of technological systems often run up against compelling narratives about technical (in)feasibility of designing mechanisms for accountability. While in recent FAccT literature these concerns have been deliberated predominantly in relation to machine learning, other significant shifts in technological innovation in the past have also presented circumstances in which computer scientists have needed to un-muddle what it means to design (un)accountable systems. Notably, one such compelling narrative can frequently be found in canonical histories of the Internet that highlight how its original designers’ commitment to the end-to-end architectural principle precluded other features from being implemented, resulting in the fast-growing, generative, but ultimately unaccountable network we have today. This paper offers a critique of such technologically essentialist notions of accountability and the characterization of the “unaccountable Internet” as an unintended consequence. We use historical methods and STS analytical frameworks to recover a rich debate about the nature of accountability in the early days of the ARPANET, the Internet’s predecessor. In particular, we explore the changing meaning of accounting and its relationship to accountability in a selected corpus of requests for comments (RFCs) concerning the ARPANET’s design from the 1970s and 80s. Accountability appeared in early writings about the ARPANET as a possible design goal that would provide a detailed account of resource use. But as other top-level goals emerged as the defining features of the network, accountability was repeatedly de-emphasized. We examine how accountability changed from a notion of “accounting for” resource usage to meaning “accountable to” various social actors.
We argue that it ultimately became clear that accounting (and accountability) was a fundamental question of policy concerning early Internet architecture—and that in the late 1980s the de facto policy of unaccountability began to threaten the network’s functionality and reliability. Later, the perception of the Internet as unaccountable in a much broader societal sense began to solidify, first as a celebrated feature and then as a form of design oversight--a bug. Recovering this history is not only important for understanding the processes that shaped the Internet, but also serves as a starting point for unpacking the complicated political choices that are involved in designing accountability mechanisms for other technological systems today.


Markedness in Visual Semantic AI

Robert Wolfe and Aylin Caliskan

We evaluate a state-of-the-art multimodal "visual semantic" model, OpenAI's CLIP ("Contrastive Language Image Pretraining"), for biases related to the marking of age, gender, and race or ethnicity. Given the option to label an image as "a photo of a person" or to select any of seven labels which mark the race or ethnicity of a person, CLIP chooses the "person" label 47.9% of the time for individuals perceived to be White according to the FairFace computer vision dataset, compared with 5.0% or less for people perceived to be Black, East Asian, Southeast Asian, Indian, or Latino or Hispanic. The model is also more likely to rank the unmarked "person" label higher than any gender label for people perceived to be Male (26.7% of the time) vs. people perceived to be Female (15.2% of the time). Perceived age affects whether an individual is marked by the model: people perceived to be Female are more likely to be marked based on gender at younger ages (under 20), but less likely to be marked with an age label, while those over the age of 40 are much more likely to be marked based on age than people perceived to be Male. We trace our results back to the CLIP embedding space by examining the self-similarity for each social group, where higher self-similarity denotes greater attention directed by CLIP to the shared characteristics (i.e., age, race, or gender) of the social group. As age increases, the self-similarity of representations of people perceived to be Female increases at a higher rate than for people perceived to be Male, with the disparity most pronounced at the "more than 70" age range. Six of the ten least self-similar social groups are people perceived to be White and Male, while all ten of the most self-similar social groups are people perceived to be under the age of 10 or over the age of 70.
Existing biases of self-similarity and markedness between Male and Female gender groups are further exacerbated when the groups compared are people perceived to be White and Male and people perceived to be Black and Female. CLIP is an English-language model trained on internet content gathered based on data from an American website (Wikipedia), and our results indicate that CLIP reflects the biases of the language and society which produced the data on which it was trained.


Marrying Fairness and Explainability in Supervised Learning

Przemyslaw Grabowicz, Nicholas Perello and Aarshee Mishra

Machine learning algorithms that aid human decision-making may inadvertently discriminate against certain protected groups. We formalize direct discrimination as a direct causal effect of the protected attributes on the decisions, and induced discrimination as a change in the causal influence of non-protected features associated with the protected attributes. The measurements of marginal direct effect (MDE) and SHapley Additive exPlanations (SHAP) reveal that state-of-the-art fair learning methods can induce discrimination via association or reverse discrimination in synthetic and real-world datasets. To inhibit discrimination in algorithmic systems, we propose to nullify the influence of the protected attribute on the output of the system, while preserving the influence of remaining features. We introduce and study post-processing methods achieving such objectives, finding that they yield relatively high model accuracy, prevent direct discrimination, and diminish various disparity measures, e.g., demographic disparity.
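
To make the closing reference to demographic disparity concrete, here is a minimal sketch of one common way to compute it: the gap in favourable-decision rates between groups. This is an illustrative assumption about the measure, not the authors' post-processing method, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def positive_rates(decisions, groups):
    """Return the share of positive (1) decisions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for d, g in zip(decisions, groups):
        totals[g] += 1
        positives[g] += d
    return {g: positives[g] / totals[g] for g in totals}


def demographic_disparity(decisions, groups):
    """Largest gap in positive-decision rates between any two groups."""
    rates = positive_rates(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Hypothetical toy data: 1 = favourable decision, two groups "a" and "b".
decisions = [1, 0, 1, 1, 0, 0]
groups = ["a", "a", "a", "b", "b", "b"]
print(demographic_disparity(decisions, groups))
```

A value of zero means every group receives favourable decisions at the same rate; larger values indicate a wider gap.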


Measuring Fairness of Rankings under Noisy Sensitive Information

Azin Ghazimatin, Matthäus Kleindessner, Chris Russell, Ziawasch Abedjan and Jacek Golebiowski

Metrics commonly used to assess group fairness in ranking require the knowledge of group membership labels (e.g., whether a job applicant is male or female). Obtaining accurate group membership labels, however, may be costly, operationally difficult, or even infeasible. Where it is not possible to obtain these labels, one common solution is to use proxy labels in their place, which are typically predicted by machine learning models. However, proxy labels are susceptible to systematic biases, and using them for fairness estimation can thus lead to unreliable assessments. We investigate the problem of measuring group fairness in ranking for a suite of divergence-based metrics in the presence of proxy labels. We show that under certain assumptions, fairness of a ranking can reliably be measured from the proxy labels. We formalize two assumptions and provide a theoretical analysis for either of them that shows how the true metric values can be derived from the estimates based on proxy labels. We prove that without such assumptions fairness assessment based on proxy labels is impossible. Through extensive experiments on both synthetic and real datasets, we demonstrate the effectiveness of our proposed methods for recovering reliable fairness assessments.


Measuring the Carbon Intensity of AI in Cloud Instances

Jesse Dodge, Taylor Prewitt, Remi Tachet des Combes, Erika Odmark, Roy Schwartz, Emma Strubell, Alexandra Sasha Luccioni, Noah A. Smith, Nicole DeCario and Will Buchanan

The advent of cloud computing has provided people around the world with unprecedented access to computational power and enabled rapid growth in technologies such as machine learning, the computational demands of which come with a high energy cost and a commensurate increase in carbon footprint. As a result, recent scholarship has called for better estimates of the impact of AI on greenhouse gas emissions. However, data scientists today do not have easy or reliable access to measurements of this information, which precludes consideration of how to reduce the costs (computational, electricity, environmental) associated with machine learning workloads. We argue that cloud providers presenting information about software carbon intensity to users is a fundamental stepping stone towards minimizing emissions. In this paper, we provide a framework for measuring software carbon intensity, and propose to measure operational carbon emissions by using location-based marginal emissions data per energy unit. We provide measurements of operational software carbon intensity for a set of modern models covering natural language processing (NLP) and computer vision applications, including four sizes of DenseNet models trained on MNIST, pretraining and finetuning of BERT-small, pretraining of a 6.1 billion parameter language model, and five sizes of Vision Transformer. We confirm previous results that the geographic region of the data center plays a significant role in the carbon intensity for a given cloud instance. We also present new results showing that the time of day has meaningful impact on operational software carbon intensity. We then evaluate a suite of approaches for reducing emissions in the cloud: using cloud instances in different geographic regions, using cloud instances at different times of day, and dynamically pausing cloud instances when the marginal carbon intensity is above a certain threshold. 
We find that choosing an appropriate region can have the largest impact, but emissions can be reduced by the other methods as well. Finally, we conclude with recommendations for how machine learning practitioners can use software carbon intensity information to reduce environmental impact.
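
The location-based accounting described in this abstract can be sketched, under simplifying assumptions, as energy used per time interval multiplied by the marginal emissions factor for that interval, summed over the run. The second function illustrates the pause-above-threshold idea. All names, units, and numbers here are hypothetical and are not taken from the paper's tooling or measurements.

```python
def operational_emissions(energy_kwh, marginal_g_per_kwh):
    """Sum energy (kWh) times marginal emissions (gCO2eq/kWh) per interval."""
    if len(energy_kwh) != len(marginal_g_per_kwh):
        raise ValueError("both series must cover the same intervals")
    return sum(e * m for e, m in zip(energy_kwh, marginal_g_per_kwh))


def emissions_with_pausing(step_energy_kwh, marginal_g_per_kwh, threshold):
    """Run one job step per interval, pausing when intensity exceeds threshold.

    Returns (total grams emitted, intervals elapsed until the job finished).
    """
    total, done = 0.0, 0
    if not step_energy_kwh:
        return total, 0
    for elapsed, intensity in enumerate(marginal_g_per_kwh, start=1):
        if intensity <= threshold:
            total += step_energy_kwh[done] * intensity
            done += 1
            if done == len(step_energy_kwh):
                return total, elapsed
    raise ValueError("job did not finish within the intensity series")


# Hypothetical job: two steps of 1 kWh each; grid intensity varies by interval.
intensity = [500, 900, 300, 400]
print(operational_emissions([1.0, 1.0], intensity[:2]))       # run immediately
print(emissions_with_pausing([1.0, 1.0], intensity, 600))     # pause when dirty
```

In this toy setting pausing lowers emissions at the cost of a longer wall-clock time, which is the trade-off such a policy has to weigh.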


Mind the Gap: Autonomous Systems, the Responsibility Gap, and Moral Entanglement

Trystan Goetze

When a computer system causes harm, who is responsible? This question has renewed significance given the proliferation of autonomous systems enabled by modern artificial intelligence techniques. The view I develop explains why computing professionals are ethically required to take responsibility for the systems they design, even when they are not blameworthy for the harms these systems may cause. After an introductory section, §2 introduces the topic of responsibility for autonomous systems by way of an example: lethal autonomous weapons systems (LAWS). The basic problem is that it is difficult to determine who is to blame when LAWS cause unethical outcomes, such as civilian deaths. §3 gets more precise about the sense of responsibility at issue and the nature of the responsibility gap. After distinguishing ten different senses of “responsibility”, I narrow the focus to three senses that together form a classical notion of moral responsibility: causal responsibility, attributability, and accountability. The responsibility gap is then analyzed as arising from three factors: (i) the causal distance between the designers of autonomous systems and those systems’ behaviour, (ii) the absence of direct human control over how these systems behave, and (iii) the unpredictability of how autonomous systems will behave when deployed in new situations. §4 discusses previous approaches to bridging the responsibility gap: (i) rethinking our concepts of moral responsibility, (ii) blaming the autonomous systems themselves, (iii) advancing projects in machine ethics, and (iv) enforcing professional standards and regulations in computing. I criticize each in turn. But in general, they fail to address the conceptual problem at issue in the responsibility gap, offering workarounds instead. As such, each carries some degree of arbitrariness. 
§5 presents a philosophical solution to the responsibility gap, drawing on a recent account of vicarious moral responsibility, that is, instances where an agent is responsible for the behaviour of someone or something else. On this view, when one’s agency or identity places one in a morally significant relationship with someone else, one becomes “morally entangled” with them. For example, parents are morally entangled with their children because of their identities qua parents. This special relationship generates moral reasons that obligate a parent to respond to their children’s unethical behaviour in a distinctive way, which can include apologizing or seeking to make things right. The same holds for the creators of autonomous systems, which I argue in §6 by applying my account to the example of LAWS. §7 concludes.


Minimax Demographic Group Fairness in Federated Learning

Afroditi Papadaki, Natalia Martinez, Martin Bertran, Guillermo Sapiro and Miguel Rodrigues

Federated learning is an increasingly popular paradigm that enables a large number of entities to collaboratively learn better models. In this work, we study minimax group fairness in federated learning scenarios where different participating entities may only have access to a subset of the population groups during the training phase. We formally analyze how our proposed group fairness objective differs from existing federated learning fairness criteria that impose similar performance across participants instead of demographic groups. We provide an optimization algorithm -- FedMinMax -- for solving the proposed problem that provably enjoys the performance guarantees of centralized learning algorithms. We experimentally compare the proposed approach against other state-of-the-art methods in terms of group fairness in various federated learning setups, showing that our approach exhibits competitive or superior performance.
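
The difference between a minimax group fairness criterion and an average-performance criterion can be illustrated with a small selection problem. The sketch below is not FedMinMax and involves no federated training; it only assumes that per-group risks are already known for two hypothetical candidate models and shows that the two criteria can prefer different models.

```python
def worst_group_risk(group_risks):
    """Risk of the worst-off demographic group for one model."""
    return max(group_risks.values())


def average_risk(group_risks):
    """Unweighted mean risk across groups (assumes equal group weights)."""
    return sum(group_risks.values()) / len(group_risks)


def select(candidates, criterion):
    """Pick the candidate name that minimizes the given criterion."""
    return min(candidates, key=lambda name: criterion(candidates[name]))


# Hypothetical per-group risks for two candidate models.
candidates = {
    "model_A": {"group_1": 0.05, "group_2": 0.35},
    "model_B": {"group_1": 0.22, "group_2": 0.25},
}
print(select(candidates, average_risk))      # lowest mean risk
print(select(candidates, worst_group_risk))  # lowest worst-group risk
```

Here the average criterion favours the model that serves one group very well, while the minimax criterion favours the model whose worst-served group fares best.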


Model Explanations with Differential Privacy

Neel Patel, Reza Shokri and Yair Zick

Black-box machine learning models are used in critical decision-making domains, giving rise to several calls for more algorithmic transparency. The drawback is that model explanations can leak information about the data used to generate them, thus undermining data privacy. To address this issue, we propose differentially private algorithms to construct feature-based model explanations. We design an adaptive differentially private gradient descent algorithm that finds the minimal privacy budget required to produce accurate explanations. It reduces the overall privacy loss on explanation data by adaptively reusing past differentially private explanations. It also amplifies the privacy guarantees with respect to the training data. Finally, we evaluate the implications of differentially private models and our privacy mechanisms on the quality of model explanations.


Model Multiplicity: Opportunities, Concerns, and Solutions

Emily Black, Manish Raghavan and Solon Barocas

Recent scholarship has brought attention to the fact that there often exist multiple models for a given prediction task with equal accuracy that differ in their individual-level predictions or aggregate properties. This phenomenon, which we call model multiplicity, leads to exciting opportunities through the flexibility it introduces into the model selection process. By demonstrating that there are many different ways of making equally accurate predictions, multiplicity gives practitioners the freedom to prioritize other values in their model selection process without having to abandon their commitment to maximizing accuracy. For example, it may often be possible to satisfy fairness properties on machine learning models at no cost to accuracy, as researchers have shown in increasingly many contexts. However, multiplicity also brings to light a concerning truth: model selection on the basis of accuracy alone (the default procedure in many deployment scenarios) fails to consider what might be meaningful differences between equally accurate models. This means that such a selection process effectively becomes an arbitrary choice. This obfuscation of the differences between models on axes of behavior other than accuracy (such as fairness, robustness, and interpretability) may lead to unnecessary trade-offs, or could even be leveraged to mask discriminatory behavior. Beyond this, the reality that multiple models exist with different outcomes for the same individuals leads to a crisis in justifiability of model decisions: why should an individual be subject to an adverse model outcome if there exists an equally accurate model that treats them more favorably? In this work we address the question, how do we take advantage of the flexibility model multiplicity provides, while addressing the concerns with justifiability that it may raise?


Models for Classifying AI Systems: the Switch, the Ladder, and the Matrix

Jakob Mökander, Margi Sheth, David Watson and Luciano Floridi

Organisations that design and deploy systems based on artificial intelligence (AI) increasingly commit themselves to high-level, ethical principles. However, there still exists a gap between principles and practices in AI ethics. A major obstacle to implementing AI governance frameworks effectively is the lack of a well-defined material scope. The question of which systems and processes these additional layers of governance ought to apply to remains unanswered. Of course, there is no single answer. Different AI systems pose different ethical challenges. Moreover, there is a three-way trade-off between how precise a classification is, how easy it is to apply it, and how generalisable it is. Nevertheless, pragmatic problem-solving demands that things should be sorted so that their grouping will promote successful actions for some specific end. In this article, we review and compare previous attempts to classify AI systems for the practical purpose of implementing AI governance frameworks. We find that most attempts to classify AI systems use one of three mental models: the Switch, i.e., a binary approach according to which systems either are or are not considered AI systems depending on their characteristics; the Ladder, i.e., a risk-based approach that classifies systems according to the ethical risks they pose; and the Matrix, i.e., a multi-dimensional classification of systems that takes various aspects into account, such as context, data input, or decision-model. Each of these models for classifying AI systems comes with its own set of strengths and weaknesses. By conceptualising different ways of classifying AI systems into simple mental models, we hope that this article may help organisations that need to demarcate a material scope for their AI governance frameworks. The ultimate goal is to provide organisations that design, deploy, or regulate AI systems with the conceptual tools needed to implement AI governance frameworks in practice.


Models for understanding and quantifying feedback in societal systems

Lydia Reader, Pegah Nohkiz, Cathleen Power, Neal Patwari, Suresh Venkatasubramanian and Sorelle Friedler

When it comes to long-term fairness in decision-making settings, many studies have focused on closed systems with a specific appointed decision-maker and certain engagement rules in place. However, if the objective is to achieve equity in a broader societal system, studying the system in isolation is insufficient. In a societal system, neither a singular decision maker nor defined agent behavior rules exist. Additionally, analysis of societal systems can be complicated by the presence of feedback, in which historical and current inequities influence future inequity. In this paper, we present a model to quantify feedback in social systems so that the long-term effects of a policy or decision process may be investigated, even when the feedback mechanisms are not individually characterized. We explore the dynamics of real social systems and find that many examples of feedback are qualitatively similar in their temporal characteristics. Using a key idea in feedback systems theory, namely proportional-integral-derivative (PID) feedback and control, we propose a model to quantify three types of feedback. We illustrate how different components of the PID capture analogous aspects of societal dynamics such as the persistence of current inequity, the cumulative effects of long-term inequity, and the response to the speed at which society is changing. Our model does not attempt to describe underlying systems or capture individual actions. It is a system-based approach to studying inequity in feedback loops, and as a result unlocks a direction for studying social systems that would otherwise be almost impossible to model and can only be observed. Our framework helps elucidate the ability of fair policies to produce and sustain equity in the long term.
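
For readers unfamiliar with PID feedback, the sketch below shows the standard discrete-time decomposition of a signal into proportional, integral, and derivative terms. Mapping these onto current, cumulative, and rate-of-change inequity follows the abstract's analogy; the gains, the input series, and the function name are hypothetical, and this is not the authors' fitted model.

```python
def pid_terms(error, kp, ki, kd, dt=1.0):
    """Return a (proportional, integral, derivative) tuple per time step.

    error: sequence of inequity measurements, one per time step.
    kp, ki, kd: gains weighting each feedback component.
    """
    terms = []
    integral = 0.0
    previous = error[0] if error else 0.0  # derivative is zero at the first step
    for e in error:
        integral += e * dt                 # accumulated (long-term) inequity
        derivative = (e - previous) / dt   # speed at which inequity is changing
        terms.append((kp * e, ki * integral, kd * derivative))
        previous = e
    return terms


# Hypothetical inequity series and gains.
for p, i, d in pid_terms([1.0, 2.0, 2.0], kp=1.0, ki=0.5, kd=2.0):
    print(p, i, d, p + i + d)  # last value is the combined feedback signal
```

The proportional term tracks present inequity, the integral term grows as long as inequity persists, and the derivative term reacts only while inequity is changing.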


Multi Stage Screening: Enforcing Fairness and Maximizing Efficiency in a Pre-Existing Pipeline

Kevin Stangl, Avrim Blum and Ali Vakilian

Consider an actor making selection decisions (e.g. hiring) using a series of classifiers, which we term a sequential screening process. The early stages (e.g. resume screen, coding screen, phone interview) filter out some of the data-points and in the final stage, an expensive but accurate test (e.g. a full interview) is applied to those individuals that make it to the final stage, which determines true positives with high accuracy. Since the final stage is expensive, if there are multiple groups with different fractions of positives in them at the penultimate stage (even if the gap is slight), then the firm may naturally only choose to apply the final (interview) stage to the group of highest precision, which would be clearly unfair to the other groups. Given this concern, we consider the goal of requiring Equality of Opportunity (qualified members of each group have the same chance of reaching the interview stage) via modification of the probabilities of promotion through the screening process at each stage based on performance at the previous stage. We exhibit algorithms for satisfying Equality of Opportunity over the selection process and maximizing precision (the fraction of interviews that yield qualified candidates) as well as linear combinations of precision and recall (recall determines the number of applicants needed per hire) at the end of the final stage. We also present examples showing that the solution space is non-convex, which motivate our combinatorial exact and approximation algorithms (the exact algorithm is fixed-parameter tractable and the approximation algorithm is an FPTAS) for maximizing the linear combination of precision and recall. Finally, we discuss the ‘price of fairness’ in several realistic models, including models where the decision-maker is or is not allowed to use group membership in its decision process.


Multi-disciplinary fairness considerations in machine learning for clinical trials

Isabel Chien, Nina Deliu, Richard Turner, Adrian Weller, Sofia Villar and Niki Kilbertus

While interest in the application of machine learning to improve healthcare has grown tremendously in recent years, a number of barriers prevent deployment in medical practice. A notable concern is the potential to exacerbate entrenched biases and existing health disparities in society. The area of fairness in machine learning seeks to address these issues of equity; however, appropriate approaches are context-dependent, necessitating domain-specific consideration. We focus on clinical trials, i.e., research studies conducted on humans to evaluate medical treatments. Clinical trials are a relatively under-explored application in machine learning for healthcare, in part due to complex ethical, legal, and regulatory requirements and high costs. Our aim is to provide a multi-disciplinary assessment of how fairness for machine learning fits into the context of clinical trials research and practice. We start by reviewing the current ethical considerations and guidelines for clinical trials and examine their relationship with common definitions of fairness in machine learning. We examine potential sources of unfairness in clinical trials, providing concrete examples, and discuss the role machine learning might play in either mitigating potential biases or exacerbating them when applied without care. Particular focus is given to adaptive clinical trials, which may employ machine learning. Finally, we highlight concepts that require further investigation and development, and emphasize new approaches to fairness that may be relevant to the design of clinical trials.


Multiaccurate Proxies for Downstream Fairness

Emily Diana, Michael Kearns, Aaron Roth, Wesley Gill, Krishnaram Kenthapadi and Saeed Sharifi-Malvajerdi

We study the problem of training a model that must obey demographic fairness conditions when the sensitive features are not available at training time -- in other words, how can we train a model to be fair by race when we don't have data about race? We adopt a fairness pipeline perspective, in which an "upstream" learner that does have access to the sensitive features will learn a proxy model for these features from the other attributes. The goal of the proxy is to allow a general "downstream" learner -- with minimal assumptions on their prediction task -- to be able to use the proxy to train a model that is fair with respect to the true sensitive features. We show that obeying multiaccuracy constraints with respect to the downstream model class suffices for this purpose, provide sample- and oracle-efficient algorithms and generalization bounds for learning such proxies, and conduct an experimental evaluation. In general, multiaccuracy can be much easier to satisfy than classification accuracy, and can be satisfied even when the sensitive features are hard to predict.


Net benefit, calibration, threshold selection, and training objectives for algorithmic fairness in healthcare

Stephen Pfohl, Yizhe Xu, Agata Foryciarz, Nikolaos Ignatiadis, Julian Genkins and Nigam Shah

A growing body of work uses the paradigm of algorithmic fairness to frame the development of techniques to anticipate and proactively mitigate the introduction or exacerbation of health inequities that may follow from the use of model-guided decision-making. We evaluate the interplay between measures of model performance, fairness, and the expected utility of decision-making to offer practical recommendations for the operationalization of algorithmic fairness principles for the development and evaluation of predictive models in healthcare. We conduct an empirical case-study via development of models to estimate the ten-year risk of atherosclerotic cardiovascular disease to inform statin initiation in accordance with clinical practice guidelines. We demonstrate that approaches that incorporate fairness considerations into the model training objective typically do not improve model performance or confer greater net benefit for any of the studied patient populations compared to the use of standard learning paradigms followed by threshold selection concordant with patient preferences, evidence of intervention effectiveness, and model calibration. These results hold when the measured outcomes are not subject to differential measurement error across patient populations and threshold selection is unconstrained, regardless of whether differences in model performance metrics, such as in true and false positive error rates, are present. In closing, we argue for focusing model development efforts on developing calibrated models that predict outcomes well for all patient populations while emphasizing that such efforts are complementary to transparent reporting, participatory design, and reasoning about the impact of model-informed interventions in context.


NeuroView-RNN: It's About Time

Cj Barberan, Sina Alemmohammad, Naiming Liu, Randall Balestriero and Richard Baraniuk

Recurrent Neural Networks (RNNs) are important tools for processing sequential data such as time-series or video. Interpretability is defined as the ability to be understood by a person and is different from explainability, which is the ability to be explained in a mathematical formulation. A key interpretability issue with RNNs is that it is not clear how each hidden state per time step contributes to the decision-making process in a quantitative manner. We propose NeuroView-RNN as a family of new RNN architectures that explains how all the time steps are used for the decision-making process. Each member of the family is derived from a standard RNN architecture by concatenation of the hidden steps into a global linear classifier. The global linear classifier has all the hidden states as the input, so the weights of the classifier have a linear mapping to the hidden states. Hence, from the weights, NeuroView-RNN can quantify how important each time step is to a particular decision. As a bonus, NeuroView-RNN also offers higher accuracy in many cases compared to the RNNs and their variants. We showcase the benefits of NeuroView-RNN by evaluating on a multitude of diverse time-series datasets.


News from Generative Artificial Intelligence is Believed Less

Chiara Longoni, Andrey Fradkin, Luca Cian and Gordon Pennycook

Artificial Intelligence (AI) can generate text virtually indistinguishable from text written by humans. A key question, then, is whether people believe news generated by AI as much as news generated by humans. AI is viewed as lacking human motives and emotions, suggesting that people might view news written by AI as more accurate. By contrast, two pre-registered experiments on representative U.S. samples (N=4,034) showed that people rated news written by AI as less accurate than news written by humans. People were more likely to incorrectly rate news written by AI (vs. a human) as inaccurate when it was actually true, and more likely to correctly rate it as inaccurate when it was indeed false. Our findings are important given the increasing adoption of AI in news generation, and the associated ethical and governance pressures to disclose its use and address standards of transparency and accountability.


Normative Logics of Algorithmic Accountability

Joseph Donia

The relevance of algorithms in contemporary life is most often appreciated when they ‘fail’—either because they did not perform as expected, or because they led to outcomes that were later determined to be unacceptable. As a result, academic, policy, and public discourse has increasingly emphasized accountability as a desirable, if not elusive, feature of system design, and component of effective governance. Accountability, however, is a versatile concept that has been operationalized in a number of ways across different use-contexts, policy settings, and research disciplines. While accountability is often framed as a normative good, it is unclear exactly what kind of normative work it is expected to do, and how it is expected to do it. Informed by perspectives from critical data studies and science and technology studies, this article introduces five normative logics underpinning discussions of algorithmic accountability that appear in the academic research literature: (1) accountability as verification, (2) accountability as representation, (3) accountability as social licence, (4) accountability as fiduciary duty, and (5) accountability as legal compliance. These normative logics, and the resulting rules, codes, and practices that constitute an emerging set of algorithmic accountability regimes, are especially discussed in terms of the presumed agency of actors involved. The article suggests that implicit assumptions characterizing each of ‘algorithms’ and ‘accountability’ are highly significant for each other, and that more explicit acknowledgement of this fact can lead to improved understanding of the diverse knowledge claims and practical goals associated with different logics of algorithmic accountability, and relatedly, the agency of different actors to pursue it in its different forms.


On the Existence of Simpler Machine Learning Models

Lesia Semenova, Cynthia Rudin and Ronald Parr

It is almost always easier to find an accurate-but-complex model than an accurate-yet-simple model. Finding optimal sparse, accurate models of various forms (linear models with integer coefficients, decision sets, rule lists, decision trees) is generally NP-hard, sometimes with no polynomial-time approximation. We often do not know whether the search for a simpler model will be worthwhile, and thus we do not go to the trouble of searching for one. In this work, we ask an important practical question: can accurate-yet-simple models be proven to exist, or shown likely to exist, before explicitly searching for them? We hypothesize that there is an important reason that simple-yet-accurate models often do exist. This hypothesis is that the size of the Rashomon set is often large, where the Rashomon set is the set of almost-equally-accurate models from a function class. If the Rashomon set is large, it contains numerous accurate models, and perhaps at least one of them is the simple model we desire. In this work, we formally present the Rashomon ratio as a new gauge of simplicity for a learning problem, depending on a function class and a dataset. The Rashomon ratio is the ratio of the volume of the set of accurate models to the volume of the hypothesis space, and it is different from standard complexity measures from statistical learning theory. Insight from studying the Rashomon ratio provides an easy way to check whether a simpler model might exist for a problem before finding it, namely whether several different machine learning methods achieve similar performance on the data. In that sense, the Rashomon ratio is a powerful tool for understanding why and when an accurate-yet-simple model might exist.
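
As a rough illustration of the Rashomon ratio, the sketch below discretizes a hypothesis space into a finite grid and counts the fraction of models whose empirical loss is within a tolerance of the best loss. The toy data, the threshold-classifier function class, the grid, the tolerance parameter `theta`, and the function name `rashomon_ratio` are all assumptions for illustration; counting grid points only approximates a volume ratio.

```python
import numpy as np

def rashomon_ratio(losses, theta):
    """Fraction of models whose loss is within theta of the best loss.

    losses: empirical loss of every model in a finite (discretized)
    hypothesis space; theta: tolerance defining "almost equally accurate".
    """
    losses = np.asarray(losses, dtype=float)
    return float(np.mean(losses <= losses.min() + theta))

# Hypothetical toy problem: 1-D threshold classifiers "predict 1 if x > t".
x = np.array([0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])
thresholds = np.linspace(0.0, 1.0, 101)  # discretized hypothesis space
losses = np.array([np.mean((x > t).astype(int) != y) for t in thresholds])

ratio_tight = rashomon_ratio(losses, theta=0.0)    # only the best models
ratio_loose = rashomon_ratio(losses, theta=0.125)  # allow one extra error in eight
```

A larger ratio means that a larger share of the function class is nearly as accurate as the best model, which is the situation in which a simple model is more likely to be among them.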


On the Fairness of Machine-Assisted Human Decisions

Bryce McLaughlin, Jann Spiess and Talia Gillis

When machine-learning algorithms are deployed in high-stakes decisions, we want to ensure that their deployment leads to fair and equitable outcomes. This concern has motivated a fast-growing literature that focuses on diagnosing and addressing disparities in machine predictions. However, many machine predictions are deployed to assist in decisions where a human decision-maker retains the ultimate decision authority. In this article, we therefore consider how properties of machine predictions affect the resulting human decisions. We show in a formal model that the inclusion of a biased human decision-maker can revert common relationships between the structure of the algorithm and the qualities of resulting decisions. Specifically, we document that excluding information about protected groups from the prediction may fail to reduce, and may even increase, ultimate disparities. While our concrete results rely on specific assumptions about the data, algorithm, and decision-maker, they show more broadly that any study of critical properties of complex decision systems, such as the fairness of machine-assisted human decisions, should go beyond focusing on the underlying algorithmic predictions in isolation.


On the Power of Randomization in Fair Classification and Representation

Sushant Agarwal and Amit Deshpande

Fair classification and fair representation learning are two important problems in fair machine learning. Fair classification asks for a classifier that maximizes accuracy subject to fairness constraints on a given data distribution. Fair representation maps a given data distribution to a new distribution in the representation space such that all classifiers on the representation satisfy fairness, thus reducing our goal to finding a classifier of maximum accuracy on the representation. In this paper, we examine the power of randomization in both these problems to minimize the loss of accuracy that results when we impose fairness constraints. Previous work on fair classification has characterized the optimal deterministic fair classifiers on a given data distribution that maximize accuracy subject to fairness constraints, e.g., Demographic Parity (DP), Equal Opportunity (EO). We extend this to characterize the optimal randomized fair classifiers, and show that they can surpass their deterministic counterparts in accuracy. We also show how the optimal randomized fair classifier that we characterize can be obtained as a solution to a convex optimization problem. Recent work has provided techniques to construct randomized fair representations for a given data distribution such that any classifier on this representation satisfies Demographic Parity. However, the classifiers on these fair representations come with either no or only weak accuracy guarantees when compared to the optimal fair classifier on the original data distribution. We improve on these works and construct DP-fair (and EO-fair) representations that have provably optimal accuracy and suffer no accuracy loss compared to the optimal DP-fair (and EO-fair) classifier on the original data distribution.
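
A small worked example can show how a randomized classifier may beat every deterministic one under Demographic Parity. The sketch below is not the paper's construction: the three-cell toy distribution is invented, and a brute-force search over a grid of acceptance probabilities stands in for the convex optimization mentioned above. All names (`CELLS`, `best_dp_fair`, and so on) are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy distribution with two equally likely groups.
# Each cell: (group, population mass of the cell, P(y = 1 | cell)).
CELLS = [
    ("A", Fraction(1, 4), Fraction(1)),      # group A, feature value a1
    ("A", Fraction(1, 4), Fraction(0)),      # group A, feature value a2
    ("B", Fraction(1, 2), Fraction(9, 10)),  # group B, single feature value
]

def acceptance_rate(q, group):
    """P(predict 1 | group) when cell i is accepted with probability q[i]."""
    mass = sum(p for g, p, _ in CELLS if g == group)
    return sum(p * qi for (g, p, _), qi in zip(CELLS, q) if g == group) / mass

def accuracy(q):
    """Expected accuracy of the (possibly randomized) classifier q."""
    return sum(p * (qi * py + (1 - qi) * (1 - py))
               for (_, p, py), qi in zip(CELLS, q))

def best_dp_fair(levels):
    """Brute-force the most accurate classifier with equal acceptance rates."""
    best = None
    for q in product(levels, repeat=len(CELLS)):
        if acceptance_rate(q, "A") == acceptance_rate(q, "B"):
            acc = accuracy(q)
            if best is None or acc > best[0]:
                best = (acc, q)
    return best

# Deterministic classifiers accept each cell with probability 0 or 1.
deterministic = best_dp_fair([Fraction(0), Fraction(1)])
# Randomized classifiers may use any probability on a grid of step 1/20.
randomized = best_dp_fair([Fraction(k, 20) for k in range(21)])
```

In this toy case the only DP-fair deterministic options are to accept everyone or no one, whereas a randomized classifier can accept group B half of the time and so match the acceptance rate of the accurate classifier on group A.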


Measuring Representational Harms in Image Captioning

Angelina Wang, Solon Barocas, Kristen Laird and Hanna Wallach

Image captioning is a complex task that involves incorporating both computer vision and natural language processing technologies. Measuring “bias” in this domain is correspondingly complex due to the variety of ways it can manifest. In this work, we describe our process for concretely operationalizing a set of potential harms, carefully laying out our assumptions and limitations. Thus, rather than reporting one single, aggregate measure of “bias,” we present a diverse set of analyses. By empirically undertaking the task of measuring the harms in an automated image captioning model on two popular datasets, our work demonstrates the need for grounding such analyses in concrete harms, and the challenges that future work will need to contend with in such measurement endeavors.


People are not coins: Morally distinct types of predictions necessitate different fairness constraints

Eleonora Viganó, Corinna Hertweck, Christoph Heitz and Michele Loi

A recent paper (Hedden 2021) has argued that most of the group fairness constraints discussed in the machine learning literature are not necessary conditions for the fairness of predictions, and hence that there are no genuine fairness metrics. This is proven by discussing a special case of a fair prediction. In our paper, we show that Hedden’s argument does not hold for the most common kind of predictions used in data science, which are about people and based on data from similar people; we call these “human-group-based practices”. We argue that there is a morally salient distinction between human-group-based practices and those that are based on data of only one person, which we call “human-individual-based practices”. Thus, what may be a necessary condition for the fairness of human-group-based practices may not be a necessary condition for the fairness of human-individual-based practices, on which Hedden’s argument is based. Accordingly, the group fairness metrics discussed in the machine learning literature may still be relevant for most applications of prediction-based decision making.


Post-Hoc Explanations Fail to Achieve their Purpose in Adversarial Contexts

Sebastian Bordt, Michele Finck, Eric Raidl and Ulrike von Luxburg

Existing and planned legislation stipulates various obligations to provide information about machine learning algorithms and their functioning, often interpreted as obligations to “explain”. Many researchers suggest using post-hoc explanation algorithms for this purpose. In this paper, we combine legal, philosophical and technical arguments to show that post-hoc explanation algorithms are unsuitable to achieve the law's objectives. Indeed, most situations where explanations are requested are adversarial, meaning that the explanation provider and receiver have opposing interests and incentives, so that the provider might manipulate the explanation for her own ends. We show that this fundamental conflict cannot be resolved because of the high degree of ambiguity of post-hoc explanations in realistic application scenarios. As a consequence, post-hoc explanation algorithms are unsuitable to achieve the transparency objectives inherent to the legal norms. Instead, there is a need to more explicitly discuss the objectives underlying “explainability” obligations as these can often be better achieved through other mechanisms. There is an urgent need for a more open and honest discussion regarding the potential and limitations of post-hoc explanations in adversarial contexts, in particular in light of the current negotiations about the European Union's draft Artificial Intelligence Act.


Predictability and Surprise in Large Generative Models

Deep Ganguli, Danny Hernandez, Liane Lovitt, Amanda Askell, Yuntao Bai, Anna Chen, Tom Conerly, Nova Dassarma, Dawn Drain, Nelson Elhage, Sheer El Showk, Stanislav Fort, Zac Hatfield-Dodds, Tom Henighan, Scott Johnston, Andy Jones, Nicholas Joseph, Jackson Kernian, Shauna Kravec, Ben Mann, Neel Nanda, Kamal Ndousse, Catherine Olsson, Daniela Amodei, Tom Brown, Jared Kaplan, Sam McCandlish, Christopher Olah, Dario Amodei and Jack Clark

Large-scale pre-training has recently emerged as a technique for creating highly capable, general-purpose, generative models such as GPT-3, Megatron-Turing NLG, Gopher, and many others. In this paper, we highlight a counterintuitive property of such models and discuss the policy implications of this property. Namely, large generative models have an unusual combination of highly predictable general performance (as embodied in their "scaling laws"), and highly unpredictable specific capabilities, inputs, and outputs. We believe that the former drives rapid development of such models while the latter makes it difficult to anticipate their consequences. We go through examples of how this combination can lead to socially harmful behavior with examples from the literature and real world observations, and we also add two novel experiments to illustrate our point about harms from unpredictability. Furthermore, we discuss how large-scale generative model deployment has unfolded so far, and analyze what this tells us about the various motivations and challenges faced by model developers. We conclude with a list of possible interventions the AI community may take to increase the chance of these models having a beneficial impact. We intend this paper to be useful to policymakers who want to understand and regulate AI systems, technologists who care about the potential policy impact of their work, and academics who want to analyze, critique, and potentially develop large-scale models.


Prediction as Extraction of Discretion

Sun-ha Hong

I argue that data-driven predictions are primarily instruments for systematic extraction of discretionary power – the practical capacity to make everyday decisions and define one’s situation. This approach puts the focus squarely on prediction as social reality, where rhetoric and expectation work in tandem with technical and epistemic functions. I argue that this unequal distribution of discretionary power is normal and fundamental to the technology, rather than isolated cases of bias or error. Synthesising critical observations across anthropology, history of technology and critical data studies, the paper examines the interplay of prediction and discretion in two contemporary domains: (1) criminality and (2) productivity. I argue that making human behaviour more predictable for the client of prediction (the manager, the corporation, the police officer) often means making life and work more unpredictable for the target of prediction (the employee, the applicant, the citizen).


Stop the Spread: A Contextual Integrity Perspective on the Appropriateness of COVID-19 Vaccination Certificates

Shikun Zhang, Yan Shvartzshnaider, Yuanyuan Feng, Helen Nissenbaum, and Norman Sadeh

We present an empirical study exploring how privacy influences the acceptance of vaccination certificate (VC) deployments across different realistic usage scenarios. The study employed the privacy framework of Contextual Integrity, which has been shown to be particularly effective at capturing people's privacy expectations across different contexts. We use a vignette methodology, where we selectively manipulate salient contextual parameters understood to potentially impact people's attitudes towards VCs. We surveyed 890 participants from a demographically-stratified sample of the US population to gauge the acceptance and overall attitudes towards possible VC deployments to enforce vaccination mandates and the different information flows VCs might entail. Analysis of the results collected as part of this study is used to derive general normative observations about different possible VC practices and to provide guidance for the possible VC deployments in different contexts.


Promoting Ethical Awareness in Communication Analysis: Investigating Potentials and Limits of Visual Analytics for Intelligence Applications

Maximilian T. Fischer, Simon David Hirsbrunner, Wolfgang Jentner, Matthias Miller, Daniel A. Keim and Paula Helm

Digital systems for analyzing human communication data have become prevalent in recent years. This may be related to the increasing abundance of data that can be harnessed but can hardly be managed in analog form. Intelligence analysis of communications data in investigative journalism, criminal intelligence, and law presents a particularly interesting case, as it must take into account the often highly sensitive properties of the underlying operations and data. At the same time, these are areas where increasingly automated, sophisticated approaches and tailored systems can be particularly useful and relevant, especially in terms of Big Data manageability, but they also pose dangers. In addition to privacy concerns, these relate to uncertain or poor data quality, leading to discrimination and potentially misleading insights. Other problems relate to a lack of transparency and traceability, making it difficult to accurately identify problems and determine appropriate remedial strategies. Visual Analytics combines machine learning methods with interactive visual interfaces to enable human sense- and decision-making. This technique can be key for designing and operating meaningful interactive communication analysis systems that consider these ethical challenges. In this interdisciplinary work, a joint endeavor of computer scientists, ethicists, and scholars in Science & Technology Studies, we investigate and evaluate opportunities and risks involved in using Visual Analytics approaches for communication analysis in intelligence applications in particular. We first introduce the common technological systems used in communication analysis, with a special focus on intelligence analysis in criminal investigations, further discussing the domain-specific ethical implications and risks involved. We then make the case for how tailored Visual Analytics approaches may reduce and mitigate the described problems, both theoretically and through practical examples. Offering interactive analysis capabilities and what-if explorations while facilitating guidance, provenance generation, and bias awareness (through nudges, for example) can improve analysts' understanding of their data, increasing trustworthiness and accountability, and generating knowledge. We show that finding VA design solutions for ethical issues is not a mere optimization task with an ideal final solution. Design solutions for specific ethical problems (e.g., privacy) often trigger new ethical issues (e.g., accountability) in other areas. Balancing out and negotiating these trade-offs has, as we argue, to be considered as an integral aspect of the design process. Finally, our work identifies existing gaps and highlights research opportunities, further describing how our results can be transferred to other domains. With this contribution, we aim at informing more ethically-aware approaches to communication analysis in intelligence operations.


Promoting Fairness in Learned Models by Learning to Active Learn under Parity Constraints

Amr Sharaf, Hal Daumé III and Renkun Ni

Machine learning models can have consequential effects, and disparities in error rate can lead to harms suffered more by some groups than others. Past algorithmic approaches mitigate such disparities for fixed training data; we ask: what if we can gather more data? We develop a meta-learning algorithm for parity-constrained active learning that learns a policy to decide which labels to query so as to maximize accuracy subject to parity constraints, using forward-backward splitting at the meta-learning level. Empirically, our approach outperforms alternatives by a large margin.


Providing Item-side Individual Fairness for Deep Recommender Systems

Xiuling Wang and Wendy Hui Wang

The recent advent of deep learning techniques has reinforced the development of new recommender systems. Although these systems have been demonstrated to be efficient and effective, the issue of item popularity bias in these recommender systems has raised serious concerns. While most existing works focus on group fairness at the item side, individual fairness at the item side is left largely unexplored. To address this issue, in this paper, we first define a new notion of individual fairness from the perspective of items, namely (\alpha,\beta)-fairness, to deal with item popularity bias in recommendations. In particular, (\alpha,\beta)-fairness requires that similar items should receive similar coverage in the recommendations, where \alpha and \beta control item similarity and coverage similarity respectively, and both item and coverage similarity metrics are defined as task specific for deep recommender systems. Next, we design two bias mitigation methods, namely embedding-based re-ranking (ER) and greedy substitution (GS), for deep recommender systems. ER is an in-processing mitigation method that adds (\alpha,\beta)-fairness as a constraint to the objective function of the recommendation algorithm, while GS is a post-processing approach that accepts the biased recommendations as the input, and substitutes high-coverage items with low-coverage ones in the recommendations to satisfy (\alpha,\beta)-fairness. We evaluate the performance of both mitigation algorithms on two real-world datasets and a set of state-of-the-art deep recommender systems. Our results demonstrate that both ER and GS outperform the existing minimum-coverage (MC) mitigation solutions [27, 33] in terms of both fairness and accuracy of recommendations. Furthermore, ER delivers the best trade-off between fairness and recommendation accuracy among a set of alternative mitigation methods, including GS, the hybrid of ER and GS, and the existing MC solutions [27, 33].
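
One plausible reading of "similar items should receive similar coverage" can be sketched as a pairwise check. The abstract does not give the formal definition, so everything here is an assumption: Euclidean distance between item embeddings as the similarity metric, coverage as the fraction of users who are recommended an item, and a violation defined as a pair of items within distance `alpha` whose coverage differs by more than `beta`. The function name `alpha_beta_violations` is hypothetical.

```python
import numpy as np

def alpha_beta_violations(embeddings, coverage, alpha, beta):
    """Return item pairs that are similar but receive dissimilar coverage.

    Assumed reading: items i and j are "similar" when the Euclidean distance
    between their embeddings is at most alpha; their coverage is "similar"
    when it differs by at most beta.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    coverage = np.asarray(coverage, dtype=float)
    violations = []
    for i in range(len(coverage)):
        for j in range(i + 1, len(coverage)):
            distance = np.linalg.norm(embeddings[i] - embeddings[j])
            gap = abs(coverage[i] - coverage[j])
            if distance <= alpha and gap > beta:
                violations.append((i, j))
    return violations

# Hypothetical data: items 0 and 1 are near-duplicates, item 2 is unrelated.
item_embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
item_coverage = [0.9, 0.1, 0.5]  # assumed: share of users shown each item

pairs = alpha_beta_violations(item_embeddings, item_coverage, alpha=0.5, beta=0.2)
```

Under these assumptions, a popular item and its near-duplicate with far lower coverage would be flagged, which is the kind of popularity bias the notion is meant to capture.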


Rational Shapley Values

David Watson

Explaining the predictions of opaque machine learning algorithms is an important and challenging task, especially as complex models are increasingly used to assist in high-stakes decisions such as those arising in healthcare and finance. Most popular tools for post-hoc explainable artificial intelligence (XAI) are either insensitive to context (e.g., feature attributions) or difficult to summarize (e.g., counterfactuals). In this paper, I introduce rational Shapley values, a novel XAI method that synthesizes and extends these seemingly incompatible approaches in a rigorous, flexible manner. I leverage tools from decision theory and causal modeling to formalize and implement a pragmatic approach that resolves a number of known challenges in XAI. By pairing the distribution of random variables with the appropriate reference class for a given explanation task, I illustrate through theory and experiments how user goals and knowledge can inform and constrain the solution set in an iterative fashion. The method compares favorably to state-of-the-art XAI tools in a range of quantitative and qualitative comparisons.


REAL ML: Recognizing, Exploring, and Articulating Limitations in Machine Learning Research

Jessie J. Smith, Saleema Amershi, Solon Barocas, Hanna Wallach and Jennifer Wortman Vaughan

Transparency around limitations can improve the rigor of scientific research, help to ensure appropriate interpretation of research findings, and make research claims more credible. Despite these benefits, the machine learning (ML) research community lacks well-developed norms or standards around disclosing limitations. To address this gap, we conducted an iterative design process with 30 ML and ML-adjacent researchers to develop REAL ML, a set of guided activities to help ML researchers recognize, explore, and articulate the limitations of their own research. We explore researchers' perceptions of limitations and the challenges that they face when recognizing, exploring, and articulating limitations in ML research. While some of these challenges are addressed with REAL ML, we highlight additional challenges that require broader shifts in community norms to address.


Regulating Facial Processing Technologies: Tensions Between Legal and Technical Considerations in the Application of Illinois BIPA

Rui-Jie Yew and Alice Xiang

The development and deployment of facial processing technologies (FPT) have garnered increasing controversy in recent years. Several states and cities in the U.S. have banned the use of facial recognition by law enforcement and governments. However, FPT are still being developed and used in a wide variety of contexts, in which they primarily are regulated by biometric information privacy laws. In recent years, there has been a proliferation of biometric information privacy laws passed by several U.S. jurisdictions that encompass FPT as well as other biometric technologies. Among these biometric information privacy laws, the 2008 Illinois Biometric Information Privacy Act (BIPA) has generated a significant amount of litigation. Yet, with most lawsuits reaching settlements before there can be meaningful clarifications of relevant technical intricacies and legal definitions, there remains a great degree of uncertainty as to how exactly these policies apply to FPT. What we have found through applications of BIPA in FPT litigation so far, however, points to potential disconnects between technical and legal communities. This paper analyzes what we know based on BIPA court proceedings and highlights important considerations that are not being captured in BIPA or courts’ application of it. These factors are relevant for (i) reasoning about biometric information privacy laws as a governing mechanism for FPT, (ii) assessing the potential harms of FPT technology, and (iii) incentivizing the mitigation of these harms. By illuminating these considerations, we hope to empower courts and lawmakers to take a more nuanced approach to regulating FPT and developers to better understand the current U.S. legal landscape.


Reliable and Safe Use of Machine Translation in Medical Settings

Nikita Mehandru, Samantha Robertson and Niloufar Salehi

Language barriers between patients and clinicians contribute to disparities in quality of care. Machine Translation (MT) tools are widely used in healthcare settings, but even small mistranslations can have life-threatening consequences. We study how MT is currently used in medical settings through a qualitative interview study with 20 clinicians (physicians, surgeons, nurses, and midwives). We find that clinicians face challenges stemming from lack of time and resources, cultural barriers, and medical literacy rates, as well as accountability in cases of miscommunication. Clinicians have devised strategies to aid communication in the face of language barriers including back translation, non-verbal communication, and testing patient understanding. We propose design implications for machine translation systems including combining neural MT with pre-translated medical phrases, integrating translation support with multimodal communication, and providing interactive support for testing mutual understanding.


Robots Enact Malignant Stereotypes

Andrew Hundt, William Agnew, Severin Kacianka, Vicky Zeng and Matthew Gombolay

Stereotypes, bias, and discrimination have been extensively documented in Machine Learning (ML) methods such as Computer Vision (CV) [18, 80], Natural Language Processing (NLP) [6], or both, in the case of large image and caption models such as OpenAI CLIP [14]. In this paper, we evaluate how ML bias manifests in robots that physically and autonomously act within the world. We audit one of several recently published CLIP-powered robotic manipulation methods, presenting it with objects that have pictures of human faces on the surface which vary across race and gender, alongside task descriptions that contain terms associated with common stereotypes. Our experiments definitively show robots acting out toxic stereotypes with respect to gender, race, and scientifically-discredited physiognomy, at scale. Furthermore, the audited methods are less likely to recognize Women and People of Color. Our interdisciplinary sociotechnical analysis synthesizes across fields and applications such as Science Technology and Society (STS), Critical Studies, History, Safety, Robotics, and AI. We find that robots powered by large datasets and Dissolution Models (sometimes called “foundation models”, e.g. CLIP) that contain humans risk physically amplifying malignant stereotypes in general; and that merely correcting disparities will be insufficient for the complexity and scale of the problem. Instead, we recommend that robot learning methods that physically manifest stereotypes or other harmful outcomes be paused, reworked, or even wound down when appropriate, until outcomes can be proven safe, effective, and just. Finally, we discuss comprehensive policy changes and the potential of new interdisciplinary research on topics like Identity Safety Assessment Frameworks and Design Justice to better understand and address these harms.


Seeing without Looking: Analysis Pipeline for Child Sexual Abuse Datasets

Camila Laranjeira da Silva, João Macedo, Sandra Avila and Jefersson dos Santos

The online sharing and viewing of Child Sexual Abuse Material (CSAM) are growing fast, such that human experts can no longer handle the manual inspection. However, the automatic classification of CSAM is a challenging field of research, largely due to the inaccessibility of target data that is, and should forever be, private and in sole possession of law enforcement agencies. To aid researchers in drawing insights from unseen data and safely providing further understanding of CSAM images, we propose an analysis template that goes beyond the statistics of the dataset and respective labels. It focuses on the extraction of automatic signals, provided both by pre-trained machine learning models, e.g., object categories and pornography detection, as well as image metrics such as luminance and sharpness. Only aggregated statistics of sparse signals are provided to guarantee the anonymity of victimized children and adolescents. The pipeline allows filtering the data by applying thresholds to each specified signal and provides the distribution of such signals within the subset, correlations between signals, as well as a bias evaluation. We demonstrated our proposal on [DATASET OMITTED], one of the few CSAM benchmarks in the literature, composed of over 2000 samples among regular and CSAM images, produced in partnership with Brazil's Federal Police. Although noisy and limited in several senses, we argue that automatic signals can highlight important aspects of the overall distribution of data, which is valuable for databases that cannot be disclosed. Our goal is to safely publicize the characteristics of CSAM datasets, encouraging researchers to join the field and perhaps other institutions to provide similar reports on their benchmarks.


Selection in the Presence of Implicit Bias: The Advantage of Intersectional Constraints

Anay Mehrotra, Bary S. R. Pradelski and Nisheeth K. Vishnoi

In selection processes such as hiring, promotion, and college admissions, implicit bias toward socially-salient attributes such as race, gender, or sexual orientation of candidates is known to produce persistent inequality and reduce aggregate utility for the decision maker. Interventions such as the Rooney Rule and its generalizations, which require the decision maker to select at least a specified number of individuals from each affected group, have been proposed to mitigate the adverse effects of implicit bias in selection. Recent works have established that such lower-bound constraints can be very effective in improving aggregate utility in the case when each individual belongs to at most one affected group. However, in several settings, individuals may belong to multiple affected groups and, consequently, face more extreme implicit bias due to this intersectionality. We consider independently drawn utilities and show that, in the intersectional case, the aforementioned non-intersectional constraints can only recover part of the total utility achievable in the absence of implicit bias. On the other hand, we show that if one includes appropriate lower-bound constraints on the intersections, almost all the utility achievable in the absence of implicit bias can be recovered. Thus, intersectional constraints can offer a significant advantage over a reductionist dimension-by-dimension non-intersectional approach to reducing inequality.
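
The effect of lower-bound constraints on intersections can be sketched with a small selection routine. This is an illustration rather than the paper's model: it assumes each candidate belongs to exactly one intersection cell (a pair of attribute values), that the decision maker ranks by an observed utility that may already be deflated by implicit bias, and that the candidate data and the names `select_with_lower_bounds` and `CANDIDATES` are hypothetical. For disjoint cells, filling each lower bound with the best candidates of that cell and then taking the best of the rest maximizes total observed utility.

```python
from collections import defaultdict

def select_with_lower_bounds(candidates, k, lower_bounds):
    """Pick k candidates by observed utility, honoring per-cell minimums.

    candidates: list of (name, cell, observed_utility); cells are disjoint.
    lower_bounds: dict mapping a cell to the minimum number to select from it.
    """
    by_cell = defaultdict(list)
    for cand in candidates:
        by_cell[cand[1]].append(cand)

    chosen = []
    for cell, bound in lower_bounds.items():
        ranked = sorted(by_cell[cell], key=lambda c: c[2], reverse=True)
        if len(ranked) < bound:
            raise ValueError(f"not enough candidates in cell {cell}")
        chosen.extend(ranked[:bound])  # satisfy this cell's lower bound
    if len(chosen) > k:
        raise ValueError("lower bounds exceed the number of slots")

    taken = {c[0] for c in chosen}
    rest = sorted((c for c in candidates if c[0] not in taken),
                  key=lambda c: c[2], reverse=True)
    chosen.extend(rest[:k - len(chosen)])  # fill remaining slots greedily
    return [c[0] for c in chosen]

# Hypothetical pool; cells are (attribute 1, attribute 2) intersections.
CANDIDATES = [
    ("c1", (0, 0), 9.0), ("c2", (0, 0), 8.0), ("c6", (0, 0), 7.5),
    ("c3", (0, 1), 7.0), ("c4", (1, 0), 6.5),
    ("c5", (1, 1), 3.0),  # observed utility assumed deflated by bias
]

unconstrained = select_with_lower_bounds(CANDIDATES, k=4, lower_bounds={})
intersectional = select_with_lower_bounds(CANDIDATES, k=4, lower_bounds={(1, 1): 1})
```

Here a bound placed only on the intersection cell `(1, 1)` guarantees that its candidate is selected, whereas bounds on each attribute separately could be met by `c3` and `c4` without ever reaching that cell.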


Sensible AI: Re-imagining Interpretability and Explainability using Sensemaking Theory

Harmanpreet Kaur, Eytan Adar, Eric Gilbert and Cliff Lampe

Understanding how ML models work is a prerequisite for responsibly designing, deploying, and using ML-based systems. With interpretability approaches, ML can now offer explanations for its outputs to aid human understanding. Though these approaches rely on guidelines for how humans explain things to each other, they ultimately solve for improving the artifact: an explanation. In this paper, we propose an alternate framework for interpretability grounded in Weick's sensemaking theory, which focuses on who the explanation is intended for. Recent work has advocated for the importance of understanding stakeholders' needs; we build on this by providing concrete properties (e.g., identity, social context, environmental cues, etc.) that shape human understanding. We use an application of sensemaking in organizations as a template for discussing design guidelines for sensible AI, AI that factors in the nuances of human cognition when trying to explain itself.


Should attention be all we need? The ethical and epistemic implications of unification in machine learning

Nic Fishman and Leif Hancox-Li

“Attention is all you need” has become a fundamental precept in machine learning research. Originally referring just to machine translation, transformers now find success across many ML problem domains. With the apparent domain-agnostic success of transformers, many machine learning researchers are excited that similar model architectures can be successfully deployed across diverse applications in vision and language. We consider the benefits and risks of these waves of unification on both epistemic and ethical fronts. On the epistemic side, we argue that many of the arguments in favor of unification in the natural sciences fail to transfer over to the machine learning case, or transfer over only under certain strenuous assumptions that might not hold. Unification also introduces epistemic risks related to increased black-boxing and path dependency. We also discuss ethical risks related to centralization of power, having fewer models across more domains of application, and removing the need for domain experts.


Smallset Timelines: A Visual Representation of Data Preprocessing Decisions

Lydia R. Lucchesi, Petra Kuhnert, Jenny L. Davis and Lexing Xie

Data preprocessing is a crucial stage in the data analysis pipeline, with both technical and social aspects to consider. Yet, the attention it receives is often lacking in research practice and dissemination. We present the Smallset Timeline, a visualisation to help reflect on and communicate data preprocessing decisions. A "Smallset" is a small selection of rows from the original dataset containing instances of dataset alterations. The Timeline is comprised of Smallset snapshots representing different points in the preprocessing stage and captions to describe the alterations visualised at each point. Edits, additions, and deletions to the dataset are highlighted with colour. We develop the R software package, smallset, to assist in creating these Timelines. Constructing the figure asks practitioners to reflect on and revise decisions as necessary, while sharing it aims to make the process accessible to a diverse range of audiences. Two included case studies on software defect data and census benchmark data illustrate use of the Smallset Timeline to visualise important decisions, e.g., those that result in differing levels of data loss or demographic balance for prediction tasks. We envision Smallset Timelines as a go-to data provenance tool, helping preprocessing to be better documented and communicated at large.


Social Inclusion in Curated Contexts: Insights from Museum Practices

Han-Yin Huang and Cynthia Liem

AI literature suggests that minority and fragile communities in society can be negatively impacted by machine learning algorithms due to inherent biases in the design process, which lead to socially exclusive decisions and policies. Faced with similar challenges in dealing with an increasingly diversified audience, the museum sector has seen changes in theory and practice, particularly in the areas of representation and meaning-making. While rarity and grandeur used to be at centre stage in early museum practices, folk life and museums' relationships with the diverse communities they serve have become a widely integrated part of contemporary practices. These changes address issues of diversity and accessibility in order to offer more socially inclusive services. Drawing on these changes and reflecting back on the AI world, we argue that the museum experience provides useful lessons for building AI with socially inclusive approaches, especially in situations in which both a collection and access to it will need to be curated or filtered, as frequently happens in search engines, recommender systems and digital libraries. We highlight three principles: (1) Instead of upholding the value of neutrality, practitioners are aware of the influences of their own backgrounds and those of others on their work. By not claiming to be neutral but practising cultural humility, the chances of addressing potential biases can be increased. (2) There should be room for context-sensitive reflection beyond the stages of data collection and machine learning. Before applying models and predictions, the contexts in which relevant parties exist should be taken into account. (3) Community participation serves the needs of communities and has the added benefit of bringing practitioners and communities together.


South Korean Public Value Coproduction Towards 'AI for Humanity': A Synergy of Sociocultural Norms and Multistakeholder Deliberation in Bridging the Design and Implementation of National AI Ethics Guidelines

You Jeen Ha

As emerging technologies such as Big Data, artificial intelligence (AI), robotics, and the Internet of Things (IoT) pose fundamental challenges for global and domestic technological governance, the ‘Fourth Industrial Revolution’ (4IR) comes to the fore with AI as a frontrunner, generating discussions on the ethical elements of AI amongst key stakeholder groups, such as government, academia, industry, and civil society. However, in recent AI ethics and governance scholarship, AI ethics design and implementation appear to be implicitly discretized into two separate matters of theory and practice, respectively, an approach that then invokes efforts to bridge the ‘gap’ between the two. Discretization here potentially overcomplicates the discussion surrounding AI ethics and limits its productivity. This paper thus presents South Korea’s people-centered ‘National Guidelines for Artificial Intelligence Ethics’ (국가 인공지능 윤리기준; ‘Guidelines’) and their development as a case study that can help readers conceptualize AI ethics design and implementation as a continuous process rather than discrete problems that need solving. From a public value perspective, the case study examines the Guidelines and the multistakeholder policymaking infrastructure that serves as the foundation for both the Guidelines’ design and implementation, drawing from literature in AI ethics and governance, public management and administration, and Korean policy and cultural studies as well as nine interviews with members from the four stakeholder groups that collectively designed and continue to deliberate upon the Guidelines. Further, the study specifically focuses on (i) identifying public values that were highlighted by the Guidelines, (ii) investigating how such values reflect prevalent Korean sociocultural norms, and (iii) exploring how these values, in a way made possible by Korean sociocultural norms and policymaking, have been negotiated amongst the four stakeholder groups in a democratic public sphere to be ultimately incorporated into the Guidelines and prepared for implementation. This paper hopes to contribute to theory-building in AI ethics and provide a point of comparison for future research concerning AI ethics design and implementation.


Subverting Fair Image Search with Generative Adversarial Perturbations

Avijit Ghosh, Matthew Jagielski and Christo Wilson

In this work, we explore the intersection of two pressing concerns in the machine learning community---fairness and robustness---in the context of ranking: when a ranking model has been carefully calibrated to achieve some definition of fairness, is it possible for an external adversary to make the ranking model behave unfairly without having access to the model or training data? To investigate this question, we present a case study in which we develop and then attack a fairness-aware image search engine using images that have been maliciously modified with adversarial perturbations. Our image search engine uses a state-of-the-art MultiModal Transformer (MMT) retrieval model and a fair re-ranking algorithm (FMMR) that aims to achieve demographic group fairness. We then train a generative adversarial perturbation (GAP) model that learns from pretrained demographic classifiers to strategically insert human-imperceptible perturbations into images. These perturbations attempt to cause FMMR to unfairly boost the rank of images containing people from an adversary-selected subpopulation. We present results from extensive experiments demonstrating that our attacks can successfully confer significant unfair advantage to people from the majority class relative to fairly-ranked baseline search results. We demonstrate that our attacks are robust across a number of variables, that they have close to zero impact on the relevance of search results, and that they succeed under a strict threat model. Our findings highlight the danger of deploying fair machine learning algorithms in-the-wild when (1) the data necessary to achieve fairness may be adversarially manipulated, and (2) the models themselves are not robust against attacks.


Subverting machines, fluctuating identities: Re-learning human categorization

Jackie Kay, Christina Lu and Kevin McKee

Most machine learning systems that interact with humans will construct some notion of a person’s “identity,” yet the default paradigm in AI research envisions identity as discrete and its essential attributes as static. In stark contrast, strands of thought within critical theory present a different conception of identity as malleable and constructed entirely through interaction; a doing rather than a being. In this work, we distill some of these theories for machine learning practitioners and introduce a theory of identity as “autopoiesis,” circular processes of formation and function. We argue that the default paradigm of identity used by the field immobilizes existing identity categories and the power differentials that co-occur, due to the absence of iterative feedback to our models. This includes a critique of emergent AI fairness practices that continue to impose the default paradigm. Finally, we apply our theory to sketch approaches to autopoietic identity through multilevel optimization and relational learning. While these ideas raise many open questions, we imagine the possibilities of machines that are capable of expressing human identity as a relationship perpetually in flux.


Surfacing Racial Stereotypes through Identity Portrayal

Gauri Kambhatla, Ian Stewart and Rada Mihalcea

People express racial stereotypes through conversations with others, increasingly in a digital format. With the vast number of digital interactions, it would be beneficial to computationally identify racial stereotypes to help mitigate some of the harmful effects of stereotyping. In this work, we seek to better understand how we can computationally surface racial bias and stereotypes in text by identifying linguistic features associated with differences in racial identity portrayal, focused on two races (Black and White) and two genders (men and women). We collect novel data of individuals' self-presentation via crowdsourcing, where each crowdworker answers a set of prompts from their own perspective (real identity), and from the perspective of the other racial identity (portrayed identity), keeping the race (or gender) constant. We use these responses as a dataset to identify stereotypes. Through a series of experiments based on classifications between real and portrayed identities, we show that generalizations and stereotypes appear to be more prevalent amongst White participants than Black participants. Through analyses of predictive words and word usage patterns, we find that some of the most predictive features of an author portraying a different racial identity are known stereotypes, and reveal how people of different identities see themselves and others.


System Safety and Artificial Intelligence

Roel Dobbe

This chapter formulates seven lessons for preventing harm in artificial intelligence (AI) systems based on insights from the field of system safety for software-based automation in safety-critical domains. New applications of AI across societal domains and public organizations and infrastructures come with new hazards, which lead to new forms of harm, both grave and pernicious. The text addresses the lack of consensus for diagnosing and eliminating new AI system hazards. For decades, the field of system safety has dealt with accidents and harm in safety-critical systems governed by varying degrees of software-based automation and decision-making. This field embraces the core assumption of systems and control that AI systems cannot be safeguarded by technical design choices on the model or algorithm alone, instead requiring an end-to-end hazard analysis and design frame that includes the context of use, impacted stakeholders and the formal and informal institutional environment in which the system operates. Safety and other values are then inherently socio-technical and emergent system properties that require design and control measures to instantiate these across the technical, social and institutional components of a system. This chapter honors system safety pioneer Nancy Leveson, by situating her core lessons for today’s AI system safety challenges. For every lesson, concrete tools are offered for rethinking and reorganizing the safety management of AI systems, both in design and governance. This history tells us that effective AI safety management requires transdisciplinary approaches and a shared language that allows involvement of all levels of society.


Tackling Algorithmic Disability Discrimination in the Hiring Process: An Ethical, Legal and Technical Analysis

Maarten Buyl, Christina Cociancig, Cristina Frattone and Nele Roekens

Tackling algorithmic discrimination against persons with disabilities (PWDs) demands a distinctive approach that is fundamentally different to that applied to other protected characteristics, due to particular ethical, legal, and technical challenges. We address these challenges specifically in the context of artificial intelligence (AI) systems used in hiring processes (or automated hiring systems, AHSs), in which automated assessment procedures are subject to unique ethical and legal considerations and have an undeniable adverse impact on PWDs. In this paper, we discuss concerns and opportunities raised by AI-driven hiring in relation to disability discrimination. Ultimately, we aim to encourage further research into this topic. Hence, we establish some starting points and design a roadmap for ethicists, lawmakers, advocates as well as AI practitioners alike.


Taxonomy of Risks posed by Large Language Models

Laura Weidinger, Jonathan Uesato, Maribeth Rauh, Conor Griffin, Po-Sen Huang, John Mellor, Amelia Glaese, Myra Cheng, Borja Balle, Atoosa Kasirzadeh, Courtney Biles, Sasha Brown, Zac Kenton, Will Hawkins, Tom Stepleton, Abeba Birhane, Lisa Anne Hendricks, Laura Rimell, William Isaac, Julia Haas, Sean Legassick, Geoffrey Irving and Iason Gabriel

Responsible innovation on large-scale Language Models (LMs) requires foresight into and in-depth understanding of the risks these models may pose. This paper develops a comprehensive taxonomy of ethical and social risks associated with LMs. We identify twenty-one risks, drawing on expertise and literature from computer science, linguistics, and the social sciences. We situate these risks in our taxonomy of six risk areas: I. Discrimination, Hate speech and Exclusion, II. Information Hazards, III. Misinformation Harms, IV. Malicious Uses, V. Human-Computer Interaction Harms, and VI. Environmental and Socioeconomic harms. For risks that have already been observed in LMs, the causal mechanism leading to harm, evidence of the risk, and approaches to risk mitigation are discussed. We further describe and analyse risks in each area that have not yet been observed but are anticipated based on assessments of other language technologies. Organisational responsibilities associated with implementing mitigations are discussed. We also highlight challenges and directions for further research on risk evaluation and mitigation with the goal of ensuring that language models are developed responsibly.


Tech Worker Organizing for Power and Accountability

William Boag, Harini Suresh, Bianca Lepe and Catherine D'Ignazio

In recent years, there has been a growing interest in the field of “AI Ethics” and related areas. This field is purposefully broad, allowing for the intersection of numerous fields and disciplines. Because of the computational form of these questions, some Algorithmic Fairness communities have centered computational methods, leading to a narrow lens where technical tools are framed as solutions for broader sociotechnical problems. In this work, we discuss a less-explored mode of what it can mean to “do” AI Ethics: tech worker collective action. Through collective action, the employees of powerful tech companies can act as a countervailing force against strong corporate impulses to grow or make a profit to the detriment of other values. In this work, we ground these efforts in existing scholarship of social movements and labor organizing. Further, we examine an archive chronicling many recent collective actions and contextualize those actions as part of a broader movement with methods for how workers can build power and use power.


Testing Concerns about Technology's Behavioral Impacts with N-of-one Trials

Nathan Matias, Eric Pennington and Zenobia Chan

As public trust in technology companies has declined, people are questioning the effects of digital technologies in their lives. In this context, many evidence-free claims from corporations and tech critics are widely circulated. How can members of the public make evidence-based decisions about digital technology in their lives? In clinical fields, N-of-one trials enable participant-investigators to make personalized causal discoveries about managing health, improving fitness, and improving their education. Similar methods could help community scientists understand and manage how they use digital technologies. In this paper, we introduce Conjecture, a system for coordinating N-of-one trials that can guide personal decisions about technology use and contribute to science. We describe N-of-one trials as a design challenge and present the design of the Conjecture system. We evaluate the system with a field experiment that tests folk theories about the influence of colorful screens on alleged phone addiction. We present findings on the design of N-of-one-trial systems based on submitted data, interviews, and surveys with 14 participants. Taken together, this paper introduces N-of-one trials as a fruitful direction for computer scientists designing industry-independent systems for evidence-based technology governance and accountability.
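
As an illustrative aside, the core mechanics of an N-of-one trial (one person, randomized condition order, repeated measurements) can be sketched briefly. This is a minimal sketch under stated assumptions and does not reproduce the Conjecture system: the function names, the "color" and "grayscale" condition labels, the block-randomized design, and the toy outcome values are all hypothetical.

```python
import random
from statistics import mean


def make_schedule(n_blocks, conditions=("color", "grayscale"), seed=0):
    """Return a day-by-day schedule in which each block contains every
    condition exactly once, in a randomly shuffled order."""
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    schedule = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule


def effect_estimate(schedule, outcomes, treatment="grayscale", control="color"):
    """Difference in mean outcome between treatment days and control days."""
    treated = [y for c, y in zip(schedule, outcomes) if c == treatment]
    untreated = [y for c, y in zip(schedule, outcomes) if c == control]
    return mean(treated) - mean(untreated)


# Toy run: 4 blocks (8 days), with made-up daily screen-time minutes.
schedule = make_schedule(n_blocks=4)
outcomes = [80 if day == "grayscale" else 100 for day in schedule]
estimate = effect_estimate(schedule, outcomes)
```

Randomizing the order within each block keeps the two conditions balanced over time, so a simple difference in means is a reasonable first summary. A real analysis would also need to account for carry-over effects and day-to-day noise.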


Confronting Power and Corporate Capture at the FAccT Conference

Meg Young, Michael Katell and Peaks Krafft

Fields such as medicine and public health attest to deep conflict of interest concerns present when private companies fund evaluation of their own products and services. We draw on these lessons to consider corporate capture of the ACM Fairness, Accountability, and Transparency (FAccT) conference. We situate our analysis within scholarship on the entanglement of industry and academia and focus on the silences it produces in the research record. Our analysis of the institutional design at FAccT indicates the conference’s neglect of those people most negatively impacted by algorithmic systems. We focus on a 2021 paper by Wilson et al., “Building and auditing fair algorithms: A case study in candidate screening” as a key example of conflicted research accepted via peer review at FAccT. We call on the conference to (1) lead on models for how to manage conflicts of interest in the field of computing beyond individual disclosure of funding sources, (2) hold space for advocates and activists able to speak directly to questions of algorithmic harm, and (3) reconstitute the conference with attention to fostering agonistic dissensus—un-making the present manufactured consensus and nurturing challenges to power. These changes will position our community to contend with the political dimensions of research on AI harms.


The Alchemy of Trust: The Creative Act of Designing Trustworthy Socio-Technical Systems

Lauren Thornton, Bran Knowles and Gordon Blair

Trust is recognised as a significant and valuable component of socio-technical systems, facilitating numerous important benefits. Many trust models have been created throughout various streams of literature, describing trust for different stakeholders in different contexts. When designing a system with multiple stakeholders in their multiple contexts, how does one decide which trust model(s) to apply? And furthermore, how does one go from selecting a model or models to translating those into design? We review and analyse two prominent trust models, and apply them to the design of a trustworthy socio-technical system, namely virtual research environments. We show that a singular model cannot easily be imported and directly implemented into the design of such a system. We introduce the concept of alchemy as the most apt characterization of a successful design process, illustrating the need for designers to engage with the richness of the trust landscape and creatively experiment with components from multiple models to create the perfect blend for their context. We provide a demonstrative case study illustrating the process through which designers of socio-technical systems can become alchemists of trust.


The Algorithmic Imprint

Upol Ehsan, Ranjit Singh, Jacob Metcalf and Mark Riedl

When algorithmic harms emerge, a reasonable response is to stop using the algorithm to resolve concerns related to fairness, accountability, transparency, and ethics (FATE). However, as we illustrate in this paper, just because an algorithm is removed does not imply that FATE-related issues cease to exist. We introduce the notion of the “algorithmic imprint” to illustrate how merely removing an algorithm does not necessarily undo or mitigate its consequences. We illustrate this concept and its implications through the 2020 events surrounding the algorithmic grading of the General Certificate of Education (GCE) Advanced (A) Level exams, an internationally recognized UK-based high school diploma exam administered in over 160 countries. While the algorithmic standardization was ultimately removed due to global protests, we show how the removal failed to undo the algorithmic imprint on the sociotechnical infrastructures that shape students’, teachers’, and parents’ lives. These events provide a rare chance to analyze the state of the world both with and without algorithmic mediation. We situate our case study in Bangladesh to illustrate how algorithms made in the Global North disproportionately impact stakeholders in the Global South. Chronicling more than a year-long community engagement consisting of 47 interviews, we present the first coherent timeline of “what” happened in Bangladesh, contextualizing “why” and “how” they happened through the lenses of the algorithmic imprint and situated algorithmic fairness. Analyzing these events, we highlight how the contours of the algorithmic imprints can be inferred at the infrastructural, societal, and individual levels. We share conceptual and practical implications around how imprint-awareness can (a) broaden the boundaries of how we think about algorithmic impact, (b) inform how we design algorithms, and (c) guide us in AI governance. The imprint-aware design mindset can make the algorithmic development process more human-centered and sociotechnically-informed.


The Case for a Legal Compliance API for the Enforcement of the EU’s Digital Services Act on Social Media Platforms

Catalina Goanta, Thales Bertaglia and Adriana Iamnitchi

In the course of under a year, the European Commission has launched some of the most important regulatory proposals to date on platform governance. The Commission's goals behind cross-sectoral regulation of this sort include the protection of markets and democracies alike. While all these acts propose sophisticated rules for setting up new enforcement institutions and procedures, one aspect remains highly unclear: how digital enforcement will actually take place in practice. Focusing on the Digital Services Act (DSA), this discussion paper critically addresses issues around social media data access for the purpose of digital enforcement and proposes an original concept in the form of a legal compliance API as a means to facilitate compliance with the DSA and complementary European and national regulation. To contextualize this discussion, the paper pursues two scenarios that exemplify the harms arising out of content monetization affecting a particularly vulnerable category of social media users: children. The two scenarios are used to further reflect upon essential issues surrounding data access and legal compliance with the DSA and further applicable legal standards in the field of labour and consumer law.


The Conflict Between Explainable and Accountable Decision-Making Algorithms

Gabriel Lima, Nina Grgić-Hlača, Jinkeun Jeong and Meeyoung Cha

Decision-making algorithms are being used in important decisions, such as who should be enrolled in health care programs or be hired. Even though these systems are currently deployed in high-stakes scenarios, many of them cannot explain their decisions. This limitation has led to the Explainable Artificial Intelligence (XAI) initiative, which aims to make algorithms explainable to comply with legal requirements, promote trust, and maintain accountability. This discussion paper questions whether and to what extent explainability can solve the responsibility issues posed by autonomous AI systems. We propose that XAI systems providing post-hoc explanations could be seen as blameworthy agents, obscuring the responsibility of developers in the decision-making process. Furthermore, we argue that XAI could lead to wrong attributions of responsibility to vulnerable stakeholders, such as those subjected to algorithmic decisions (i.e., patients), due to a misguided perception that they have control over apparently explainable algorithms. This conflict between explainability and accountability can be aggravated if designers use algorithms and patients as moral and legal scapegoats. This paper concludes with a series of recommendations on how to consider this conflict in the socio-technical process of algorithmic decision-making and a defense of hard regulation to prevent designers from escaping responsibility for their systems.


The Death of the Legal Subject: How Predictive Algorithms Are (Re)constructing Legal Subjectivity

Katrina Geddes

This paper explores the epistemological differences between the socio-political legal subject of Western liberalism, and the algorithmic subject of informational capitalism. It argues that the increasing use of predictive algorithms in judicial decision-making is fundamentally reconstructing both the nature and experience of legal subjectivity in a manner that is incompatible with law’s normative commitments to individualized justice. Whereas algorithmic subjectivity derives its epistemic authority from population-level insights, legal subjectivity has historically derived credibility from its close approximation of the underlying individual, through careful evaluation of their mental and physical autonomy, prior to any assignment of legal liability. With the introduction of predictive algorithms in judicial decision-making, knowledge about the legal subject is increasingly algorithmically produced, in a manner that discounts, and effectively displaces, qualitative knowledge about the legal subject’s intentions, motivations, and moral capabilities. This results in the death of the legal subject, or the emergence of new, algorithmic practices of signification that no longer require the input of the underlying individual. As algorithms increasingly guide judicial decision-making, the shifting epistemology of legal subjectivity has long-term consequences for the legitimacy of legal institutions.


The Effects of Crowd Worker Biases in Fact-Checking Tasks

Tim Draws, David La Barbera, Michael Soprano, Kevin Roitero, Davide Ceolin, Alessandro Checco and Stefano Mizzaro

Due to the increasing amount of information shared online every day, the need for a sound and reliable way to distinguish between trustworthy and non-trustworthy information is as present as ever. One technique for performing fact-checking at scale is to employ human intelligence in the form of crowd workers. Although earlier work has suggested that crowd workers can reliably identify misinformation, cognitive biases of crowd workers may decrease the quality of truthfulness judgments in this context. We performed a systematic exploratory analysis of publicly available crowdsourced data to identify a set of potential systematic biases that may occur in fact-checking tasks performed by crowd workers. Following this exploratory study, we collected a novel data set of crowdsourced truthfulness judgments to validate our hypotheses. Our findings suggest that workers generally overestimate the truthfulness of statements and that different cognitive biases (i.e., the affect heuristic and overconfidence) can affect their annotations. Exploratory findings furthermore hint at a potential relationship between crowd workers' trust in politics and their ability to judge the truthfulness of statements accurately. Interestingly, we find that, depending on general judgment tendencies of workers, their biases may sometimes lead to more accurate judgments.


The Fallacy of AI Functionality

Inioluwa Deborah Raji, I. Elizabeth Kumar, Aaron Horowitz and Andrew Selbst

Deployed AI systems often do not work. They can be constructed haphazardly, deployed indiscriminately, and promoted deceptively. However, scholars, the press, and policymakers pay too little attention to functionality. This leads to technical and policy solutions focused on “ethical” or value-aligned deployments, often skipping over the prior question of whether a given system functions, or provides any benefits at all. To describe the harms of various types of functionality failures, we create a taxonomy of known AI functionality issues. We then point to policy and organizational responses that are often overlooked and become more readily available once functionality is drawn into focus. We argue that functionality is a meaningful AI policy issue, operating as a necessary first step towards protecting affected communities from algorithmic harm.


The forgotten margins of AI ethics

Abeba Birhane, Elayne Ruane, Thomas Laurent, Matthew S. Brown, Johnathan Flowers, Anthony Ventresque and Christopher L. Dancy

How has recent AI Ethics literature addressed topics such as fairness and justice in the context of continued structural power asymmetries? We trace both the historical roots and current landmark work that have been shaping the field and we categorize these works under three broad umbrellas: (i) those grounded in Western canonical philosophy, (ii) mathematical and statistical methods, and (iii) those emerging from critical data/algorithm/information studies. We also survey the field and explore emerging trends by examining the rapidly growing body of literature that falls under the broad umbrella of AI Ethics. To that end, we read and annotated peer-reviewed papers published over the past four years in two premier conferences: FAccT and AIES. We classified the literature based on an annotation scheme we developed according to three main dimensions: whether the paper deals with concrete applications, use-cases, and/or people’s lived experience; to what extent it addresses harmed, threatened, or otherwise marginalized groups; and if so, whether it explicitly names such groups. We note that although the goals of various annotated papers were often commendable, there exists a problematic shallow consideration of the negative impacts of AI on traditionally marginalized groups. Taken together, our conceptual analysis and the data from annotated papers indicate that the field would benefit from an increased focus on ethical analysis grounded in concrete use-cases, people, and applications that incorporates structural and historical power asymmetries.


The Long Arc of Fairness: Formalisations and Ethical Discourse

Pola Schwöbel and Peter Remmers

In recent years, the idea of formalising and modelling fairness for algorithmic decision making (ADM) has advanced to a point of sophisticated specialisation. However, the relations between technical (formalised) and ethical discourse on fairness are not always clear and productive. Arguing for an alternative perspective, we review existing fairness metrics and discuss some common issues. For instance, the fairness of procedures and distributions is often formalised and discussed statically, disregarding both structural preconditions of the status quo and downstream effects of a given intervention. We then introduce dynamic fairness modelling, a more comprehensive approach that realigns formal fairness metrics with arguments from the ethical discourse. A dynamic fairness model incorporates (1) ethical goals, (2) formal metrics to quantify decision procedures and outcomes and (3) mid-term or long-term downstream effects. By contextualising these elements of fairness-related processes, dynamic fairness modelling explicates formerly latent ethical aspects and thereby provides a helpful tool to navigate trade-offs between different fairness interventions. To illustrate the framework, we discuss an example application -- the current European efforts to increase the number of women on company boards, e.g. via quota solutions -- and present early technical work that fits within our framework.


The Model Card Authoring Toolkit: Toward Community-centered, Deliberation-driven AI Design

Hong Shen, Leijie Wang, Wesley Hanwen Deng, Ciell, Ronald Velgersdijk and Haiyi Zhu

There have been increasing calls for centering impacted communities – both online and offline – in the design of the AI systems that will be deployed in their communities. However, the complicated nature of a community’s goals and needs, as well as the complexity of AI’s development procedures, outputs, and potential impacts, often prevents effective participation. In this paper, we present the Model Card Authoring Toolkit, a toolkit that supports community members to understand, navigate and negotiate a spectrum of machine learning models via deliberation and pick the ones that best align with their collective values. Through a series of workshops, we conduct an empirical investigation of the initial effectiveness of our approach in two online communities – English Wikipedia and Dutch Wikipedia, and document how our participants try to collectively set the threshold for a machine learning based quality prediction system used in their communities’ content moderation applications. Our results suggest that the use of the Model Card Authoring Toolkit helps improve the understanding of the trade-offs across multiple community goals on AI design, engage community members to discuss and negotiate the trade-offs, and facilitate collective and informed decision-making in their own community contexts. Finally, we discuss the challenges for a community-based, deliberation-driven approach for AI design as well as potential design implications.


The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz and Marzyeh Ghassemi

Machine learning models in safety-critical settings like healthcare are often "blackboxes": they contain a large number of parameters which are not transparent to users. Post-hoc explainability methods where a simple, human-interpretable model imitates the behavior of these blackbox models are often proposed to help users trust model predictions. In this work, we audit the quality of such explanations for different protected subgroups using real data from four settings in finance, healthcare, college admissions, and the US justice system. Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups. We also demonstrate that pairing explainability methods with recent advances in robust machine learning can significantly improve explanation fairness in some settings. However, we highlight the importance of communicating details of non-zero fidelity gaps to users, since a single solution might not exist across settings. Finally, we discuss the implications of unfair explanation models as a deep-rooted, challenging, and understudied problem facing the machine learning community.
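
To make the notion of a fidelity gap concrete, here is a minimal sketch. It assumes fidelity is measured as the rate at which the explanation model's predictions agree with the blackbox's, and that the gap is the difference between the best-served and worst-served group; both choices, the function names, and the toy labels are illustrative assumptions, not the authors' exact metrics or data.

```python
from collections import defaultdict


def fidelity_by_group(blackbox_preds, explainer_preds, groups):
    """Per group, the fraction of points where the explainer matches the blackbox."""
    agree = defaultdict(int)
    total = defaultdict(int)
    for b, e, g in zip(blackbox_preds, explainer_preds, groups):
        total[g] += 1
        agree[g] += int(b == e)
    return {g: agree[g] / total[g] for g in total}


def fidelity_gap(fidelity):
    """Difference between the highest and lowest per-group fidelity."""
    values = list(fidelity.values())
    return max(values) - min(values)


# Made-up predictions for eight points in two hypothetical subgroups.
blackbox = [1, 0, 1, 1, 0, 1, 0, 0]
explainer = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = fidelity_by_group(blackbox, explainer, groups)
gap = fidelity_gap(per_group)
print(per_group, gap)
```

In this toy data the explainer reproduces the blackbox perfectly for group "a" but only half the time for group "b", which is the kind of non-zero gap the abstract argues should be communicated to users.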


The Spotlight: A General Method for Discovering Systematic Errors in Deep Learning Models

Greg d'Eon, Jason d'Eon, James R. Wright and Kevin Leyton-Brown

Supervised learning models often make systematic errors on rare subsets of the data. When these subsets correspond to explicit labels in the data (e.g., gender, race) such poor performance can be identified straightforwardly. This paper introduces a method for discovering systematic errors that do not correspond to such explicitly labelled subgroups. The key idea is that similar inputs tend to have similar representations in the final hidden layer of a neural network. We leverage this structure by "shining a spotlight" on this representation space to find contiguous regions where the model performs poorly. We show that the spotlight surfaces semantically meaningful areas of weakness in a wide variety of existing models spanning computer vision, NLP, and recommender systems.


The Values Encoded in Machine Learning Research

Co-Winner: Distinguished Paper Award

Abeba Birhane, Pratyusha Kalluri, Dallas Card, William Agnew, Ravit Dotan and Michelle Bao

Machine learning (ML) currently exerts an outsized influence on the world, increasingly affecting communities and institutional practices. It is therefore critical that we question vague conceptions of the field as value-neutral or universally beneficial, and investigate what specific values the field is advancing. In this paper, we present a rigorous examination of the values of the field by quantitatively and qualitatively analyzing 100 highly cited ML papers published at premier ML conferences, ICML and NeurIPS. We annotate key features of papers which reveal their values: how they justify their choice of project, which aspects they uplift, their consideration of potential negative consequences, and their institutional affiliations and funding sources. We find that societal needs are typically very loosely connected to the choice of project, if mentioned at all, and that consideration of negative consequences is extremely rare. We identify 67 values that are uplifted in machine learning research, and, of these, we find that papers most frequently justify and assess themselves based on performance, generalization, efficiency, researcher understanding, novelty, and building on previous work. We present extensive textual evidence and analysis of how these values are operationalized. Notably, we find that each of these top values is currently being defined and applied with assumptions and implications generally supporting the centralization of power. Finally, we find increasingly close ties between these highly cited papers and tech companies and elite universities.


Theories of “Gender” in NLP Bias Research

Hannah Devinney, Jenny Björklund and Henrik Björklund

The rise of concern around Natural Language Processing (NLP) technologies containing and perpetuating social biases has led to a rich and rapidly growing area of research. Gender bias is one of the central biases being analyzed; but to date there is no comprehensive analysis of how “gender” is theorized in the field. We survey nearly 200 articles concerning gender bias in NLP to discover how the field conceptualizes gender both explicitly (e.g. through definitions of terms) and implicitly (e.g. through how gender is operationalized in practice). In order to get a better idea of emerging trajectories of thought, we split these articles into two sections by time. We find that the majority of the articles do not make their theorization of gender explicit, even if they clearly define “bias.” Almost none use a model of gender that is intersectional or inclusive of non-binary genders; and many conflate sex characteristics, social gender, and linguistic gender in ways that disregard the existence and experience of trans, nonbinary, and intersex people. Despite an increase in statements acknowledging that gender is a complicated reality, very few works manage to put this acknowledgment into practice. In addition to analyzing these findings, we provide specific recommendations to facilitate interdisciplinary work, and to incorporate theory and methodology from Gender Studies. Our hope is that this will produce more inclusive gender bias research in NLP.


Towards a multi-stakeholder value-based assessment framework for algorithmic systems

Mireia Yurrita, Dave Murray-Rust, Agathe Balayn and Alessandro Bozzon

In an effort to regulate Machine Learning-driven (ML) systems, current auditing processes mostly focus on detecting harmful algorithmic biases. While these strategies have proven to be impactful, some values outlined in documents dealing with ethics in ML-driven systems are still underrepresented in auditing processes. Such unaddressed values mainly deal with contextual factors that cannot be easily quantified. In this paper, we develop a value-based assessment framework that is not limited to bias auditing and that covers prominent ethical principles for algorithmic systems. Our framework presents a circular arrangement of values with two bipolar dimensions that make common motivations and potential tensions explicit. In order to operationalize these high-level principles, values are then broken down into specific criteria and their manifestations. However, some of these value-specific criteria are mutually exclusive and require negotiation. As opposed to some other auditing frameworks that merely rely on ML researchers' and practitioners' input, we argue that it is necessary to include stakeholders that present diverse standpoints to systematically negotiate and consolidate value and criteria tensions. To that end, we map stakeholders with different insight needs, and assign tailored means for communicating value manifestations to them. We, therefore, contribute to current ML auditing practices with an assessment framework that visualizes closeness and tensions between values and we give guidelines on how to operationalize them, while opening up the evaluation and deliberation process to a wide range of stakeholders.


Designing for Responsible Trust in AI Systems: A Communication Perspective

Q. Vera Liao and S. Shyam Sundar

Current literature and public discourse on "trust in AI" are often fixated on what constitutes trustworthy AI in principle. With individual AI systems differing in their level of trustworthiness, two open questions come to the fore: how should system trustworthiness be responsibly communicated to ensure appropriate and equitable trust judgments by different users, and how can we prevent deceptive trust? We draw from communication theories and literature on trust in technologies to develop a conceptual model called MATCH, which describes how trustworthiness is communicated in AI systems through trustworthiness cues and how the cues are processed by users to make trust judgments. Besides AI-generated content, we highlight transparency and interaction as AI systems' affordances for a variety of trustworthiness cues. By bringing to light the plurality of users' cognitive processes to make trust judgments and their potential limitations, we urge technology creators to make conscious decisions in choosing reliable trustworthiness cues for target users and, as an industry, to regulate this space and prevent malicious use. Towards these goals, we define the concepts of warranted trustworthiness cues and expensive trustworthiness cues, and propose a checklist of requirements to help technology creators identify appropriate cues to use. We present a hypothetical use case to illustrate how practitioners can use MATCH to design AI systems responsibly, and discuss future directions for research and industry efforts aimed at establishing responsible trust in AI.


Towards Fair Unsupervised Learning

Francois Buet-Golfouse and Islam Utyagulov

Bias-mitigating techniques are now well established in the supervised learning literature and have shown their ability to tackle fairness-accuracy, as well as fairness-fairness trade-offs. These are usually predicated on different conceptions of fairness, such as demographic parity or equal odds that depend on the available labels in the dataset. However, it is often the case in practice that unsupervised learning is used as part of a machine learning pipeline (for instance, to perform dimensionality reduction or representation learning via SVD) or as a standalone model (for example, to derive a customer segmentation via k-Means). It is thus crucial to develop approaches towards fair unsupervised learning. This work investigates fair unsupervised learning within the broad framework of generalised low-rank models ("GLRM"), which includes many applications such as PCA, k-Means, matrix factorisation. We embed GLRMs in a simple supervised learning framework and introduce the concept of fairness functional that encompasses both traditional unsupervised learning techniques and min-max algorithms (whereby one minimises the maximum group loss). Finally, we show on benchmark datasets that our fair generalised low-rank models ("fGLRM") perform well and help reduce disparity amongst groups while only incurring small runtime overheads.
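
The min-max idea mentioned above (minimise the maximum group loss) can be sketched in a few lines. This is only an illustration of the selection criterion, not the fGLRM method: the two candidate models, their per-sample losses, and the group labels are invented for the example.

```python
from collections import defaultdict


def group_mean_losses(losses, groups):
    """Mean loss within each group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for loss, g in zip(losses, groups):
        sums[g] += loss
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}


def average_loss(losses):
    """Ordinary objective: mean loss over all samples."""
    return sum(losses) / len(losses)


def worst_group_loss(losses, groups):
    """Min-max objective: the largest of the per-group mean losses."""
    return max(group_mean_losses(losses, groups).values())


# Hypothetical per-sample losses of two candidate models on four samples.
groups = ["a", "a", "a", "b"]
candidates = {
    "plain": [0.1, 0.1, 0.1, 0.9],   # low average, poor on minority group "b"
    "minmax": [0.4, 0.4, 0.4, 0.5],  # higher average, balanced across groups
}

best_on_average = min(candidates, key=lambda n: average_loss(candidates[n]))
best_for_worst_group = min(
    candidates, key=lambda n: worst_group_loss(candidates[n], groups)
)
print(best_on_average, best_for_worst_group)
```

The two criteria pick different candidates here, which shows why the choice of objective matters when one group is small.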


Towards Intersectional Feminist and Participatory ML: A Case Study in Supporting Feminicide Counterdata Collection

Co-Winner: Distinguished Student Paper Award

Harini Suresh, Rajiv Movva, Amelia Lee Dogan, Rahul Bhargava, Isadora Cruxen, Angeles Martinez Cuba, Guilia Taurino, Wonyoung So and Catherine D'Ignazio

Data ethics and fairness have emerged as important areas of research in recent years. However, much of the work in this area focuses on retroactively auditing and “mitigating bias” in existing, potentially flawed systems, without interrogating the deeper structural inequalities underlying them. There are not yet examples of how to apply feminist and participatory methodologies from the start, to conceptualize and design machine learning-based tools that center and aim to challenge power inequalities. Our work targets this more prospective goal. Guided by the framework of Data Feminism, we co-design datasets and machine learning models to support the efforts of activists who collect and monitor data about feminicide – gender-based killings of women and girls. We describe how intersectional feminist goals and participatory processes shaped each stage of our approach, from problem conceptualization to data collection to model evaluation. We highlight several methodological contributions, including 1) an iterative data collection and annotation process that targets model weaknesses and interrogates framing concepts (such as who is included/excluded in "feminicide"), 2) models that explicitly focus on intersectional identities, rather than statistical majorities, and 3) a multi-step evaluation process (with quantitative, qualitative and participatory steps) focused on context-specific relevance. We also distill more general insights and tensions that arise from bridging intersectional feminist goals with ML. These include reflections on how ML may challenge power, embrace pluralism, rethink binaries and consider context, as well as the inherent limitations of any technology-based solution to address durable structural inequalities.


Towards Intersectionality in Machine Learning: Including More Identities, Handling Underrepresentation, and Performing Evaluation

Angelina Wang, Vikram Ramaswamy and Olga Russakovsky

Research in machine learning fairness has historically considered a single binary demographic attribute; however, the reality is of course far more complicated. In this work, we grapple with three questions that arise when incorporating intersectionality (in the form of multiple demographic attributes) into machine learning: (1) which demographic attributes and groups to include, (2) how to handle the progressively smaller size of subgroups, and (3) how to move beyond existing evaluation metrics when benchmarking model fairness for more subgroups. For each question, we provide thorough empirical evaluation on tabular datasets derived from the US Census, and present constructive recommendations for the machine learning community. Concretely, we advocate for supplementing domain knowledge with empirical validation when choosing which identity labels to train on, but always evaluating on the full set of identities; warn against using data imbalance techniques without considering their normative implications and suggest an alternative; and introduce new evaluation metrics which are more appropriate for the intersectional setting. Overall, we provide substantive suggestions on three necessary but not sufficient considerations when incorporating intersectionality into machine learning.


Trade-offs between Group Fairness Metrics in Societal Resource Allocation

Tasfia Mashiat, Xavier Gitiaux, Huzefa Rangwala, Patrick Fowler and Sanmay Das

We consider social resource allocations that deliver an array of scarce supports to a diverse population. Such allocations pervade social service delivery, such as provision of homeless services, assignment of refugees to cities, among others. At issue is whether allocations are fair across sociodemographic groups and intersectional identities. Our paper shows that necessary trade-offs exist for fairness in the context of scarcity; many reasonable definitions of equitable outcomes cannot hold simultaneously except under stringent conditions. For example, defining fairness in terms of improvement over a baseline inherently conflicts with defining fairness in terms of loss compared with the best possible outcome. Moreover, we demonstrate that the fairness trade-offs stem from heterogeneity across groups in intervention responses. Administrative records on homeless service delivery offer a real-world example. Building on prior work, we measure utilities for each household as the probability of reentry into homeless services if given three homeless services. Heterogeneity in utility distributions (conditional on received services) for several sociodemographic groups (e.g. single women with children versus without children) generates divergence across fairness metrics. We argue that such heterogeneity, and thus, fairness trade-offs pervade many social policy contexts.
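
The conflict between the two fairness definitions named above can be seen from simple arithmetic: for any group, improvement over baseline plus shortfall from the best possible outcome equals (best minus baseline), so both quantities can be equal across groups only if that range is the same for every group. The sketch below uses invented integer utility scores for two hypothetical groups; it is not drawn from the paper's administrative data.

```python
def improvement(utility, baseline):
    """Gain relative to what the group would get with no intervention."""
    return utility - baseline


def shortfall(utility, best):
    """Loss relative to the best outcome the group could have received."""
    return best - utility


# Hypothetical utilities on a 0-100 scale.
groups = {
    "A": {"baseline": 20, "best": 90},
    "B": {"baseline": 50, "best": 60},
}
# An allocation chosen so that both groups improve by the same amount.
allocation = {"A": 30, "B": 60}

improvements = {
    g: improvement(allocation[g], groups[g]["baseline"]) for g in groups
}
shortfalls = {g: shortfall(allocation[g], groups[g]["best"]) for g in groups}
print(improvements, shortfalls)
```

Equalising improvement leaves group A far from its best outcome while group B reaches its own, so the allocation looks fair under one definition and unfair under the other.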


Treatment Effect Risk: Bounds and Inference

Nathan Kallus

Since the average treatment effect (ATE) measures the change in social welfare, even if positive, there is a risk of negative effect on, say, some 10% of the population. Assessing such risk is difficult, however, because any one individual treatment effect (ITE) is never observed so the 10% worst-affected cannot be identified, while distributional treatment effects only compare the first deciles within each treatment group, which does not correspond to any 10%-subpopulation. In this paper we consider how to nonetheless assess this important risk measure, formalized as the conditional value at risk (CVaR) of the ITE distribution. We leverage the availability of pre-treatment covariates and characterize the tightest-possible upper and lower bounds on ITE-CVaR given by the covariate-conditional average treatment effect (CATE) function. Some bounds can also be interpreted as summarizing a complex CATE function into a single metric and are of interest independently of being a bound. We then proceed to study how to estimate these bounds efficiently from data and construct confidence intervals. This is challenging even in randomized experiments as it requires understanding the distribution of the unknown CATE function, which can be very complex if we use rich covariates so as to best control for heterogeneity. We develop a debiasing method that overcomes this and prove it enjoys favorable statistical properties even when CATE and other nuisances are estimated by black-box machine learning or even inconsistently. Studying a hypothetical change to French job-search counseling services, our bounds and inference demonstrate a small social benefit entails a negative impact on a substantial subpopulation.
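
As a reminder of what the risk measure itself computes, here is a minimal sketch of lower-tail CVaR on a finite sample: the mean of the worst alpha-fraction of values. The effect values are hypothetical and are treated as if they were observable, which individual treatment effects are not; the sketch therefore illustrates only the definition, not the paper's bounds or debiased estimator.

```python
import math


def lower_tail_cvar(values, alpha):
    """Mean of the worst (smallest) alpha-fraction of an equally weighted sample.

    When alpha * len(values) is not a whole number, the count is rounded up,
    which is a simplification of the exact definition.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(values)
    # Small tolerance so that, e.g., 0.2 * 10 is treated as exactly 2.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-12))
    worst = ordered[:k]
    return sum(worst) / k


# Hypothetical effects for ten individuals (positive means benefit).
effects = [-4.0, -1.0, 0.5, 1.0, 1.0, 1.5, 2.0, 2.0, 2.5, 3.5]

average_effect = sum(effects) / len(effects)
cvar_20 = lower_tail_cvar(effects, 0.2)
print(average_effect, cvar_20)
```

The average effect in this toy sample is positive while the average over the worst-off 20% is negative, which is the situation the abstract describes.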


Trucks Don’t Mean Trump: Diagnosing Human Error in Image Analysis

J.D. Zamfirescu-Pereira, Jerry Chen, Emily Wen, Allison Koenecke, Nikhil Garg and Emma Pierson

Algorithms provide powerful tools for detecting and dissecting human bias and error. Here, we develop machine learning methods to analyze how humans err in a particular high-stakes task: image interpretation. We leverage a unique dataset of 16,135,392 human predictions of whether a neighborhood voted for Donald Trump or Joe Biden in the 2020 US election, based on a Google Street View image. We show that by training a machine learning estimator of the Bayes optimal decision for each image, we can provide an actionable decomposition of human error into bias, variance, and noise terms, and further identify specific features (like pickup trucks) which lead humans astray. Our methods can be applied to ensure that human-in-the-loop decision-making is accurate and fair and are also applicable to black-box algorithmic systems.


Uncertainty and the Social Planner’s Problem: Why Sample Complexity Matters

Cyrus Cousins

Welfare measures overall positive utility across a population, whereas malfare measures overall disutility, and the social planner’s problem can be cast either as maximizing the former or minimizing the latter. We show novel bounds on expectations and tails of estimates of welfare, malfare and regret of per-group risk or utility values, where the estimate is based on a finite sample drawn from each group. In particular, we consider estimating these values for individual functions (e.g., allocations or classifiers) with standard concentration of measure bounds, and over families of functions (i.e., we quantify overfitting) using Rademacher averages. We then examine the technically dense study of sample complexity through an algorithmic fairness lens, finding that because marginalized or minority groups are often understudied, and less data are therefore available, the social planner is more likely to overfit to these groups, thus even models that seem fair in training can be systematically biased against such groups. We argue that this effect can be mitigated by ensuring sufficient sample sizes for each group, and our sample complexity analysis characterizes these sample sizes. Motivated by these conclusions, we present progressive sampling algorithms to efficiently use data to optimize various fairness objectives.


Understanding and being understood: user strategies for identifying and recovering from mistranslations in machine translation-mediated chat

Samantha Robertson and Mark Díaz

Machine translation (MT) is now widely and freely available, and has the potential to greatly improve cross-lingual communication. However, it can be difficult for users to detect and recover from mistranslations due to limited language skills. In order to use MT reliably and safely, end users must be able to assess the quality of system outputs and determine how much they can rely on them to guide their decisions and actions. In this work we collected 19 MT-mediated role-play conversations in housing and employment scenarios, and conducted in-depth interviews to understand how users identify and recover from translation errors. Participants communicated using four language pairs: English, and one of Spanish, Farsi, Igbo, or Tagalog. We conducted qualitative analysis to understand user challenges in light of limited system transparency, strategies for recovery, and the kinds of translation errors that proved more or less difficult for users to overcome. We found that users broadly lacked relevant and helpful information to guide their assessments of translation quality. Instances where a user erroneously thought they had understood a translation correctly were rare but held the potential for serious consequences in the real world. Finally, inaccurate and disfluent translations had social consequences for participants, because it was difficult to discern when a disfluent message was reflective of the other person’s intentions, or an artifact of imperfect MT. We draw on theories of grounding and repair in communication to contextualize these findings, and propose design implications for explainable AI (XAI) researchers, MT researchers, as well as collaboration among them to support transparency and explainability in MT. 
These directions include handling typos and non-standard grammar common in interpersonal communication, making MT in interfaces more visible to help users evaluate errors, supporting collaborative repair of conversation breakdowns, and communicating model strengths and weaknesses to users.


Understanding Lay Users' Needs of Counterfactual Explanations for Everyday Recommendations

Ruoxi Shang, K. J. Kevin Feng and Chirag Shah

Intelligent everyday applications typically rely on automated Recommender Systems (RS) to generate recommendations that help users make decisions among a large number of options. Due to the increasing complexity of RS and the lack of transparency in its algorithmic decision-making, researchers have recognized the need to support users with more explanations. While traditional explainability methods fall short in disclosing the internal intricacy of recommender systems, counterfactual explanations provide many desirable explainable features by offering human-like explanations that contrast an existing recommendation with alternatives. However, there is a lack of empirical research in understanding lay users’ needs of counterfactual explanations in their usage of everyday intelligent applications. In this paper, we investigate whether and where we should provide counterfactual explanations in everyday recommender systems through a question-driven approach. We first conducted an interview study to understand how existing explanations might be insufficient to support lay users and to elicit the triggers that prompt them to ask "why not" questions and seek additional explanations. The findings reveal that the utility of a decision is a primary factor that may affect whether users want to consume counterfactual explanations. We then conducted an online scenario-based survey to quantify the correlation between utility and explanation needs and found significant correlations between the measured variables.


What Does it Mean for a Language Model to Preserve Privacy?

Hannah Brown, Katherine Lee, Fatemehsadat Mireshghallah, Reza Shokri and Florian Tramèr

Natural language reflects our private lives and identities, making its privacy concerns as broad as those of real life. Language models lack the ability to understand the context and sensitivity of text, and tend to memorize phrases present in their training sets. An adversary can exploit this tendency to extract training data. Depending on the nature of the content and the context in which this data was collected, this could violate expectations of privacy. Thus, there is a growing interest in techniques for training language models that preserve privacy. In this paper, we discuss the mismatch between the narrow assumptions made by popular data protection techniques (data sanitization and differential privacy), and the broadness of natural language and of privacy as a social norm. We argue that existing protection methods cannot guarantee a generic and meaningful notion of privacy for language models. We conclude that language models should be trained on text data which was explicitly produced for public use.


What is Proxy Discrimination?

Michael Carl Tschantz

The near universal condemnation of proxy discrimination hides a disagreement over what it is. This work surveys various notions of proxy and proxy discrimination found in prior work and represents them in a common framework. These notions variously turn on statistical dependencies, causal effects, and intentions. We discuss the limitations and uses of each notion and of the concept as a whole.


What is the Bureaucratic Counterfactual? Categorical versus Algorithmic Prioritization in U.S. Social Policy

Rebecca Johnson and Simone Zhang

There is growing concern about governments’ use of algorithms to make high-stakes decisions. While an early wave of research focused on algorithms that predict risk to allocate punishment and suspicion, a newer wave of research studies algorithms that predict “need” or “benefit” to target beneficial resources, such as ranking those experiencing homelessness by their need for housing. The present paper argues that existing research on the role of algorithms in social policy could benefit from a counterfactual perspective that asks: given that a social service bureaucracy needs to make some decision about whom to help, what status quo prioritization method would algorithms replace? While a large body of research contrasts human versus algorithmic decision-making, social service bureaucracies target help not by giving street-level bureaucrats full discretion. Instead, they primarily target help through pre-algorithmic, rule-based methods. In this paper, we outline social policy’s current status quo method—categorical prioritization—where decision-makers manually (1) decide which attributes of help seekers should give those help seekers priority, (2) simplify any continuous measures of need into categories (e.g., household income falls below a threshold), and (3) manually choose the decision rules for the way those categories combine to create priority. We draw on novel data and quantitative and qualitative social science methods to outline categorical prioritization in two case studies of U.S. social policy: waitlists for scarce housing vouchers and K-12 school finance formulas. We outline three main differences between categorical and algorithmic prioritization: is the basis for prioritization formalized; what role does power play in prioritization; and are decision rules for priority manually chosen or inductively derived from the model. 
Concluding, we show how the counterfactual perspective underscores both the understudied costs of categorical prioritization in social policy and the understudied potential of predictive algorithms to narrow inequalities.
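
The three manual steps of categorical prioritization described above can be sketched as follows. Everything specific here is hypothetical and chosen only for illustration: the two attributes, the income threshold, and the tiering rule are not taken from the paper's housing-voucher or school-finance case studies.

```python
from dataclasses import dataclass


# Step 1: decision-makers choose which attributes confer priority.
@dataclass
class Applicant:
    name: str
    household_income: float
    has_children: bool


# Step 2: a continuous measure of need is simplified into a category.
INCOME_THRESHOLD = 20_000.0  # hypothetical cutoff


def is_low_income(applicant):
    return applicant.household_income < INCOME_THRESHOLD


# Step 3: a hand-written rule combines the categories into a priority tier
# (1 is served first). Nothing here is learned from data.
def priority_tier(applicant):
    low_income = is_low_income(applicant)
    if low_income and applicant.has_children:
        return 1
    if low_income or applicant.has_children:
        return 2
    return 3


applicants = [
    Applicant("A", 30_000.0, False),
    Applicant("B", 15_000.0, True),
    Applicant("C", 18_000.0, False),
    Applicant("D", 25_000.0, True),
]

# sorted() is stable, so applicants within a tier keep their arrival order.
waitlist = [a.name for a in sorted(applicants, key=priority_tier)]
print(waitlist)
```

Note how applicants C and D land in the same tier despite different incomes: collapsing a continuous measure into categories discards information that a predictive model could in principle use.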


What People Think AI Should Infer From Faces

Severin Engelmann, Chiara Ullstein, Orestis Papakyriakopoulos and Jens Grossklags

Faces play an indispensable role in human social life. At present, computer vision artificial intelligence (AI) captures and interprets human faces for a variety of digital applications and services. The ambiguity of facial information has recently led to a debate among scholars in different fields about the types of inferences AI should make about people based on their facial looks. AI research often justifies facial AI inference-making by referring to how people form impressions in first-encounter scenarios. Critics raise concerns about bias and discrimination and warn that facial analysis AI resembles an automated version of physiognomy. What has been missing from this debate, however, is an understanding of how "non-experts" in AI ethically evaluate facial AI inference-making. In a two-scenario vignette study with 24 treatment groups (N = 3745), we show that non-experts reject facial AI inferences such as trustworthiness and likability in a low-stake advertising and a high-stake hiring context. In contrast, non-experts agree with facial AI inferences such as skin color or gender in the advertising but not the hiring decision context. For each AI inference, we ask non-experts to justify their evaluation in a written response. Analyzing 29,760 written justifications, we find that non-experts are either "evidentialists" or "pragmatists": they assess the ethical status of a facial AI inference based on whether they think faces warrant sufficient or insufficient evidence for an inference (evidentialist justification) or whether making the inference results in beneficial or detrimental outcomes (pragmatist justification). Non-experts' justifications underscore the normative complexity behind facial AI inference-making. AI inferences with insufficient evidence can be rationalized by considerations of relevance while irrelevant inferences can be justified by reference to sufficient evidence.
We argue that participatory approaches contribute valuable insights for the development of ethical AI in an increasingly visual data culture.


When learning becomes impossible

Nicholas Asher and Julie Hunter

We formally analyze an epistemic bias we call interpretive blindness (IB), in which under certain conditions a learner will be incapable of learning. IB is now common in our society, but it is a natural consequence of Bayesian inference and what we argue are mild assumptions about the relation between belief and evidence. IB is a special problem for learning from testimony, in which one acquires information only from text or conversation. We show that IB follows from a co-dependence between background beliefs and interpretation in a Bayesian setting and the nature of contemporary testimony. We argue that a particular characteristic of contemporary testimony, argumentative completeness, can preclude learning in hierarchical Bayesian settings, even in the presence of constraints that are designed to promote good epistemic practices.


Who Audits the Auditors? Recommendations from a field scan of the algorithmic auditing ecosystem

Sasha Costanza-Chock, Inioluwa Deborah Raji and Joy Buolamwini

Algorithmic audits (or ‘AI audits’) are an increasingly popular mechanism for algorithmic accountability; however, they remain poorly defined. Without a clear understanding of audit practices, let alone widely used standards or regulatory guidance, claims that an AI product or system has been audited, whether by first-, second-, or third-party auditors, are difficult to verify and may potentially exacerbate, rather than mitigate, bias and harm. To address this knowledge gap, we provide the first comprehensive field scan of the AI audit ecosystem. We share a catalog of individuals (N=438) and organizations (N=189) who engage in algorithmic audits or whose work is directly relevant to algorithmic audits; conduct an anonymous survey of the group (N=152); and interview industry leaders (N=10). We identify emerging best practices as well as methods and tools that are becoming commonplace, and enumerate common barriers to leveraging algorithmic audits as effective accountability mechanisms. We outline policy recommendations to improve the quality and impact of these audits, and highlight proposals with wide support from algorithmic auditors as well as areas of debate. Our recommendations have implications for lawmakers, regulators, internal company policymakers, and standards-setting bodies, as well as for auditors. They are: 1) require the owners and operators of AI systems to engage in independent algorithmic audits against clearly defined standards; 2) notify individuals when they are subject to algorithmic decision-making systems; 3) mandate disclosure of key components of audit findings for peer review; 4) consider real-world harm in the audit process, including through standardized harm incident reporting and response mechanisms; 5) directly involve the stakeholders most likely to be harmed by AI systems in the algorithmic audit process; and 6) formalize evaluation and, potentially, accreditation of algorithmic auditors.


Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging

Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Paul Fisher, Diane Wilson, Michael Fitzke, Mark Parkinson, Eric Horvitz, Kori Inkpen and Besmira Nushi

Designs and mechanisms in support of human-AI collaboration are important considerations in the real-world fielding of AI technologies. A critical aspect of interaction design for AI-assisted human decision-making is the sequencing of human and AI inferences. We have a poor understanding of the influence of displaying AI inferences before or after human review and the impact of that order on the deliberation in a diagnostic task. In this work, we explore the effects of providing AI assistance before versus after the human has made a provisional decision. We conduct a user study where 19 veterinary radiologists identify radiographic findings present in patients' X-ray images, with the aid of an AI tool. We employ two workflow configurations to analyze (i) anchoring effects, (ii) human-AI team diagnostic performance and (dis)agreement, (iii) time spent and confidence in decision making, and (iv) perceived usefulness of the AI. We find that participants who are asked to register provisional responses in advance of reviewing AI inferences are less likely to agree with the AI regardless of whether the advice is accurate and, in instances of disagreement with the AI, they are less likely to seek the second opinion of a colleague. These participants also report the AI advice to be less useful. Surprisingly, requiring provisional decisions did not lengthen the time participants spent on the task. The study provides generalizable and actionable insights for the deployment of clinical AI tools in human-in-the-loop systems. The experimental platform is available as open source to facilitate future research in human-AI decision making.