Presented by Marta Maslej (CAMH), Laura Sikstrom (CAMH), Darla Reslan (Toronto), and Yifan Wang (McMaster)
The ability of machine learning (ML) to identify patterns in large and complex datasets holds tremendous promise for solving some of the most complex and intractable problems in health care. However, emergent tools are plagued by ongoing questions about a range of known gendered and racialized biases that get “baked in” to these datasets, which can bias model performance for certain groups. Upon deployment, these tools can “amplify harms” for marginalized populations, particularly those defined by intersections of features (e.g., gender, race, class). In this tutorial, we demonstrate the value of applying an interdisciplinary approach to analyzing intersectional biases, towards ensuring that these innovative tools are implemented safely and ethically. The first half of the tutorial introduces attendees to the theoretical underpinnings of Critical Race Theory (CRT). This approach emphasizes that any attempt to measure health inequities must consider the interlocking nature of co-occurring social categories (e.g., race and gender) or risk over- and/or under-estimating systemic harms. We outline the rationale for this intersectional approach to fairness assessments of datasets used to train ML models, interpreting potential biases, and evaluating their impact on health equity. The second half of the tutorial includes an interactive practice component, focusing on quantitative methods used to evaluate model bias, based on a hypothetical case scenario from psychiatry. We show how an uninformed application of ML can amplify existing harms, by conducting an intersectional analysis of an ML model trained on simulated data. We demonstrate its value for the case scenario and discuss its implications. We conclude by reiterating the value of interdisciplinary approaches to intersectional analysis.
We highlight the value of pairing quantitative evaluations with qualitative research into the social and political complexities contributing to model bias, as well as its downstream potential for exacerbating health inequities.
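As a minimal sketch of the kind of intersectional evaluation this tutorial describes (all group labels, counts, and error rates below are hypothetical, not drawn from the tutorial's simulated data), the example computes false negative rates first marginally and then at the intersection of two attributes, showing how marginal metrics can mask subgroup disparities:

```python
from collections import defaultdict

# Simulated positive-class records: (gender, race, true_label, predicted_label).
# The counts are constructed so that error rates look equal within each single
# attribute but diverge at the intersections (all values hypothetical).
records = []
for gender, race, misses in [("M", "A", 1), ("M", "B", 2),
                             ("F", "A", 2), ("F", "B", 1)]:
    for i in range(5):  # 5 true positives per subgroup
        records.append((gender, race, 1, 0 if i < misses else 1))

def fnr(rows):
    """False negative rate: share of true positives the model misses."""
    positives = [r for r in rows if r[2] == 1]
    return sum(1 for r in positives if r[3] == 0) / len(positives)

def fnr_by(rows, key):
    """FNR per group, where `key` maps a record to its group."""
    groups = defaultdict(list)
    for r in rows:
        groups[key(r)].append(r)
    return {g: fnr(rs) for g, rs in groups.items()}

marginal_gender = fnr_by(records, lambda r: r[0])
marginal_race = fnr_by(records, lambda r: r[1])
intersectional = fnr_by(records, lambda r: (r[0], r[1]))

print(marginal_gender)   # equal across genders
print(marginal_race)     # equal across races
print(intersectional)    # disparities appear only at the intersections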
Presented by Daniel Angus and Abdul Obied (Queensland University of Technology)
This tutorial examines the use of citizen data donation as a methodological approach for the study of proprietary automated decision-making systems. The tutorial examines in detail the design, deployment and ongoing operation of two browser-based plugins designed to gather data on targeted advertising from Facebook, and search results and recommendations from Google Search, Google News, Google Video, and YouTube. The plugins and associated data analysis infrastructure have been developed through the Tools and Methods for Public Oversight workstream of the multi-institutional Australian Research Council Centre of Excellence for Automated Decision Making & Society (ADM+S). The tutorial will cover the motivations for this form of independent transparency, the design and development of browser plugins, recruitment of participants, ethics of data collection, analysis and statistical considerations, and the responses from platforms to these new forms of external scrutiny.
Presented by Robert Gorwa (WZB Berlin Social Science Centre)
The goal of this translation tutorial is to provide an accessible overview of the burgeoning interdisciplinary field of ‘platform governance’ for the FAccT community [1-9]. The tutorial will offer a survey of the major themes in this scholarship, including its conceptual foundations across disciplines and its development in the latest empirical research across fields like human-computer interaction, legal studies, and ‘platform studies’, as well as a practical outline of the emerging informal and formal policy landscape taking shape in an interplay of public- and private-sector regulation. The tutorial will feature a discussion and assessment of high-stakes, already-deployed implementations of automated decision-making systems by user-generated content services like Facebook and YouTube to tackle issue areas ranging from copyright to violent extremism (e.g., ContentID, the GIFCT hash-sharing database), grounding the discussion in the policy dynamics and incentives that continue to drive the adoption of these systems by platform companies.
Presented by Angelina Wang (Princeton), Seungbae Kim (UCLA), Olga Russakovsky (Princeton) and Jungseock Joo (UCLA)
Recent advances in computer vision and deep learning have created an unprecedented number of new applications in many areas such as IT, transportation, entertainment, medicine, and education. Despite its wide applicability and robust performance, recent studies have also shown potential risks and ethical concerns, such as skewed demographic representations, accuracy disparities, or spurious correlations, hidden in large-scale computer vision datasets, models, and products. Following Buolamwini and Gebru's work on uncovering demographic bias in commercial facial attribute classifiers, “Gender Shades”, numerous studies have investigated critical fairness issues in the field of computer vision to better understand the characteristics of data or model biases and to develop algorithmic solutions that help mitigate the problems. The goal of this tutorial is to review the emerging literature on computer vision fairness so that the FAccT community, and the AI and CS communities more broadly, can be informed of the latest technical developments as well as challenging research questions on the topic.
Presented by Divyansha Sehgal, Torsha Sarkar, and Divyank Katira (Centre for Internet and Society)
Global internet freedom seems to be in decline, with Asia leading the charge. Specifically, in South Asia, the last five years have shown various ways in which digital participation has been curtailed: intentional communication and network disruptions, sanctioning of dissent, disinformation-mediated violence, and harassment and incitement tactics against minorities. India has been no exception.
Conversations on freedom of speech in the Global North often revolve around platform neutrality and selective implementation of content moderation policies. While these remain important issues in South Asia as well, the nature of existing regulatory models in the region gives the executive branch an outsized ability to censor content it deems problematic. Additionally, this model disincentivizes online service providers (OSPs) from preserving user rights through threats of economic and legal sanctions. Accordingly, the efforts at accountability and transparency of socio-technical systems that emerge in response to this regulatory model look very different and must be a creative partnership between disciplines.
As the FAccT audience is already aware, algorithms are released into a social and legal context that affects their stated goals. With our tutorial, we hope to familiarize the community with aspects of the South Asian regulatory model that curb access to digital participation. Using India as a case study, we intend to discuss the various forms online censorship can take, which socio-technical affordances help or hinder progress, and what interdisciplinary measures for ensuring accountability look like.
Presented by Emily Hadley, Rob Chew, Stephen Tueller, Megan Comfort, Matthew DeMichele and Pamela Lattimore (RTI International)
Pretrial risk assessment instruments (RAIs) are tools used in the criminal justice system to assist in pretrial release decisions following an arrest. As RAIs are increasingly utilized in the United States, a growing body of literature emphasizes the importance of evaluating these tools for potential bias by race and sex. While numerous studies have analyzed bias in RAIs using a variety of metrics, less research is available regarding the barriers that community-level stakeholders face when trying to calculate and utilize these metrics in practice. This practice tutorial outlines the development of an R Markdown template (Figure 1) that is used to evaluate bias on the basis of race or sex in historical criminal justice data for localities considering implementation of the Public Safety Assessment (PSA) developed by Arnold Ventures. We will discuss identification and selection of key metrics, development of the automated report to facilitate regular and repeated use of the tool, and, critically, additional steps taken for clearly communicating findings from the tool with non-technical stakeholders. This work extends previous research on which metrics are useful when evaluating risk assessments to how the metrics are used for real-world decisions, including the importance of transparency and user-oriented documentation. We discuss how we are actively using this template in partnership with stakeholders in multiple US jurisdictions to make decisions about implementing the PSA.
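As an illustration of the kind of group-wise metrics such a template might report (the confusion-matrix counts and group names below are hypothetical, not PSA data from any jurisdiction), one can compute error rates per race or sex group from historical outcomes:

```python
# Hypothetical confusion-matrix counts per group from historical data,
# where "positive" means flagged high risk and "actual" means rearrest.
# These numbers are illustrative only, not drawn from any real locality.
counts_by_group = {
    "group_1": {"tp": 30, "fp": 20, "fn": 10, "tn": 40},
    "group_2": {"tp": 25, "fp": 35, "fn": 15, "tn": 25},
}

def error_rates(tp, fp, fn, tn):
    return {
        # share of people not rearrested who were flagged high risk
        "fpr": fp / (fp + tn),
        # share of people rearrested who were flagged low risk
        "fnr": fn / (fn + tp),
    }

report = {g: error_rates(**c) for g, c in counts_by_group.items()}
for group, rates in report.items():
    print(f"{group}: FPR={rates['fpr']:.2f}, FNR={rates['fnr']:.2f}")
```

A template like the one described would wrap metrics of this kind in automated, readable reporting for non-technical stakeholders rather than expose raw code.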
Presented by Margarita Boyarskaya (NYU), Solon Barocas (Microsoft Research/Cornell), Hanna Wallach (Microsoft Research), and Michael Carl Tschantz (ICSI)
Warnings about so-called "proxy variables" have become ubiquitous in recent policy debates about machine learning’s potential to enable decisions that disparately impact various demographic groups. Yet it is far from clear what makes something a proxy and why proxies pose problems. In most cases, commentators seem to worry that even when a legally proscribed variable such as race is not provided directly as an input into a machine learning model, discrimination may persist because non-proscribed variables are correlated with — that is, serve as a proxy for — the proscribed variable. How are we to decide whether a variable is serving as a proxy for race or as a legitimate predictor that happens to be correlated with race? This question cuts to the core of discrimination law, posing both practical and conceptual challenges for resolving whether any observed disparate impact is justified when a decision relies on variables that exhibit any correlation with membership in a demographic group. In this tutorial, we will guide participants through the process of reasoning about the potential threat of disparate impact posed by proxies. We will describe the various conditions that might lead commentators to attribute a disparate impact to the use of proxies, and we will overview a range of responses that have been proposed to address the proxy problem. We will demonstrate that disagreements about the normative permissibility of using a proxy are difficult to understand (and hence, to address) unless one specifies one's beliefs about why a proxy correlates with the proscribed variable. Our tutorial will demonstrate how to use causal graphs to make such beliefs explicit, enabling more nuanced and context-aware reasoning about proxy variable use.
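To make the proxy concern concrete, here is a minimal simulation (the variables, thresholds, and numbers are all hypothetical): a decision rule that never sees the proscribed attribute still produces disparate impact when one of its inputs correlates with that attribute:

```python
import random

random.seed(0)

# Hypothetical population: membership in a proscribed group shifts a
# ZIP-code-derived score (e.g., through residential patterns), even though
# the decision rule below never takes group membership as an input.
people = []
for _ in range(10_000):
    in_group = random.random() < 0.5
    zip_score = random.gauss(1.0 if in_group else 0.0, 1.0)
    people.append((in_group, zip_score))

def approve(zip_score):
    # Only the (proxy) score is used; group membership is never an input.
    return zip_score < 0.8

def approval_rate(rows):
    return sum(approve(z) for _, z in rows) / len(rows)

rate_in = approval_rate([p for p in people if p[0]])
rate_out = approval_rate([p for p in people if not p[0]])
print(f"approval rate, group members: {rate_in:.2f}")
print(f"approval rate, non-members:  {rate_out:.2f}")
```

Whether this disparity is an illegitimate proxy effect or a justified reliance on a legitimate predictor depends on why the score correlates with group membership, which is exactly the question the tutorial's causal graphs are meant to make explicit.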
Presented by Asia Biega (Max Planck Institute for Security and Privacy) and Michèle Finck (Tübingen)
Contemporary data-driven systems frequently process personal user data. As a result, they need to comply with data protection laws governing data processing for users from certain jurisdictions, such as the European Union's General Data Protection Regulation ('GDPR'). Purpose limitation and data minimization are two of the core GDPR principles. Unlike other principles, including fairness or transparency, they have not yet received as much attention from the FAccT community. As a result, their implementation poses a number of challenges and open research questions. This tutorial synthesizes the state-of-the-art knowledge about the two principles from across (i) the research literature in law and computer science, (ii) guidelines issued by data protection authorities, and (iii) relevant court rulings. We present recent advances in computational interpretations of the principles and highlight future interdisciplinary research opportunities.
Presented by Sanghyuk Chun (Naver Corp), Kyungwoo Song (University of Seoul), and Yonghan Jung (Purdue)
Recent advances in deep learning open a new era of practical AI applications. Despite these successes, numerous case studies have reported that even large-scale ML models with billions of parameters often perform poorly when testing environments change slightly; e.g., if a model has been trained to identify boats in the water and cars on the road, it may mistakenly identify a boat on the road as a “car”, a mistake that preschool children wouldn't make. This phenomenon, called “shortcut learning”, occurs because ML models tend to rely on features that are strongly correlated with outcomes even when they are not causally related. Shortcut learning is a realistic threat to fairness, accountability, and transparency because ML models may establish discriminating decision rules based on non-causally-related features such as gender or race.
In this tutorial, we will outline notions of shortcut learning and discuss methods for alleviating the issue. Specifically, we will illustrate shortcut learning to demonstrate the challenges it poses for modern deep learning models. Next, we will provide a formal notion of shortcut learning under the rubric of causality. Finally, we will extensively review state-of-the-art techniques for addressing shortcut learning risks. Our tutorial will alert experts in fairness, accountability, and transparency (FAccT) to the risk of shortcut learning and, we hope, inspire them to develop more fair, accountable, and transparent ML methods.
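The boat-on-the-road example can be reduced to a toy simulation (feature names and correlation levels are illustrative, not from the tutorial materials): a model that latched onto a spuriously correlated context feature looks accurate in training but collapses when the test environment changes, while a model using the causal feature is unaffected:

```python
import random

random.seed(42)

def make_data(n, context_match):
    """Each item: (object_feature, context_feature, label).
    The label is determined by the object (e.g., boat vs. car); the context
    (e.g., water vs. road) matches the label with probability `context_match`."""
    data = []
    for _ in range(n):
        label = random.randint(0, 1)
        context = label if random.random() < context_match else 1 - label
        data.append((label, context, label))
    return data

train = make_data(2000, 0.95)  # context is an almost-perfect shortcut
test = make_data(2000, 0.05)   # deployment: boats now appear on the road

def accuracy(predict, data):
    return sum(predict(obj, ctx) == y for obj, ctx, y in data) / len(data)

shortcut_model = lambda obj, ctx: ctx  # relies on the spurious feature
causal_model = lambda obj, ctx: obj    # relies on the causal feature

print(accuracy(shortcut_model, train))  # high: the shortcut works in training
print(accuracy(shortcut_model, test))   # low: it fails under the shift
print(accuracy(causal_model, test))     # perfect: unaffected by the shift
```

Replace "context" with a feature like gender or race and the same mechanism yields the discriminatory decision rules described above.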
Presented by Nick Schuster (ANU)
This tutorial begins with an overview of the discipline of ethical theory and its relevance to machine learning. It then identifies two approaches to ethical ML: design machines to behave ethically or design machines to afford ethical human behavior. And it notes three problems that any approach to ethical ML will face: moral risk, moral uncertainty, and (reasonable) moral disagreement. Given these facts of moral life, how can ethical theory best inform the development and implementation of ML-driven technologies? I argue, first, that as long as these technologies remain tools for human use (as opposed to fully autonomous agents in themselves), the question of how to afford ethical human behavior is primary. And second, I suggest that, in answering this question, we should shift focus away from moral principles and toward moral skills instead. Skills enable us to cope with risk, uncertainty, and reasonable disagreement about what to do in all domains of practical life. Ethical theory can help us understand the skills we need to act well in the moral domain, specifically. And along with empirical insights from the cognitive and social sciences, as well as a technical understanding of how ML-driven technologies work, the moral skill paradigm can inform how these technologies might be designed and implemented to enable good (i.e. morally skilled) human agency.
Presented by Gabby Johnson (Claremont McKenna College)
This tutorial will survey three issues at the intersection of philosophy, computer science, and psychology: bias, semantics, and cognitive architecture. Regarding bias, we’ll explore how models of social bias in humans and machines have informed and shaped one another in psychology and computer science. The comparison of these two fields has helped to progress theories of what biases are, from where they originate, and how we might (when necessary) mitigate them. Regarding semantics, we’ll explore how natural language processing (NLP) models differ from psychological models of language interpretation. We’ll use these comparisons to ask in what sense concepts applied in one domain might naturally extend to the other, and whether such differences produce barriers to communication and explanation in machine learning. Regarding cognitive architecture, we’ll explore the importance of innate structure in computational models in psychology and machine learning. In particular, we’ll explore the apparent informational encapsulation of human perceptual systems from higher cognitive capacities in order to determine to what extent machine learning systems should be built with similar structures on the wider path toward artificial general intelligence.
Presented by Won Ik Cho (Seoul National University)
Hate speech towards individuals or specific groups of people, including offensive, toxic, and biased language, is a crucial issue in today's media and webspace. However, despite substantial discussion of the meaning and scope of hate speech so far, a clear limitation remains: even a solidly established taxonomy of hate speech does not necessarily enable us to distinguish hate speech from non-hate speech, owing to the blurry boundary between them. Moreover, such criteria are often not globally accepted: an utterance or expression considered hate speech in one culture or community may not be regarded as toxic or offensive from other perspectives, and vice versa. This makes building a dataset for hate speech in low-resourced languages and less-studied domains more challenging than in widely studied areas, which becomes a hurdle to reaching social agreement and providing automatic protection systems.
This tutorial aims to translate the scheme of building a hate speech dataset from scratch from a low-resource perspective, from problem definition to a scalable construction process. Building a dataset for hate speech can benefit both research and practice, and the data itself becomes a significant contribution to the community. Furthermore, we want to highlight how well-constructed datasets can serve as an ingredient for less biased inference and detoxified language generation in natural language processing, guaranteeing safe and fair machine learning for previously less protected areas.
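One practical ingredient in such a construction process (our addition, not necessarily part of the tutorial's own scheme), given the blurry boundaries discussed above, is measuring how consistently annotators apply the labeling guidelines. A minimal Cohen's kappa sketch, with hypothetical labels:

```python
def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators' label lists."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical annotations over six comments; a low kappa on a pilot round
# would signal that the labeling guidelines need refinement before scaling up.
ann_1 = ["hate", "hate", "none", "none", "hate", "none"]
ann_2 = ["hate", "none", "none", "hate", "hate", "none"]
print(f"kappa = {cohens_kappa(ann_1, ann_2):.2f}")
```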
Presented by Reuben Binns (Oxford)
Much work in ‘fair’ machine learning aims to provide techniques to enable organisations to comply with equality and anti-discrimination law. However, it is often unclear whether common technical fairness approaches are even compatible with equality law in the round and on the ground. Furthermore, many have questioned whether compliance with equality law is sufficient to secure broader aims of social justice. This tutorial aims to clarify these issues, focusing on EU equality law. It covers:
Presented by Bianca Wylie (Co-founder of Tech Reset Canada)
In this tutorial, we’ll take a look at a few of the methods available to the public to interrogate and shape the use of technology in the delivery of public services. In major public sector technology projects, software development lifecycles meet procurement processes and traditional mechanisms for public accountability, creating opportunities to intervene if you understand how both processes work.
We will start with some contextual matters: how the public service is structured, political power dynamics between elected officials and the public service, procurement fundamentals, conflicting and concurrent incentives for using public technology, and more. We will examine different opportunities to intervene in the use of public technology that draw on existing and long-standing public accountability regimes. These include commissioners, ombudspeople, and lobbying oversight. We will also relate these opportunities to the phases of the lifecycle of software: procurement, contracting, implementation, maintenance, sunsetting and transitioning of software. In the workshop we’ll spend some time on each to identify the stakeholders, the processes at play, and some of the tactics that can be used to intervene from a range of angles.
We will also identify some of the major challenge areas: these include norms within the public sector regarding the purchase and use of technology, and how they have hardened around some problematic, non-transparent practices. Today some of those norms are being questioned. With more pressure for both transparency and accountability in public sector tech, there is also growing pressure to increase general governmental and public control over technology vendors. Terms such as “vendor lock-in” are becoming increasingly well known. This signals a need for more flexibility within the public service to adapt and evolve public technology practices.
The tutorial draws on Wylie’s experiences intervening to hold Canadian public sector actors accountable in major public sector technology projects including Sidewalk Toronto, PayIt (digital payments software) and Canada’s COVID Alert app.
Presented by Jenny Davis (ANU)
The field of fair machine learning (FML) dominates efforts to rectify the biases and inequalities that pervade machine learning systems. Underlying these efforts are mathematical approaches that optimize for neutrality, ridding models of direct and proxy indicators of race, class, gender, and other protected class attributes. In practice, these efforts consistently fall short, operating from an algorithmic idealism that presupposes a meritocratic social world (Davis, Williams, and Yang In Press). For this reason, a nascent critique of FML has emerged, challenging “fairness” as the ideal value standard (Birhane and Guest 2020; Davis et al. In Press; Hanna et al. 2020; Hoffmann 2019). In this tutorial, we will go through the central critiques leveled against FML, consider alternative proposals, and engage in interactive debates about the relative merits of retaining and improving upon “fairness” or, instead, displacing it with new commitments and related values. These debates will be anchored in the domain of job sorting and hiring decisions, with implications for the myriad sites in which machine learning outputs affect resource allocations. The workshop is led by Dr. Jenny L. Davis, a sociologist and Deputy Lead of the Humanising Machine Intelligence Project at the Australian National University. Her work on Algorithmic Reparation (Davis et al. In Press) makes a case against “fairness” in machine learning, proposing an alternative that eschews neutrality in favor of redress, while centralizing, rather than obviating, intersectional identity categories.
Presented by Ferdinando Fioretto (Syracuse) and Claire McKay Bowen (Urban Institute)
This tutorial surveys recent work at the intersection of differential privacy and fairness and provides a focused perspective on the policy implications of using differentially private data for critical decision-making tasks. The tutorial will discuss the equity implications of using differential privacy in decision problems and learning tasks, review the conditions under which privacy and fairness may have aligned or contrasting goals, analyze how and why DP may exacerbate bias and unfairness, and describe available mitigation measures for the fairness issues arising in differentially private systems. The tutorial will also provide a unified understanding of the main challenges and potential risks arising when deploying privacy-preserving machine learning or decision-making tasks under a fairness lens.
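A small simulation conveys one equity mechanism of the kind the tutorial examines (the epsilon, counts, and group names below are hypothetical): under the standard Laplace mechanism for counting queries, the same noise distribution translates into far larger relative error for small subgroups:

```python
import math
import random

random.seed(7)

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

epsilon = 0.1
scale = 1 / epsilon  # counting queries have sensitivity 1

# Hypothetical subgroup sizes; both receive noise from the same distribution.
true_counts = {"majority": 10_000, "minority": 50}

def avg_relative_error(count, trials=2000):
    noise = sum(abs(laplace_noise(scale)) for _ in range(trials)) / trials
    return noise / count

rel_error = {g: avg_relative_error(c) for g, c in true_counts.items()}
print(rel_error)  # relative error is far larger for the small subgroup
```

Decisions made from such noisy counts (e.g., allocating resources proportionally) can therefore be systematically less reliable for small groups, one of the bias pathways the tutorial analyzes.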
Presented by Krishnaram Kenthapadi (Fiddler AI), Hima Lakkaraju (Harvard), Pradeep Natarajan (Amazon Alexa AI), and Mehrnoosh Sameki (Microsoft Azure AI)
Artificial Intelligence (AI) is increasingly playing an integral role in determining our day-to-day experiences. With AI-based solutions in high-stakes domains such as hiring, lending, criminal justice, healthcare, and education, the resulting implications of AI are far-reaching. Consequently, we need to ensure that these models are making accurate predictions, are robust to shifts in the data, are not relying on spurious features, and are not unduly discriminating against minority groups. While several approaches spanning areas such as explainability, fairness, and robustness have been proposed in recent literature, relatively little attention has been paid to the need for monitoring machine learning (ML) models once they are deployed and to the associated research challenges. We first motivate the need for ML model monitoring, as part of a broader AI model governance and responsible AI framework, from societal, legal, customer/end-user, and model developer perspectives, and provide a roadmap for thinking about model monitoring in practice. We then present findings and insights on model monitoring desiderata based on interviews with various ML practitioners spanning domains such as financial services, healthcare, hiring, online retail, computational advertising, and conversational assistants. We then describe the technical considerations and challenges associated with realizing the above desiderata in practice. Then, we focus on the real-world application of model monitoring methods and tools, present practical challenges and guidelines for using such techniques effectively, and share lessons learned from deploying model monitoring tools for several web-scale AI/ML applications. We present case studies across different companies, spanning application domains such as financial services, healthcare, hiring, conversational assistants, online retail, computational advertising, search and recommendation systems, and fraud detection.
We hope that our tutorial will inform both researchers and practitioners, stimulate further research on model monitoring, and pave the way for building more reliable ML models and monitoring tools in the future.
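One widely used monitoring primitive consistent with the desiderata above is checking deployed feature or score distributions against a training-time baseline. Below is a minimal Population Stability Index (PSI) sketch; the binned distributions are hypothetical, and 0.2 is a common rule-of-thumb alert threshold rather than a universal standard:

```python
import math

def psi(baseline, current, eps=1e-6):
    """Population Stability Index between two binned distributions
    (lists of bin proportions that each sum to 1)."""
    total = 0.0
    for b, c in zip(baseline, current):
        b, c = max(b, eps), max(c, eps)  # guard against empty bins
        total += (c - b) * math.log(c / b)
    return total

baseline = [0.25, 0.25, 0.25, 0.25]  # score distribution at training time
stable = [0.24, 0.26, 0.25, 0.25]    # mild day-to-day fluctuation
drifted = [0.10, 0.20, 0.30, 0.40]   # scores shifting upward in production

ALERT_THRESHOLD = 0.2  # common rule of thumb for a significant shift
for name, current in [("stable", stable), ("drifted", drifted)]:
    value = psi(baseline, current)
    print(f"{name}: PSI={value:.3f}, alert={value > ALERT_THRESHOLD}")
```

In a production monitoring tool, a check like this would run per feature and per output score on a schedule, with alerts routed to the model owners.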
Presented by Baobao Zhang (Syracuse)
Bluetooth proximity exposure notification apps were promoted as a privacy-preserving way to automate contact tracing in the Covid-19 pandemic. Nevertheless, these apps have not gained traction in the US and Western Europe, with only a minority of the population downloading and using them. We consider why contact tracing apps have failed to become an effective public health tool, particularly in the US. First, these apps were not prioritized in public health campaign messaging compared with other interventions (e.g., getting vaccinated and wearing masks). As a result, uptake of the apps was limited, thereby decreasing their effectiveness at alerting individuals of exposure. Second, uptake was clouded by public skepticism of these apps’ effectiveness, privacy, and security. Finally, the rise of highly infectious Covid-19 variants (e.g., Delta and Omicron) has overwhelmed testing and contact tracing capabilities. Understanding why contact tracing apps failed to become an effective public health tool will inform plans to prevent future pandemics.
Presented by Sarah Fox (CMU Human-Computer Interaction Institute)
Across the United States, startups vie for contracts to deploy “micromobility,” connected scooters, e-bikes, and delivery robots designed to offer choice in urban landscapes marked by disjointed transit. Yet, reports of blocked curb cuts and walkways highlight the need for coherent regulation. Drawing on interviews with government officials, disabled activists, and micromobility representatives, this tutorial offers a case study interrogating how micromobility is negotiated, with firms and city officials working together to define who benefits. Through the lens of anticipatory politics, this case concludes with an argument to move away from reactive approaches to policy development toward prefigurative models that give rise to more equitable municipal environments.
Presented by Roberto Zicari (Seoul National University)
The ethical and societal implications of artificial intelligence systems raise concerns. Starting in January 2019, a team of international researchers developed a process for assessing Trustworthy AI in practice, called Z-Inspection®. Z-Inspection® is a holistic process based on the method of evaluating new technologies according to which ethical issues must be discussed through the elaboration of socio-technical scenarios. The Z-Inspection® process is composed of three main phases: 1) the Set Up Phase, 2) the Assess Phase, and 3) the Resolve Phase. The process has been successfully applied to assess both the ex-ante and ex-post trustworthiness of AI systems used in healthcare. In this tutorial you will learn the basics of the EU “Framework for Trustworthy AI” and how to apply them in practice using the Z-Inspection® process. For more info see http://z-inspection.org
Presented by Helen Nissenbaum (Cornell)
The theory of contextual integrity (CI) defines privacy as appropriate flow of personal information, answering a societal need for a concept of privacy that, simultaneously, is meaningful, explains why privacy is worth caring about, and underscores why we must protect it. I argue that contextual integrity meets all three of these benchmarks. CI requires that we bend away from one-dimensional ideas, which for decades have gripped the privacy domain, namely, control over information about ourselves, stoppage of flow, or the fetishization of specific, “sensitive” attributes (e.g., identity, health). My talk will review key ideas behind CI, contrast it with alternative accounts, and present a few applications of possible interest to FAccT.