Algorithmic Accountability: Moving Beyond Audits
Apr 11, 2023
Despite unresolved concerns, an audit-centered algorithmic accountability approach is being rapidly mainstreamed into voluntary frameworks and regulations, and industry has taken a leadership role in its development.
Technical modes of evaluation have long been critiqued for narrowly positioning ‘bias’ as a flaw within an algorithmic system that can be fixed and eliminated. While calls from civil society to move towards broader ‘socio-technical’ evaluation expand the frame in needed directions, these have failed to make the leap from theory to practice. Such approaches are prone to vague and underspecified benchmarks, and both technical and socio-technical audits place the primary burden for algorithmic accountability on those with the fewest resources.
Across the board, audits run the risk of entrenching power within the tech industry, and take focus away from more structural responses.
An emerging policy regime composed of a combination of audits, impact assessments, and mandates for access to company data is rapidly being mainstreamed as the primary means of addressing and mitigating the harms caused by AI-powered systems.
Far from distancing itself from this policy momentum, the tech industry is strategically assuming a leadership position in the field of AI auditing.
A growing wave of policy endorses the use of audits for AI systems in both public- and private-sector contexts. In the EU, the Digital Services Act (DSA), which came into force in 2022, includes multiple provisions that require audits1See Sections 28 and 31 of the European Union’s Digital Services Act (DSA), October 27, 2022. and creates more pathways to access company data for regulators as well as vetted researchers.2See Section 31 of the DSA. In the US, there are multiple legislative proposals that endorse elements of this approach for regulating the tech industry, including the Algorithmic Accountability Act, which was reintroduced in 20223Algorithmic Accountability Act of 2022, H.R. 6580, 117th Congress (2021–2022). and tasks the FTC with implementing impact assessments for AI-enabled decision-making, and the Platform Accountability and Transparency Act of 2022,4Chris Coons, “Senator Coons, Colleagues Introduce Legislation to Provide Public with Transparency of Social Media Platforms,” press release, December 21, 2022. which creates pathways for external researchers to get access to data. There is also increasing momentum behind voluntary mechanisms like the National Institute of Standards and Technology (NIST)’s Risk Management Framework, published in January 2023, which endorses independent third-party audits,5National Institute of Standards and Technology (NIST), US Department of Commerce, “Artificial Intelligence Risk Management Framework (AI RMF 1.0),” January 2023. as well as a large and growing research community (including industry-funded and/or affiliated actors)6See for example Rolls Royce, “The Aletheia Framework™: Helping Build Trust in Artificial Intelligence,” n.d., accessed March 3, 2023; PwC, “PwC’s Responsible AI: AI You Can Trust,” n.d., accessed March 3, 2023; and Deloitte, “Deloitte AI Institute Teams with Chatterbox Labs to Ensure Ethical Application of AI,” press release, March 15, 2021. engaged in developing frameworks following this approach.
First-, Second-, and Third-Party Audits
Researchers have helpfully classified algorithmic audits into first-, second-, or third-party audits as a way to differentiate the entities and incentives at play.7See Sasha Costanza-Chock, Inioluwa Deborah Raji, and Joy Buolamwini, “Who Audits the Auditors? Recommendations from a Field Scan of the Algorithmic Auditing Ecosystem,” FAccT ’22: 2022 ACM Conference on Fairness, Accountability, and Transparency (June 2022): 1571–1583; and Kate Kaye, “A New Wave of AI Auditing Startups Wants to Prove Responsibility Can Be Profitable,” Protocol, January 3, 2022. First-party, or internal, audits are those conducted by teams within an organization and tasked with reviewing tools created in-house. Several tech companies have specialized teams and tools for this purpose. Second-party audits involve contracted vendors that offer auditing-as-a-service, including traditional consulting organizations like PwC and Deloitte, as well as a growing number of independent ventures (both for-profit and nonprofit) that specialize in certain kinds of audits (“bias audits”) or sectors.8See Parity (which changed its name to Vera in January 2023), accessed March 3, 2023; Vera, accessed March 3, 2023; https://foundation.mozilla.org/en/blog/its-time-to-develop-the-tools-we-need-to-hold-algorithms-accountable. See also Deborah Raji, “It’s Time to Develop the Tools We Need to Hold Algorithms Accountable,” Mozilla Foundation, February 2, 2022; and Laurie Clarke, “AI Auditing Is the Next Big Thing. But Will It Ensure Ethical Algorithms?” Tech Monitor, April 14, 2021. Third-party audits stand apart: they are conducted by journalists, independent researchers, or other entities with no contractual relationship to the audit target. From Gender Shades9Algorithmic Justice League Project, MIT Media Lab, Gender Shades, 2018, accessed March 3, 2023. to the audit of London’s LFR system10Evani Radiya-Dixit, “A Sociotechnical Audit: Assessing Police Use of Facial Recognition,” Minderoo Centre for Technology and Democracy, October 2022. to ProPublica’s audit of criminal risk-assessment software,11Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner, “Machine Bias,” ProPublica, May 23, 2016. these audits have been pivotal in galvanizing advocacy around AI-related harms.12Inioluwa Deborah Raji and Joy Buolamwini, “Actionable Auditing: Investigating the Impact of Publicly Naming Biased Performance Results of Commercial AI Products,” Conference on Artificial Intelligence, Ethics, and Society, 2019.
This wave of policy activity has created business opportunities, as evidenced by a fast-developing industry around AI audits.13See Costanza-Chock, Raji, and Buolamwini, “Who Audits the Auditors?”; and Sebastian Klovig Skelton, “AI accountability held back by ‘audit-washing’ practices,” Computer Weekly, November 23, 2022. Far from distancing itself from this policy momentum, the industry is strategically assuming a leadership position in the field of AI auditing, creating and even licensing its own auditing tools and mechanisms. Microsoft,14Microsoft, “Responsible AI Impact Assessment Template,” June 2022. Salesforce,15See Salesforce, “Salesforce Debuts AI Ethics Model: How Ethical Practices Further Responsible Artificial Intelligence,” September 2, 2021; and Kathy Baxter, “AI Ethics Maturity Model,” Salesforce, n.d., accessed March 3. Google,16See Khari Johnson, “Google researchers release audit framework to close AI accountability gap,” VentureBeat, January 30, 2020; and Hansa Srinivasan, “ML-Fairness-Gym: A Tool for Exploring Long-Term Impacts of Machine Learning Systems,” Google Research (blog), February 5, 2020. Meta,17See Issie Lapowsky, “Facebook’s Decision on Trump Posts Is a ‘Devastating’ Setback, Says Internal Audit,” Protocol, July 8, 2020; Issie Lapowsky, “One Year in, Meta’s Civil Rights Team Still Needs a Win,” Protocol, April 9, 2022; Jerome Pesenti, “Facebook’s Five Pillars of Responsible AI,” Meta AI, June 22, 2021; and Roy L. Austin, “Following Through on Meta’s Civil Rights Audit Progress,” Meta, November 18, 2021. Twitter,18See Anna Kramer, “Twitter’s Image Cropping Was Biased, So It Dumped the Algorithm,” Protocol, May 19, 2021; and Anna Kramer, “How Twitter Hired Tech’s Biggest Critics to Build Ethical AI,” Protocol, June 23, 2021. IBM,19AI Fairness 360, IBM / Linux Foundation AI & Data, accessed March 3, 2023. and Amazon20Jeffrey Dastin and Paresh Dave, “Amazon to Warn Customers on Limitations of Its AI,” Reuters, November 30, 2022. all launched widely publicized initiatives around internal technical audit tools, with the purported goal of mitigating harms like bias. (Twitter’s ethics and accountability team was infamously dissolved in the aftermath of Elon Musk’s takeover, and Microsoft laid off its entire AI ethics and society team in March 2023.)21Will Knight, “Elon Musk Has Fired Twitter’s ‘Ethical AI’ Team,” Wired, November 4, 2022; and Zoë Schiffer and Casey Newton, “Microsoft just laid off one of its responsible AI teams,” Platformer, March 14, 2023. The tech industry has also been vocally supportive of voluntary approaches with little or no enforcement mechanisms, most notably NIST’s recent Risk Management Framework guidelines. For example, in a 2022 US House hearing on Trustworthy AI,22Trustworthy AI: Managing the Risks of Artificial Intelligence, House Event 115165, 117th Congress (2021–2022), September 29, 2022. the vice president of the US Chamber of Commerce, Jordan Crenshaw, speaking on behalf of industry, said: “We believe it’s premature to get into prescriptive regulation. We support voluntary frameworks like we see at NIST.”
The process-based nature of these rules allows them to be easily internalized by companies as a cost of doing business. With the encouragement of industry, this “algorithmic accountability” tool kit is displacing structural approaches that would require more fundamental changes.
Audits, impact assessments, and mandates for access to company data have various features in common:
- They are responses to what is popularly referred to as AI’s “black-box problem,” i.e., the concern that AI systems have a range of characteristics that make it impossible to identify or diagnose harm from the outside without access to the technical components and logics of the system.
- They focus on verifying performance and other characteristics of the system, but this is typically evaluated at the technical level, rather than within real-life contexts of use and largely to the exclusion of interrogating the business model and related power dynamics.
- They are procedural (rather than substantive) mechanisms. They intervene through process-based modes like requiring inspection, documentation, evaluation, or greater transparency, as opposed to bright-line rules.
- They are generally intended to surface and address harms such as discrimination and bias, consumer manipulation/deception, and data privacy and security-related concerns rather than competition-related concerns. Efforts to enable audits or access to data for competitors or regulators for the purpose of enhancing competition exist but are currently siloed from the algorithmic accountability discourse.23See European Commission, “Antitrust: Commission Accepts Commitments by Amazon Barring It from Using Marketplace Seller Data, and Ensuring Equal Access to Buy Box and Prime,” press release, December 20, 2022; Kate Cox, “Amazon’s Use of Marketplace Data Breaks Competition Law, EU Charges,” Ars Technica, November 10, 2020; Natasha Lomas, “Europe Lays Out Antitrust Case against Amazon’s Use of Big Data,” TechCrunch, November 10, 2020; and Competition and Markets Authority, “Competition and Data Protection in Digital Markets: A Joint Statement between the CMA and the ICO 2021 (CMA, ICO),” March 25, 2022.
The focus on greater levels of visibility, diligence, and reflexivity in data and computational processes is valuable, especially given the structural opacity that plagues industry AI.24For an unpacking of the many layers of opacity in commercial systems, see Jenna Burrell, “How the Machine ‘Thinks’: Understanding Opacity in Machine Learning Algorithms,” Big Data & Society 3, no. 1 (January–June 2016). The process-based nature of these algorithmic accountability tools is also relatively less controversial for industry,25The vice president of the US Chamber of Commerce also vocally supported the NIST RMF’s voluntary approach. See Alex LaCasse, “US NIST publishes AI Risk Management Framework 1.0,” International Association of Privacy Professionals (IAPP), January 27, 2023; Deloitte, “Deloitte AI Institute Teams with Chatterbox Labs to Ensure Ethical Application of AI,” press release, March 15, 2021; and Rumman Chowdhury, “Sharing Learnings about Our Image Cropping Algorithm,” Twitter Engineering (blog), May 19, 2021. and therefore more politically feasible, compared to prescriptive rule-based approaches that put bright-line restrictions in place. However, at a time when disproportionate energy appears to be channeled toward this policy template as a core focus of many civil society and government actors,26See Ellen P. Goodman and Julia Tréhu, “AI Audit-Washing and Accountability,” German Marshall Fund of the United States (GMF), November 2022; Christine Custis, “Operationalizing AI Ethics through Documentation: ABOUT ML in 2021 and Beyond,” Partnership on AI (PAI), April 14, 2021; Stanford University Human-Centered Artificial Intelligence, “AI Audit Challenge,” 2022, accessed March 2, 2023; and Ada Lovelace Institute and DataKind UK, “Examining the Black Box: Tools for Assessing Algorithmic Systems,” April 2020. often to the exclusion of other remedies, we must urgently confront its limits.
This is especially true in the context of large, complex, and well-resourced companies that operate with arguably limitless financial and technical resources and growing influence over policy spaces. What do we lose when we make audit-centered algorithmic accountability the focus of our policy strategy for the tech industry? In a scathing critique of the research community’s focus on algorithmic fairness and accountability, Sean McDonald and Ben Gansky argued that these approaches can preclude “the ability of stakeholders to ask first principles questions (i.e. even if a system executes its task perfectly, would it be just?) and channels moral energy away from more fundamental reforms.”27Ben Gansky and Sean McDonald, “CounterFAccTual: How FAccT Undermines Its Organizing Principles,” FAccT ’22: 2022 ACM Conference on Fairness, Accountability, and Transparency (June 2022): 1982–1992.
There is a burgeoning audit economy with companies offering audits-as-a-service despite no clarity on the standards and methodologies for algorithmic auditing, nor consensus on the definitions of risk and harm.
Coherent standards and methodologies for assessing when an algorithmic system is harmful to its users are hard to establish, especially when it comes to complex and sprawling Big Tech platforms. Audit tools will forever be compromised by this conundrum, making it more likely than not that audits will devolve into a superficial “checkbox” exercise.
This algorithmic accountability regime, and audits in particular, have proliferated as self-regulatory mechanisms within industry without an existing, clear policy framework setting standards for harm. This has resulted in an anomalous situation where there is widespread confusion around fundamental questions: What are we auditing for? Which harms count and how are they being defined?
Europe’s DSA, the first example of a legal algorithmic auditing requirement for the tech industry, keeps this standard very broad. It requires companies to audit for “systemic risks to fundamental rights,” although with special emphasis on harms related to manipulation and illegal content. While this could theoretically allow for expansive evaluation of potential harms, it also leaves enormous discretion for the auditing entity (the company selling the software, or a third party) to limit the scope of the audit to those issues least threatening to their interests. Deloitte, for example, has already proposed applying its own methodologies in the absence of “specific parameters and audit methodology.”28See Goodman and Tréhu, “AI Audit-Washing and Accountability.”
At the same time, this lack of specificity may be less a problem in itself than a symptom of a more foundational challenge with presenting audits as a general accountability measure. Cathy O’Neil, the founder of one of the first algorithmic auditing firms, herself declared that she would never take on the task of “auditing Facebook” because the scale and granularity of harms were too diffuse, systemic, and intractable to lend themselves to any simplified template for an audit.29Cathy O’Neil, “Facebook’s Algorithms Are Too Big to Fix,” Bloomberg, October 8, 2021. When it comes to social media content algorithms, a major focus of algorithmic accountability debates, the questions of harm, methodology, and required expertise will in large part be determined by the specifics of the geographical and/or social context in which such harm transpires and by the particular groups that are impacted. This is also where the relative lack of support or initiative from Big Tech for non-Western and other minoritized countries and contexts has become glaring.30Conducted in the face of major public backlash, Meta’s human rights impact assessment of its role in the Myanmar genocide of 2017, for example, was widely decried as superficial “ethics washing.” See Dan Milmo, “Rohingya Sue Facebook for £150bn over Myanmar Genocide,” Guardian, December 6, 2021.
These broadly scoped audits can be helpfully contrasted against inspections done by regulators in the context of enforcement challenges, such as the Australian competition regulator’s audit of travel website Trivago’s algorithms,31Peter Leonard, “The Deceptive Algorithm in Court: Australian Competition and Consumer Commission v Trivago N.V.  FCA 16,” Society for Computers and Law, January 31, 2020. the investigations into Clearview AI’s facial recognition system,32Information Commissioner’s Office (ICO), “ICO Fines Facial Recognition Database Company Clearview AI Inc More than £7.5m and Orders UK Data to Be Deleted,” May 23, 2022. and the UK privacy regulator ICO’s investigation of microtargeting in political campaigns in 2017.33Information Commissioner’s Office (ICO), “Democracy Disrupted? Personal Information and Political Influence,” July 11, 2018. These inspections had narrowly defined objectives (that is, to assess whether the company was in violation of a clearly articulated standard of harm) and were typically evaluated alongside a range of other contextual evidence, including testimony from company executives. Similarly, data protection impact assessments under the GDPR have served to create a documentation trail around compliance with the specific provisions of the regulation, which regulators can use for monitoring.
While attention has largely been on technical audits, there may be greater promise in accountability mechanisms that more directly surface organizational incentives and business models. A recent report from a consortium of UK-based regulators, while highlighting the lack of standards and methodology around technical auditing, identified “governance audits” as a tool that requires companies to provide detailed documentation on operational structures for design, development, and management, and on internal oversight mechanisms for algorithmic systems.34Competition and Markets Authority, “Auditing Algorithms: The Existing Landscape, Role of Regulators and Future Outlook,” September 23, 2022. Given pervasive corporate secrecy in the tech industry around decision-making and human involvement in algorithmic processes, public disclosure of this information could be a more impactful intervention.
The response to the failures of technical audits includes recommendations for more participation from directly impacted communities in the audit process and calls to adapt testing to resemble real-life contexts. However, these proposals remain largely theoretical and are at risk of being superficially incorporated into a “checkbox” exercise. Many of these proposals place the primary burden for algorithmic accountability on those with the fewest resources: researchers, or even the communities most harmed by these systems.
“Bias testing” is both the most mature field in the accountability tool kit, relative to work on other algorithmic harms, and the most glaring in its deficiencies. Most industry activity and research on computational audits has focused on quantifying the fairness (a contested term, as we’ll see below) of algorithmic decisions and proposing technical measures for mitigating bias and discrimination. Several policy proposals in the US specifically include requirements for audits and impact assessments to evaluate for bias against protected groups.35See Richard Vanderford, “New York’s Landmark AI Bias Law Prompts Uncertainty,” Wall Street Journal, September 21, 2022; and DC Chamber of Commerce, “DC Chamber of Commerce Small Business Action Alert: Stop Discrimination by Algorithms Act Of 2021,” n.d., accessed March 3, 2023; https://www.npr.org/local/305/2021/12/10/1062991462/d-c-attorney-general-introduces-bill-to-ban-algorithmic-discrimination. The Algorithmic Justice and Online Platform Transparency Act would prohibit discriminatory use of personal information in algorithmic processes; see Algorithmic Justice and Online Platform Transparency Act, S. 1896, 117th Congress (2021–2022). It’s also worth noting that legal antidiscrimination discourse has influenced the ways in which fairness has been conceptualized in technical fields.36See Anna Lauren Hoffmann, “Where Fairness Fails: Data, Algorithms, and the Limits of Antidiscrimination Discourse,” Information, Communication & Society 22, no. 7 (2019): 900–915; and Daniel Greene, Anna Lauren Hoffmann, and Luke Stark, “Better, Nicer, Clearer, Fairer: A Critical Assessment of the Movement for Ethical Artificial Intelligence and Machine Learning,” Hawaii International Conference on System Sciences, January 2019. See also Samuel R. Bagenstos, “The Structural Turn and the Limits of Antidiscrimination Law,” California Law Review 94, no. 1 (January 2006): 1–47; and Kimberlé Crenshaw, “Demarginalizing the Intersection of Race and Sex: A Black Feminist Critique of Antidiscrimination Doctrine, Feminist Theory and Antiracist Politics,” University of Chicago Legal Forum 1989, no. 1 (1989).
Yet the fundamental limits of computational approaches to fairness, carefully demonstrated through journalism, advocacy, and research over the past five years,37See Meredith Whittaker, Kate Crawford, Roel Dobbe, Genevieve Fried, Elizabeth Kaziunas, Varoon Mathur, Sarah Myers West, Rashida Richardson, Jason Schultz, and Oscar Schwartz, AI Now Report 2018, AI Now Institute, 2018; https://arxiv.org/abs/2101.09869; Abeba Birhane, Elayne Ruane, Thomas Laurent, Matthew S. Brown, Johnathan Flowers, Anthony Ventresque, and Christopher L. Dancy, “The Forgotten Margins of AI Ethics,” FAccT ’22: 2022 ACM Conference on Fairness, Accountability, and Transparency, May 9, 2022; Pratyusha Kalluri, “Don’t Ask If Artificial Intelligence Is Good or Fair, Ask How It Shifts Power,” Nature, July 7, 2020; Os Keyes, “Automating Autism: Disability, Discourse, and Artificial Intelligence,” Journal of Sociotechnical Critique 1, no. 1 (2020): 1–31; Os Keyes, Jevan Hutson, and Meredith Durbin, “A Mulching Proposal: Analysing and Improving an Algorithmic System for Turning the Elderly into High-Nutrient Slurry,” CHI EA ’19: Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, May 2019; Cynthia L. Bennett and Os Keyes, “What Is the Point of Fairness? Disability, AI and the Complexity of Justice,” August 9, 2019; Sarah Myers West, “Redistribution and Rekognition: A Feminist Critique of Algorithmic Fairness,” Catalyst: Feminism, Theory, Technoscience 6, no. 2 (2020): 1–24; Andrew D Selbst, Danah Boyd, Sorelle A Friedler, Suresh Venkatasubramanian, and Janet Vertesi, “Fairness and Abstraction in Sociotechnical Systems,” In Proceedings of the Conference on Fairness, Accountability, and Transparency (2019); and Sofia Kypraiou, “What Is Fairness?” Feminist AI, September 13, 2021. are more widely acknowledged than ever before. These critiques include:
- Computational approaches to fairness remain limited to evaluating for bias in laboratory-like conditions, abstracted from the real social and political contexts in which these systems are generated and used.38Ben Green and Lily Hu, “The Myth in the Methodology: Towards a Recontextualization of Fairness in Machine Learning,” 35th International Conference on Machine Learning, 2018; Shira Mitchell, Eric Potash, Solon Barocas, Alexander D’Amour, and Kristian Lum, “Algorithmic Fairness: Choices, Assumptions, and Definitions,” Annual Review of Statistics and Its Application 8 (2021): 141–163; and Rodrigo Ochigame, “The Long History of Algorithmic Fairness,” Phenomenal World, January 30, 2020. What gets abstracted away includes the ways in which the system, in practice, plays into existing power asymmetries; the limitations of any kind of “human review” of these systems, which is typically superficial due to the tendency to defer to algorithmic decisions; and the added risk that algorithmic decisions work to legitimize discriminatory outcomes and allow management to evade responsibility.39See Ben Green and Amba Kak, “The False Comfort of Human Oversight as an Antidote to A.I. Harm,” Slate, June 15, 2021; and Ben Green, “The Flaws of Policies Requiring Human Oversight of Government Algorithms,” Computer Law & Security Review 45 (2022).
- There’s a need for greater attention to the structural bias embedded in the contexts in which data is collected for training AI systems.40See Rashida Richardson, Jason Schultz, and Kate Crawford, “Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice,” New York University Law Review 94 (May 2019): 192–233; Safiya Umoja Noble, Algorithms of Oppression: How Search Engines Reinforce Racism (New York: NYU Press, 2018); and Sarah Myers West, Meredith Whittaker, and Kate Crawford, “Discriminating Systems: Gender, Race, and Power in AI,” AI Now Institute, April 2019. This defeats the notion that “more data” or “better data” will mitigate AI challenges.41See Bennett and Keyes, “What Is the Point of Fairness?”
- The reduction of fairness evaluation to solely quantifiable metrics disguises the inherently subjective and value-laden assumptions built into these systems.42Lindsay Weinberg, “Rethinking Fairness: An Interdisciplinary Survey of Critiques of Hegemonic ML Fairness Approaches,” Journal of Artificial Intelligence Research 74 (2022): 75–109.
- These approaches, by unquestioningly placing people in particular groups or classes, fail to account for how social and racial group classifications are themselves socially constructed through the “widespread use of racial categories as if they represent natural and objective differences between groups.”43Alex Hanna, Emily Denton, Andrew Smart, Jamila Smith-Loud, “Towards a Critical Race Methodology in Algorithmic Fairness,” Conference on Fairness, Accountability, and Transparency (FAT* ’20), January 27–30, 2020; J. Khadijah Abdurahman, “FAT* Be Wilin’,” Medium, February 24, 2019; Morgan Klaus Scheuerman, Madeleine Pape, and Alex Hanna, “Auto-Essentialization: Gender in Automated Facial Analysis as Extended Colonial Project,” Big Data & Society 8, no. 2 (2021); and Michele Elam, “Signs Taken for Wonders: AI, Art & the Matter of Race,” Daedalus 151, no. 2 (Spring 2022): 198–217.
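The critique about quantifiable metrics can be made concrete with a small worked example. The following is a minimal sketch in Python using invented toy data; the group names, counts, and helper functions are all hypothetical and are not drawn from any audit or tool cited here. It computes two commonly used fairness measures for the same set of decisions: the gap in selection rates between groups (often called demographic parity) and the gap in true positive rates (often called equal opportunity).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    group: str        # hypothetical group label ("a" or "b")
    qualified: bool   # ground-truth label, itself a contestable construct
    selected: bool    # the system's decision


def selection_rate(records, group):
    """Share of a group's members that the system selected."""
    rows = [r for r in records if r.group == group]
    return sum(r.selected for r in rows) / len(rows)


def true_positive_rate(records, group):
    """Share of a group's qualified members that the system selected."""
    rows = [r for r in records if r.group == group and r.qualified]
    return sum(r.selected for r in rows) / len(rows)


def cohort(group, qualified, selected, count):
    """Build `count` identical toy records."""
    return [Record(group, qualified, selected) for _ in range(count)]


# Invented toy data: eight people per group.
records = (
    cohort("a", True, True, 4)       # group a: all 4 qualified people selected
    + cohort("a", False, False, 4)   # group a: no unqualified people selected
    + cohort("b", True, True, 1)     # group b: 1 of 2 qualified people selected
    + cohort("b", True, False, 1)
    + cohort("b", False, True, 3)    # group b: 3 of 6 unqualified people selected
    + cohort("b", False, False, 3)
)

parity_gap = abs(selection_rate(records, "a") - selection_rate(records, "b"))
opportunity_gap = abs(
    true_positive_rate(records, "a") - true_positive_rate(records, "b")
)

print(f"selection-rate gap (demographic parity): {parity_gap:.2f}")
print(f"true-positive-rate gap (equal opportunity): {opportunity_gap:.2f}")
```

In this toy data both groups are selected at the same rate, so the first measure reports no gap, while qualified members of group b are selected half as often as qualified members of group a, so the second measure reports a large one. Which number an audit reports, and how "qualified" and the groups themselves are defined, are choices made before any computation happens.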
Under the Biden Administration, there is certainly greater acknowledgment that problems of bias and discrimination cannot be reduced to technical metrics, and an explicit embrace of ‘socio-technical’ approaches that have been championed by the algorithmic accountability research community. A 2022 report on AI bias by NIST, the US government’s primary technical standard-setting organization, identified “human bias” and “systemic bias,” along with technical bias, as key to evaluating the impact of a system in practice.44Reva Schwartz, Apostol Vassilev, Kristen Greene, Lori Perine, Andrew Burt, and Patrick Hall, “Towards a Standard for Identifying and Managing Bias in Artificial Intelligence,” National Institute of Standards and Technology (NIST), US Department of Commerce, March 2022. These findings were reflected to some degree in the first version of NIST’s Risk Management Framework, released in January 2023, which proposes several robust recommendations, including:
- Evaluating for impacts of AI systems across their life cycle
- Accounting for the sociotechnical context in which the application is being deployed, and using both qualitative and quantitative measures
- Developing approaches for “evaluating systemic and human-cognitive sources of bias”
- Documenting decisions, risk-related trade-offs, and system limitations
- Normalizing the ability to reconfigure or reconsider the product in early stages
- Establishing processes for involving stakeholders in decision-making and examining internal cultural dynamics and norms
While this policy is a pivotal shift in how bias evaluation is conceptualized, the distance between this theoretical vision and the practice of algorithmic auditing as it stands today represents a chasm that these proposed reforms cannot bridge. Crucially, NIST’s policy document is explicitly voluntary,45Note that GOP leaders and the US Chamber of Commerce have both been supportive of the NIST RMF approach. See for example Nihal Krishan, “GOP House Committee Leaders Probe ‘Conflicting Definitions’ in NIST AI Framework and AI ‘Bill of Rights’,” FedScoop, January 27, 2023; and LaCasse, “US NIST publishes AI Risk Management Framework 1.0.” which raises the question: Why would developers of AI models feel empowered or incentivized to course-correct or shape products in ways that might contradict the firm’s business goals?46This question may be posed of auditing frameworks more broadly. See for example Michael Power, The Audit Society: Rituals of Verification (Oxford: Oxford University Press, 1997); see also remarks by Arvind Narayanan, Federal Trade Commission, “PrivacyCon 2022: Part 1,” video. If algorithmic bias is, as it often is, inextricably linked to power asymmetries and structural inequity, how can those be effectively surfaced in procedural mechanisms like audits, and to what end? Incorporating the participation of impacted communities has become an attractive response to some of these concerns in the algorithmic accountability space, but it risks being conflated with democratic control. More likely, its ability to be mainstreamed into a procedural audit requirement lends legitimacy to the tech industry to continue developing AI as it has been and undercuts calls for regulation, while asking for even greater resources from those impacted to ensure the accountability of the firms responsible for creating harm.
“Access to data” as a weak policy response given the shrinking space for independent research
Provisions mandating data access for vetted researchers are being integrated into a growing number of policy proposals. Given the way many platforms mediate key domains of society and how obscure the inner workings of decision-making algorithms are, granting access to platform data is both needed and societally beneficial. This research should be protected under robust safe-harbor provisions that provide good-faith researchers with immunity from legal charges associated with hacking.47Alex Abdo, Ramya Krishnan, Stephanie Krent, Evan Welber Falcón, and Andrew Keane Woods, “A Safe Harbor for Platform Research,” Knight First Amendment Institute at Columbia University, January 19, 2022. In particular, whether research that involves web scraping, a practice researchers are forced to rely on when companies refuse to share their data, is legal under the Computer Fraud and Abuse Act (CFAA)48Christian W. Sandvig et al. v. Loretta Lynch, United States District Court for the District of Columbia, Case 1:16-cv-1368 (JDB), October 7, 2016. is a question that is still being litigated.49American Civil Liberties Union, “Sandvig v. Barr — Challenge to CFAA Prohibition on Uncovering Racial Discrimination Online,” May 22, 2019. Web scraping should receive robust protections under the law.50Rachel Goodman, “Tips for Data Journalism in the Shadow of an Overbroad Anti-Hacking Law,” American Civil Liberties Union, October 13, 2017. Additionally, community-based research offers a meaningful and robust means through which to shift the balance of power away from company-directed approaches to accountability. This research deserves not only strong legal protections and access, but also adequate resourcing to enable communities to document harms.
However, data access provisions can be harmful when they are used to supplant other structural remedies. Using them this way implicitly shifts the burden of identifying tech-enabled harms away from companies and onto under-resourced actors. It also puts platforms in a gatekeeping position over tech accountability work.
Policy proposals, including the US Platform Accountability and Transparency Act and the EU Digital Services Act, position data access as a stand-in for other forms of accountability. This presumes the existence of a robust body of researchers with ample time, resources, and expertise to conduct meaningful technical audits. That is far from the reality, and it benefits companies to promote this positioning since it removes responsibility for the safety of their products from their hands.51This also, as a report by the Center for Democracy and Technology notes, grants an opening for law enforcement agencies to utilize data access provisions to step up their demands for platform information. See Caitlin Vogus, “Report – Defending Data: Privacy Protection, Independent Researchers, and Access to Social Media Data in the US and EU,” Center for Democracy and Technology, January 25, 2023.
Such proposals also grant platforms the opportunity to act as gatekeepers over potentially critical research in multiple ways:52In August 2021, Meta disabled the accounts of researchers at NYU’s Ad Observatory who were investigating political ads and the spread of misinformation on Facebook. See Meghan Bobrowsky, “Facebook Disables Access for NYU Research into Political-Ad Targeting,” Wall Street Journal, August 4, 2021; Cristiano Lima, “Twitter Curbs Researcher Access, Sparking Backlash in Washington,” Washington Post, February 3, 2023; and Mike Clark, “Research Cannot Be the Justification for Compromising People’s Privacy,” Meta, August 3, 2021.
- Through who gets granted access to data, in particular through narrow interpretations of the definition of “research” and “researcher” that may exclude investigative journalists and civil society advocates.53For example, under the DSA researchers must be “vetted” and (1) be affiliated with an academic institution, (2) be independent from commercial interests, (3) have proven records of expertise in the fields related to the risks investigated or related research methodologies, and (4) commit to and be in a capacity to preserve data security and confidentiality requirements.
- Through claims that certain research raises “feasibility concerns” or creates an undue burden for the company, and thus must be blocked from access, or through provisions that immunize companies against certain causes of action in exchange for the provision of data.54Platform Accountability and Transparency Act, 117th Congress (2021–2022), First Session.
- Through prior review before publication of research based on data access: companies may claim trade secrecy over the insights contained in critical research papers and ensure they never see the light of day.55Georgia Wells, Jeff Horwitz, and Deepa Seetharaman, “Facebook Knows Instagram Is Toxic for Teen Girls, Company Documents Show,” Wall Street Journal, September 14, 2021.
Lastly, these proposals need to be read in the context of an increasingly precarious environment for critical tech accountability research, in which economic pressure leaves academic researchers increasingly exposed to undue influence by corporate actors.56See J. Nathan Matias, Susan Benesch, Rebekah Tromble, Alex Abdo, J. Bob Alotta, David Karpf, David Lazer, Nathalie Maréchal, Nabiha Syed, and Ethan Zuckerman, “Manifesto: The Coalition for Independent Technology Research,” Coalition for Independent Technology Research, October 12, 2022; and “Gig Economy Project – Uber whistleblower Mark MacGann’s full statement to the European Parliament,” Brave New Europe, October 25, 2022.