What lessons can we learn from the FDA model for AI governance, and how might FDA-style interventions be applied to AI? This section outlines three crosscutting themes from the experience of the FDA’s regulation of pharmaceuticals. Rather than homing in on specific functions like the previous section did, this one looks at how these functions interact to produce particular outcomes in the sector.
I. Establishing Efficacy and Safety
The safety and efficacy of products must be evaluated in parallel to make a full assessment of their impact on society at large. In the context of AI, policymaking and regulatory activity have tended to index heavily on risk, and have insufficiently evaluated the efficacy of AI systems.
Establishing efficacy in the pharmaceutical context is a complex task: the history of the FDA is a history of value-laden disputes over how the effects of drugs ought to be measured and what ought to count as evidence. Compared to AI, though, the task of pharmaceutical regulators is relatively straightforward: Does the drug work when evaluated against an agreed-upon end point?1Even within this narrower question, disputes over measurement and evidence recur throughout the FDA’s history.
While evaluating the claims made by companies themselves offers a useful starting point, doing so can be much more complicated in the context of artificial intelligence, where systems are highly complex and also sociotechnical—a system that operates effectively in controlled environments may fail to function appropriately when deployed in the real world.2Inioluwa Deborah Raji, I. Elizabeth Kumar, Aaron Horowitz, and Andrew D. Selbst, “The Fallacy of AI Functionality,” FAccT ’22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency (June 2022): 959–72, https://doi.org/10.1145/3531146.3533158. The makers of general-purpose AI systems are also much less likely to make specific claims against which efficacy could readily be tested.
Furthermore, some AI systems are deterministic (i.e., we can, with a reasonable degree of accuracy, understand and trace how an outcome was defined and arrived at), while others are probabilistic (we can understand the basic mechanisms of how the system functions, but it may be difficult or impossible to consistently explain retrospectively, or anticipate proactively, how it will behave). In addition, the measures needed to evaluate effectiveness in AI are inherently more dynamic, multivariate, and complex than in the pharmaceutical context: they require contextual expertise from across a range of sectors, not limited to technical expertise but extending to the domains in which an AI system is deployed.3Michael Feffer, Michael Skirpan, Zachary Lipton, and Hoda Heidari, “From Preference Elicitation to Participatory ML: A Critical Survey & Guidelines for Future Research,” Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (August 2023): 38–48, https://doi.org/10.1145/3600211.3604661; Fernando Delgado, Stephen Yang, Michael Madaio, and Qian Yang, “The Participatory Turn in AI Design: Theoretical Foundations and the Current State of Practice,” arXiv, October 2, 2023, https://doi.org/10.48550/arXiv.2310.00907; Abeba Birhane et al., “Power to the People? Opportunities and Challenges for Participatory AI,” EAAMO ’22: Proceedings of the 2nd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (October 2022): 1–8, https://doi.org/10.1145/3551624.3555290; Laura Weidinger et al., “Sociotechnical Safety Evaluation of Generative AI Systems,” arXiv, October 31, 2023, https://doi.org/10.48550/arXiv.2310.11986.
In some settings, we may be willing to take on this degree of uncertainty—it’s sufficient to know that an AI system works well enough. In others, it may be necessary to have a high degree of certainty both that the system works effectively and that it will not behave in ways that are detrimental to particular users or to society at large. Without an approach to benchmarking and validation of AI that considers safety and efficacy in tandem, we lack the necessary information to make these kinds of decisions.
There will always be potential harms from AI; the regulatory question is thus whether the benefits outweigh the harms. But to answer that question, we need clear evidence—which we currently lack—of the specific benefits offered by AI technologies.
To serve the public interest, measures of efficacy should be considered carefully. They should not be primarily or solely indexed on profit or growth, but should take into account benefits to society more generally.
Regulatory approaches in AI should require developers of AI systems to explain how an AI system works, the problems it attempts to address, and the benefits it offers—not just evaluate where it fails. Accurately measuring and validating noneconomic benefits has become a key challenge in other domains (notably in the context of carbon emissions reduction targets), and developing robust metrics for this should be a priority for AI governance.
II. From Opacity to Openness
The FDA model offers a powerful lesson in transparency: product safety cannot be divorced from the process of optimizing regulatory design for information production. Prior to the existence of the agency, the pharmaceutical industry was largely opaque, in ways that bear similarities to the AI market.
Over time, the FDA’s interventions have expanded the public’s understanding of how drugs work by ensuring firms invest in research and documentation. Beyond simply understanding incidents in isolation, it has catalyzed and organized an entire field of expertise and disseminated this expertise across stakeholders, enriching our understanding of pharmaceuticals and their role in our society and economy.4Kapczynski, “Dangerous Times.”
This information-production function is particularly important for AI. Key players in the market are incentivized against transparency, and even identifying these actors in the first place is a challenging task absent regulatory intervention.
Many specific aspects of information exchange in the FDA model offer lessons for thinking about AI regulation. For example, in the context of pharmaceuticals, there is a focus on multistakeholder communication that requires ongoing information exchange between staff, expert panels, patients, and drug developers. Drug developers are mandated to submit troves of internal documentation, which the FDA reformats for public release. The FDA also manages databases of adverse events, clinical trials, and guidance documents.5Center for Drug Evaluation and Research, “FDA Adverse Event Reporting System (FAERS) Public Dashboard,” Food and Drug Administration, December 7, 2023, https://www.fda.gov/drugs/questions-and-answers-fdas-adverse-event-reporting-system-faers/fda-adverse-event-reporting-system-faers-public-dashboard. What’s more, it produces its own independent analysis on top of industry-supplied data, offering an important check that sometimes differs from—and even challenges—industry conclusions.6See Christopher J. Morten and Amy Kapczynski, “The Big Data Regulator, Rebooted: Why and How the FDA Can and Should Disclose Confidential Data on Prescription Drugs and Vaccines,” California Law Review 109 (April 2021), https://www.californialawreview.org/print/the-big-data-regulator-rebooted-why-and-how-the-fda-can-and-should-disclose-confidential-data-on-prescription-drugs-and-vaccines; and Kapczynski, “Dangerous Times.”
Implementing documentation requirements of this sort would represent a significant change from the current accountability vacuum in AI. Encouraging AI firms to adopt stronger monitoring and compliance activities, such as recordkeeping and documentation practices, would substantively change those firms’ approach to building systems and potentially even their operating models. Such monitoring and compliance may also need to extend to the agency itself: information law rules governing flows between the regulator, researchers, and the broader public would help ensure that agency leadership does not use opacity to shield its decision-making from second-guessing. Moreover, traceability remains an underexplored field in the context of AI: change and control management systems in software required significant investment, and similar investment in corollary approaches for AI would alleviate the burden on smaller players.7Thanks to Heidy Khlaaf for this point.
On one hand, this may make the development process more expensive and difficult, requiring additional documentation and validation processes rather than encouraging open experimentation.8Emily Black et al., “Less Discriminatory Algorithms,” Georgetown Law Journal 113, no. 1 (2024), https://doi.org/10.2139/ssrn.4590481. On the other, such mandates would benefit AI governance by streamlining organizational processes, ensuring that knowledge of how systems were developed endures, and creating greater internal transparency and accountability. Requiring companies to take baseline measures to adequately scrutinize and document the development process would also enable and increase the effectiveness of external auditing, facilitating the development of an “ecosystem of inspection.” It would also provide legal hooks for ex post enforcement and aid the work of enforcement agencies when they need to investigate AI companies.
It is worth considering how such measures may differentially impact companies of various sizes and stages of development: AI startups may worry that providing information about an AI product still in development could hand advantages to bigger companies. It would therefore be worth examining in more depth the right balance between publishing information that enables public validation of the outcomes of assessments and directly publishing the information provided by companies; the pharmaceutical process may offer useful points of comparison, but this will likely need further study in the specific context of artificial intelligence.
III. Generating Agency and Rebalancing Power
The FDA approach creates and distributes new sites of agency in the healthcare system, empowering actors like doctors who are obligated to work in their patients’ best interest.
The power of FDA regulation comes in part from other actors in the system, including physicians, insurers, and whistleblowers, who strengthen its monitoring regime. This has acted as an important second line of defense in pharmaceuticals in cases where the regulatory process has been insufficiently rigorous.
By contrast, we lack similar professional obligations in the AI context, dependencies and sites of friction remain comparatively immature, and the relevant actors are not necessarily incentivized toward accountability. AI is frequently used in ways that are designed to ultimately undermine the agency of those on whom it’s used.
In comparison to pharmaceuticals, where numerous interdependencies exist across actors within a market—including doctors, insurance companies, pharmaceutical companies, and patient advocacy groups—artificial intelligence has relatively few. The dependencies that do exist tend to be unidirectional: smaller companies and startups depend on the resources dominated by cloud infrastructure providers like Google, Amazon, and Microsoft.
An important distinction brought up throughout this report is that the “users” of AI systems are often not the group most affected by those systems. In many instances, AI is used by comparatively powerful entities—employers, law enforcement agencies, healthcare providers, banks—on those who are less powerful.9See “Algorithmic Management: Restraining Workplace Surveillance,” AI Now Institute, April 11, 2023, https://ainowinstitute.org/publication/algorithmic-management; AI Nationalism(s) Executive Summary; and Philip Alston, “Report of the Special Rapporteur on Extreme Poverty and Human Rights,” October 11, 2019, https://srpovertyorg.files.wordpress.com/2019/10/a_74_48037_advanceuneditedversion-1.pdf. Often, those on whom AI is used are not informed of its use; even if they are informed, they may lack the necessary information, or the power and authority, to seek redress if the system has caused harm.
These points of distinction between AI and pharmaceuticals, or many other consumer products for that matter, merit further attention. In particular, such differences raise questions about how information should be redistributed not only among the many actors involved in the development, procurement, and regulation of AI systems, but also among those on whom AI systems are used. They also raise questions about what sorts of responsibilities and obligations different actors should have to mitigate the power imbalances that characterize much of AI use and deployment, and how to effectively introduce the necessary friction to ensure these responsibilities are carried out.