Data Minimization as a Tool for AI Accountability

Broad data minimization principles (“collect no more data than necessary”) are a core part of global data privacy laws like the GDPR, but have been woefully underenforced.

But the next generation of data minimization policies, bright-line rules that prohibit excessive or harmful data collection and use, shows greater promise. Championed by a growing chorus within civil society, these data rules could be a powerful lever in restraining some of the most concerning AI systems (and even the business model that sustains them).

Data privacy policy approaches have evolved considerably over the past decade. The abject failure of the “notice and consent” model (which holds that data collection is broadly permissible provided the user has been notified and has given consent) as the primary way to protect people’s privacy is now a mainstream critique. [1: See Federal Trade Commission, “Trade Regulation on Commercial Surveillance and Data Security” (https://www.ftc.gov/system/files/ftc_gov/pdf/commercial_surveillance_and_data_security_anpr.pdf), 16 CFR Part 464, 2022; Neil Richards and Woodrow Hartzog, “The Pathologies of Digital Consent” (https://openscholarship.wustl.edu/cgi/viewcontent.cgi?article=6460&context=law_lawreview), Washington University Law Review 96, no. 6 (2019); and Claire Park, “How ‘Notice and Consent’ Fails to Protect Our Privacy” (https://www.newamerica.org/oti/blog/how-notice-and-consent-fails-to-protect-our-privacy), New America (blog), March 23, 2020.] The dominant legal privacy regime globally, led by the European GDPR, retains consent as a way to legitimize data processing in certain instances but also imposes baseline standards on firms’ data processing activities that apply irrespective of what the user “chooses.” [2: Intersoft Consulting, “Art. 5 GDPR: Principles Relating to Processing of Personal Data” (https://gdpr-info.eu/art-5-gdpr), n.d., accessed March 3, 2023.]

Data minimization is the umbrella term increasingly used to refer to some of these core obligations. Aimed at limiting the incentives for unbridled commercial surveillance practices, these include (1) restrictions on what data is collected (collection limitations), (2) restrictions on the purposes for which it can be used following collection (purpose limitations), and (3) restrictions on the amount of time firms can retain data (storage limitations). [3: Ibid. See also “Regulation (EU) 2018/1725 of the European Parliament and of the Council of 23 October 2018 on the Protection of Natural Persons with Regard to the Processing of Personal Data by the Union Institutions, Bodies, Offices and Agencies and on the Free Movement of Such Data, and Repealing Regulation (EC) No 45/2001 and Decision No 1247/2002/EC” (https://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32018R1725), Official Journal of the European Union, November 21, 2018; and https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/principles/data-minimisation.] These rules require firms to demonstrate the necessity and proportionality of the data processing: to prove, for example, that it is in fact necessary to collect certain kinds of data for the purposes they seek to achieve; or to state that they will only use such data for predefined purposes; or to ensure that they will only retain data for a period of time that is necessary and proportionate to these purposes. Typically, certain types of data classified as “sensitive” receive a heightened level of protection: for example, a stricter standard of necessity, with fewer exceptions, for the collection of biometric data. [4: See Intersoft Consulting, “Art. 9 GDPR: Processing of Special Categories of Personal Data” (https://gdpr-info.eu/art-9-gdpr), n.d., accessed March 15, 2023; and Information Commissioner’s Office (ICO), “What Are the Rules on Special Category Data?” (https://ico.org.uk/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr/special-category-data/what-are-the-rules-on-special-category-data), n.d., accessed March 15, 2023.] In the US, data minimization rules are part of the California Privacy Rights Act [5: “California Consumer Privacy Act (CCPA)” (https://oag.ca.gov/privacy/ccpa), Rob Bonta, Attorney General, State of California Department of Justice, February 15, 2023.], which came into force in January 2023, and are also a core part of a proposed federal privacy law, the American Data Privacy and Protection Act, that has gained widespread momentum. [6: American Data Privacy and Protection Act (https://www.congress.gov/bill/117th-congress/house-bill/8152/text), H.R. 8152, 117th Congress (2021–2022).]

These broad data-minimization principles offer a clear shift away from consent- or control-based approaches. They shift the burden away from individuals, who would otherwise have to make decisions or proactively exercise their data rights, and onto firms, which must demonstrate their compliance with these principles in the interests of users. [7: David Medine and Gayatri Murthy, “Companies, Not People, Should Bear the Burden of Protecting Data” (https://www.brookings.edu/blog/techtank/2019/12/18/companies-not-people-should-bear-the-burden-of-protecting-data), Brookings Institution, December 18, 2019.] They also create clear curbs on the kinds of invasive data processing companies are otherwise incentivized to engage in under the behavioral advertising business model.

Despite their strong potential, in practice these standards (now a part of data protection laws like the GDPR and more than a hundred counterparts around the world [8: Graham Greenleaf, “Global Data Privacy Laws 2019: 132 National Laws & Many Bills” (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3381593), Privacy Laws & Business International Report 157 (2019).]) haven’t had the kind of structural impact they promise. A key reason for this is the inherent ambiguity in interpreting the legal standards of necessity and proportionality encoded in these principles [9: See Josephine Wolff and Nicole Atallah, “Early GDPR Penalties: Analysis of Implementation and Fines through May 2020” (https://scholarlypublishingcollective.org/psup/information-policy/article/doi/10.5325/jinfopoli.11.2021.0063/291999/Early-GDPR-Penalties-Analysis-of-Implementation), Journal of Information Policy 11 (December 2021); and Information Commissioner’s Office (ICO), Big Data, Artificial Intelligence, Machine Learning and Data Protection (https://ico.org.uk/media/for-organisations/documents/2013559/big-data-ai-ml-and-data-protection.pdf), September 4, 2017.], which, combined with overburdened enforcement agencies [10: Access Now, “Access Now Raises the Alarm over Weak Enforcement of the EU GDPR on the Two-Year Anniversary” (https://www.accessnow.org/alarm-over-weak-enforcement-of-gdpr-on-two-year-anniversary), press release, May 25, 2020.], leaves companies a great deal of leeway in how they apply (or, likely, evade) these requirements. The enforcement of data minimization raises fundamentally thorny questions: Does maximizing advertising revenue qualify as a reasonable business purpose? If so, does it justify virtually limitless data collection for behavioral advertising? How far can security justifications stretch to legitimize indefinite data retention? These issues are far from resolved despite almost a decade of enforcement. Notable exceptions, in which data minimization principles in the GDPR have been enforced to successfully draw bright-line rules against certain kinds of data processing, have been few and far between. In one example, the Swedish Data Protection Authority outlawed the use of facial recognition in schools on the basis of the collection limitation principle, finding that its use for monitoring attendance was a disproportionate means to achieve this goal. [11: European Data Protection Board, “Swedish DPA: Police Unlawfully Used Facial Recognition App” (https://edpb.europa.eu/news/national-news/2021/swedish-dpa-police-unlawfully-used-facial-recognition-app_en), February 12, 2021.]

A range of new data minimization proposals move toward specific restrictions around excessive or harmful data practices, such as restricting targeted advertising or banning the collection of biometric data in certain domains.

A new iteration of data-minimization rules could overcome these challenges by moving beyond high-level normative standards (as in the GDPR) to specific restrictions around particular types of data and kinds of data use. Bold proposals have surfaced in the US, both from civil society and in draft legislation. They include restricting the use of data for targeted advertising [12: See Accountable Tech, “Accountable Tech Petitions FTC to Ban Surveillance Advertising as an ‘Unfair Method of Competition’” (https://accountabletech.org/media/accountable-tech-petitions-ftc-to-ban-surveillance-advertising-as-an-unfair-method-of-competition), press release, September 28, 2021; Electronic Privacy Information Center (EPIC) and Consumer Reports, How the FTC Can Mandate Data Minimization through a Section 5 Unfairness Rulemaking (https://epic.org/documents/how-the-ftc-can-mandate-data-minimization-through-a-section-5-unfairness-rulemaking), January 2022; Accountable Tech, “Ban Surveillance Advertising: Coalition Letter” (https://www.bansurveillanceadvertising.com/coalition-letter), 2022, accessed March 15, 2023; and In the Matter of Trade Regulation Rule on Commercial Surveillance and Data Security, R111004, Before the Federal Trade Commission, Washington, D.C. (https://www.bansurveillanceadvertising.com/coalition-letter), November 21, 2022 (statement of Center for Democracy & Technology).], or a narrower version that limits the use of sensitive data for all secondary purposes, including advertising [13: Ada Lovelace Institute, Countermeasures: The Need for New Legislation to Govern Biometric Technologies in the UK (https://www.adalovelaceinstitute.org/wp-content/uploads/2022/06/Countermeasures-the-need-for-new-legislation-to-govern-biometric-technologies-in-the-UK-Ada-Lovelace-Institute-June-2022.pdf), June 2022.]; and restricting the collection and use of biometric information for particular groups such as children [14: Lindsey Barrett, “Ban Facial Recognition Technologies for Children—and for Everyone Else” (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3660118), 26 B.U. J. Sci. & Tech. L. 223 (2020).] and in certain contexts such as workplaces, schools, and hiring. [15: See Worker Rights: Workplace Technology Accountability Act (https://leginfo.legislature.ca.gov/faces/billTextClient.xhtml?bill_id=202120220AB1651), A.B. 1651, California Legislature (2021–2022); Sofia Edvardsen, “How to Interpret Sweden’s First GDPR Fine on Facial Recognition in School” (https://iapp.org/news/a/how-to-interpret-swedens-first-gdpr-fine-on-facial-recognition-in-school), International Association of Privacy Professionals (IAPP), August 27, 2019; European Data Protection Board, “EDPB & EDPS Call for Ban on Use of AI for Automated Recognition of Human Features in Publicly Accessible Spaces, and Some Other Uses of AI That Can Lead to Unfair Discrimination” (https://edpb.europa.eu/news/news/2021/edpb-edps-call-ban-use-ai-automated-recognition-human-features-publicly-accessible_en), June 21, 2021; and “When Bodies Become Data: Biometric Technologies and Free Expression” (https://www.article19.org/biometric-technologies-privacy-data-free-expression), Article 19, April 2021.]

These proposals clarify bright-line rules when it comes to data collection and use. Some, like the ban on using data for behavioral advertising, are justified as both pro-privacy and pro-competition interventions since they target first-party data collection that is currently concentrated among Big Tech companies. [16: Accountable Tech, “FTC Rulemaking Petition to Prohibit Surveillance Advertising” (https://accountabletech.org/campaign/ftc-public-comment), 2022, accessed March 15, 2023.]

The proposals also could effectively shut down some of the most concerning uses of AI—for example, by placing restrictions on the collection of emotion-related data or restricting biometric data collection in the workplace (which fuels a range of algorithmic surveillance and management tools). In these ways, the next generation of data minimization rules could be a powerful lever in addressing some of the most concerning AI systems and the business model that sustains them.
