Computing power and the governance of artificial intelligence

AI Insight Forum – privacy and liability

Summary

On November 8, our Head of Strategy, Mackenzie Arnold, spoke before the US Senate’s bipartisan AI Insight Forum on Privacy and Liability, convened by Senate Majority Leader Chuck Schumer. We presented our perspective on how Congress can meet the unique challenges that AI presents to liability law.[ref 1]

In our statement, we note that:

We then make several recommendations for how Congress could respond to these challenges:


Dear Senate Majority Leader Schumer, Senators Rounds, Heinrich, and Young, and distinguished members of the U.S. Senate, thank you for the opportunity to speak with you about this important issue. Liability is a critical tool for addressing risks posed by AI systems today and in the future. In some respects, existing law will function well, compensating victims, correcting market inefficiencies, and driving safety innovation. However, artificial intelligence also presents unusual challenges to liability law that may lead to inconsistency and uncertainty, penalize the wrong actors, and leave victims uncompensated. Courts, limited to the specific cases and facts at hand, may be slow to respond. It is in this context that Congress has an opportunity to act. 

Problem 1: Existing law will under-deter malicious and criminal misuse of AI. 

Many have noted the potential for AI systems to increase the risk of various hostile threats, ranging from biological and chemical weapons to attacks on critical infrastructure like energy, elections, and water systems. AI’s unique contribution to these risks goes beyond simply identifying dangerous chemicals and pathogens; advanced systems may help plan, design, and execute complex research tasks or help criminals operate on a vastly greater scale. With this in mind, President Biden’s recent Executive Order has called upon federal agencies to evaluate and respond to systems that may “substantially lower[] the barrier of entry for non-experts to design, synthesize, acquire, or use chemical, biological, radiological, or nuclear (CBRN) weapons.” While large-scale malicious threats have yet to materialize, many AI systems are inherently dual-use. If AI is capable of tremendous innovation, it may also be capable of tremendous, real-world harms. In many cases, the benefits of these systems will outweigh the risks, but the law can take steps to minimize misuse while preserving benefits. 

Existing criminal, civil, and tort law will penalize malevolent actors for the harms they cause; however, liability is insufficient to deter those who know they are breaking the law. AI developers and some deployers will have the most control over whether powerful AI systems fall into the wrong hands, yet they may escape liability (or believe and act as if they will). Unfortunately, existing law may treat malevolent actors’ intentional bad acts or alterations to models as intervening causes that sever the causal chain and preclude liability, and the law leaves unclear what obligations companies have to secure their models. Victims will go uncompensated if their only source of recourse is small, hostile actors with limited funds. Reform is needed to make clear that those with the greatest ability to protect and compensate victims will be responsible for preventing malicious harms. 

Recommendations

(1.1) Hold AI developers and some deployers strictly liable for attacks on critical infrastructure and harms that result from biological, chemical, radiological, or nuclear weapons.

The law has long recognized that certain harms are so egregious that those who create the risk of them should internalize their cost by default. Harms caused by biological, chemical, radiological, and nuclear weapons fit these criteria, as do harms caused by attacks on critical infrastructure. Congress has addressed similar harms before, for example, creating strict liability for releasing hazardous chemicals into the environment. 

(1.2) Consider (a) holding developers strictly liable for harms caused by malicious use of exfiltrated systems and open-sourced weights or (b) creating a duty to ensure the security of model weights.

Access to model weights increases malicious actors’ ability to enhance dangerous capabilities and remove critical safeguards. And once model weights are out, companies cannot regain control or restrict malicious use. Despite these stakes, existing information security norms are insufficient, as evidenced by the leak of Meta’s LLaMA model just one week after it was announced and significant efforts by China to steal intellectual property from key US tech companies. Congress should create strong incentives to secure and protect model weights. 

Getting this balance right will be difficult. Open-sourcing is a major source of innovation, and even the most scrupulous information security practices will sometimes fail. Moreover, penalizing exfiltration without restricting the open-sourcing of weights may create perverse incentives to open-source weights in order to avoid liability—what has been published openly can’t be stolen. To address these tradeoffs, Congress could pair strict liability with the ability to apply for safe harbor or limit liability to only the largest developers, who have the resources to secure the most powerful systems, while excluding smaller and more decentralized open-source platforms. At the very least, Congress should create obligations for leading developers to maintain adequate security practices and empower a qualified agency to update these duties over time. Congress could also support open-source development through secure, subsidized platforms like NAIRR or investigate other alternatives for safe access.

(1.3) Create duties to (a) identify and test for model capabilities that could be misused and (b) design and implement safeguards that consistently prevent misuse and cannot be easily removed. 

Leading AI developers are best positioned to secure their models and identify dangerous misuse capabilities before they cause harm. The latter requires evaluation and red-teaming before deployment, as acknowledged in President Biden’s recent Executive Order, and continued testing and updates after deployment. Congress should codify clear minimum standards for identifying capabilities and preventing misuse and should grant a qualified agency authority to update these duties over time. 

Problem 2: Existing law will under-compensate harms from models with unexpected capabilities and failure modes. 

A core characteristic of modern AI systems is their tendency to display rapid capability jumps and unexpected emergent behaviors. While many of these advances have been benign, when unexpected capabilities cause harm, courts may treat them as unforeseeable and decline to impose liability. Other failures may occur when AI systems are integrated into new contexts, such as healthcare, employment, and agriculture, where integration presents both great upside and novel risks. Developers of frontier systems and deployers introducing AI into novel contexts will be best positioned to develop containment methods and detect and correct harms that emerge.

Recommendations

(2.1) Adjust the timing of obligations to account for redressability. 

To balance innovation and risk, liability law can create obligations at different stages of the product development cycle. For harms that are difficult to control or remedy after they have occurred, like harms that upset complex financial systems or that result from uncontrolled model behavior, Congress should impose greater ex-ante obligations that encourage the proactive identification of potential risks. For harms that are capable of containment and remedy, obligations should instead encourage rapid detection and remedy. 

(2.2) Create a duty to test for emergent capabilities, including agentic behavior and its precursors. 

Developers will be best positioned to identify new emergent behaviors, including agentic behavior. While today’s systems have not displayed such qualities, there are strong theoretical reasons to believe that autonomous capabilities may emerge in the future, as acknowledged by the actions of key AI developers like Anthropic and OpenAI. As techniques develop, Congress should ensure that those working on frontier systems utilize these tools rigorously and consistently. Here too, Congress should authorize a qualified agency to update these duties over time as new best practices emerge.

(2.3) Create duties to monitor, report, and respond to post-deployment harms, including taking down or fixing models that pose an ongoing risk. 

If, as we expect, emergent capabilities are difficult to predict, it will be important to identify them even after deployment. In many cases, the only actors with sufficient information and technical insight to do so will be major developers of cutting-edge systems. Monitoring helps only insofar as it is accompanied by duties to report or respond. In at least some contexts, corporations already have a duty to report security breaches and respond to continuing risks of harm, but legal uncertainty limits the effectiveness of these obligations and puts safe actors at a competitive disadvantage. By clarifying these duties, Congress can ensure that all major developers meet a minimum threshold of safety. 

(2.4) Create strict liability for harms that result from agentic model behavior such as self-exfiltration, self-alteration, self-proliferation, and self-directed goal-seeking. 

Developers and deployers should maintain control over the systems they create. Behaviors that enable models to act on their own—without human oversight—should be disincentivized through liability for any resulting harms. “The model did it” is an untenable defense in a functioning liability system, and Congress should ensure that, where intent or personhood requirements would stand in the way, the law imputes liability to a responsible human or corporate actor.

Problem 3: Existing law may struggle to allocate costs efficiently. 

The AI value chain is complex, often involving a number of different parties who help develop, train, integrate, and deploy systems. Because those later in the value chain are more proximate to the harms that occur, they may be the first to be brought to court. But these smaller, less-resourced actors will often have less ability to prevent harm. Disproportionately penalizing these actors will further concentrate power and diminish safety incentives for large, capable developers. Congress can ensure that responsibility lies with those most able to prevent harm. 

Recommendations

(3.1) Establish joint and several liability for harms involving AI systems. 

Victims will have limited information about who in the value chain is responsible for their injuries. Joint and several liability would allow victims to bring any responsible party to court for the full value of the injury. This would limit the burden on victims and allow better-resourced corporate actors to quickly and efficiently bargain toward a fair allocation of blame. 

(3.2) Limit indemnification of liability by developers. 

Existing law may allow wealthy developers to escape liability by contractually transferring blame to smaller third parties with neither the control to prevent harms nor the assets to remedy them. Because cutting-edge systems will be so desirable, a small number of powerful AI developers will have considerable leverage to extract concessions from third parties and users. Congress should limit indemnification clauses that help the wealthiest players avoid internalizing the costs of their products while still permitting them to voluntarily indemnify users.

(3.3) Clarify that AI systems are products under products liability law. 

For over a decade, courts have refused to answer whether AI systems are software or products. This leaves critical ambiguity in existing law. The EU has proposed to resolve this uncertainty by declaring that AI systems are products. Though products liability is primarily developed through state law, a definitive federal answer to this question may spur quick resolution at the state level. Products liability has some notable advantages, focusing courts’ attention on the level of safety that is technically feasible, directly weighing risks and benefits, and applying liability across the value chain. Some have argued that this creates clearer incentives to proactively identify and invest in safer technology and limits temptations to go through the motions of adopting safety procedures without actually limiting risk. Products liability has its limitations, particularly in dealing with defects that emerge after deployment or alteration, but clarifying that AI systems are products is a good start. 

Problem 4: Federal law may obstruct the functioning of liability law. 

Parties are likely to argue that federal law preempts state tort and civil law and that Section 230 shields generative AI models from liability. Both would be unfortunate results that would prevent the redress of individual harms through state tort law and provide sweeping immunity to the very largest AI developers. 

Recommendations

(4.1) Add a savings clause to any federal legislation to avoid preemption. 

Congress regularly adds express statements that federal law does not eliminate, constrain, or preempt existing remedies under state law. Congress should do the same here. While federal law will provide much-needed ex-ante requirements, state liability law will serve a critical role in compensating victims and will be more responsive to harms that occur as AI develops by continuing to adjust obligations and standards of care. 

(4.2) Clarify that Section 230 does not apply to generative AI. 

The most sensible reading of Section 230 suggests that generative AI is a content creator. It creates novel and creative outputs rather than merely hosting existing information. But absent Congressional intervention, this ambiguity may persist. Congress should provide a clear answer: Section 230 does not apply to generative AI.

Advanced AI governance: a literature review of problems, options, and proposals

Executive Summary 

This literature review provides an overview and taxonomy of past and recent research in the emerging field of advanced AI governance.

Aim: The aim of this review is to help disentangle and consolidate the field, improve its accessibility, enable clearer conversations and better evaluations, and contribute to overall strategic clarity or coherence in public and policy debates. 

Summary: Accordingly, this review is organized as follows:

The introduction discusses the aims, scope, selection criteria, and limits of this review and provides a brief reading guide. 

Part I reviews problem-clarifying work aimed at mapping the parameters of the AI governance challenge, including lines of research to map and understand:

  1. Key technical parameters constituting the technical characteristics of advanced AI technology and its resulting (sociotechnical) impacts and risks. These include evaluations of the technical landscape of advanced AI (its forms, possible developmental pathways, timelines, trajectories), models for its general social impacts, threat models for potential extreme risks (based on general arguments and direct and indirect threat models), and the profile of the technical alignment problem and its dedicated research field. 
  2. Key deployment parameters constituting the conditions (present and future) of the AI development ecosystem and how these affect the distribution and disposition of the actors that will (first) deploy such systems. These include the size, productivity, and geographic distribution of the AI research field; key AI inputs; and the global AI supply chain. 
  3. Key governance parameters affecting the conditions (present and future) for governance interventions. These include stakeholder perceptions of AI and trust in its developers, the default regulatory landscape affecting AI, prevailing barriers to effective AI governance, and effects of AI systems on the tools of law and governance themselves.
  4. Other lenses on characterizing the advanced AI governance problem. These include lessons derived from theory, from abstract models and wargames, from historical case studies (of technology development and proliferation, of its societal impacts and societal reactions, of successes and failures in historical attempts to initiate technology governance, and of successes and failures in the efficacy of different governance levers at regulating technology), and lessons derived from ethics and political theory. 

Part II reviews option-identifying work aimed at mapping potential affordances and avenues for governance, including lines of research to map and understand: 

  1. Potential key actors shaping advanced AI, including actors such as or within AI labs and companies, the digital AI services and compute hardware supply chains, AI industry and academia, state and governmental actors (including the US, China, the EU, the UK, and other states), standard-setting organizations, international organizations, and public, civil society, and media actors. 
  2. Levers of governance available to each of these actors to shape AI directly or indirectly.
  3. Pathways to influence on each of these key actors that may be available to (some) other actors in aiming to help inform or shape the key actors’ decisions around whether or how to utilize key levers of governance to improve the governance of advanced AI. 

Part III reviews prescriptive work aimed at putting this research into practice in order to improve the governance of advanced AI (for some view of the problem and of the options). This includes lines of research or advocacy to map, articulate, and advance:

  1. Priorities for policy given theories of change based on some view of the problem and of the options.
  2. Good heuristics for crafting AI policy. These include general heuristics for good regulation, for (international) institutional design, and for future-proofing governance.
  3. Concrete policy proposals for the regulation of advanced AI, and the assets or products that can help these be realized and implemented. This includes proposals to regulate advanced AI using existing authorities, laws, or institutions; proposals to establish new policies, laws, or institutions (e.g., temporary or permanent pauses on AI development; the establishment of licensing regimes, lab-level safety practices, or governance regimes on AI inputs; new domestic governance institutions; new international AI research hubs; new bilateral agreements; new multilateral agreements; and new international governance institutions).

Introduction 

This document aims to review, structure, and organize existing work in the field of advanced AI governance. 

Background: Despite being a fairly young and interdisciplinary field, advanced AI governance offers a wealth of productive work to draw on and is increasingly structured through various research agendas[ref 1] and syllabi.[ref 2] However, while technical research on the possibility, impacts, and risks of advanced AI has been mapped in various literature reviews and distillations,[ref 3] few attempts have been made to comprehensively map and integrate existing research on the governance of advanced AI.[ref 4] This document aims to provide an overview and taxonomy of work in this field.

Aims: The aims of this review are several: 

  1. Disentangle and consolidate the field to promote greater clarity and legibility regarding the range of research, connections between different research streams and directions, and open gaps or underexplored questions. Literature reviews can contribute to such a consolidation of academic work;[ref 5] 
  2. Improve the field’s accessibility and reduce some of its “research debt”[ref 6] to help those new to the field understand the existing literature, facilitating a more cohesive and coordinated research field with lower barriers to entry and less duplication of effort; 
  3. Enable clearer conversations between researchers exploring different questions or lines of research, discussing how and where their insights intersect or complement one another; 
  4. Enable better comparison between different approaches and policy proposals; and
  5. Contribute to greater strategic clarity or coherence,[ref 7] improving the quality of interventions, and refining public and policy debates. 

Scope: While there are many ways of framing the field, one approach is to define advanced AI governance as:

However, the aim of this document is not to engage in restrictive boundary policing of which research is part of this emerging field, let alone the “core” of it. The guiding heuristic here is not whether a given piece of research is directly, explicitly, and exclusively focused on certain “right” problems (e.g., extreme risks from advanced AI), nor whether it is motivated by certain political orientations or normative frameworks, nor even whether it explicitly uses certain terminology (e.g., “Transformative AI,” “AGI,” “General-Purpose AI System,” or “Frontier AI”).[ref 9] Rather, the broad heuristic is simply whether the research helps answer a part of the advanced AI governance puzzle. 

Accordingly, this review aims to cast a fairly broad net to cover work that meets any of the following criteria:

Limitations: With this in mind, there are also a range of limitations or shortcomings for this review:

Finally, a few remaining disclaimers: (1) inclusion does not imply endorsement of a given article’s conclusions; (2) this review aims to also highlight promising directions, such as issues or actors, that are not yet discussed in depth in the literature. As such, whenever I list certain issues (e.g., “actors” or “levers”) without sources, this is because I have not yet found (or have missed out on) much work on that issue, suggesting there is a gap in the literature—and room for future work. Overall, this review should be seen as a living document that will be occasionally updated as the field develops. To that end, I welcome feedback, criticism, and suggestions for improvement. 

Reading guide: In general, I recommend that rather than aiming to read this from the top, readers instead identify a theme or area of interest and jump to that section. In particular, this review may be most useful to readers who (a) already have a specific research question and want to see what work has been done and how a particular line of work would fit into the larger landscape; (b) aim to generate or distill syllabi for reading groups or courses; or (c) aim to explore the broader landscape or build familiarity with fields or lines of research they have not previously explored. All the research presented here is collected from prior work, and I encourage readers to consult and directly cite the original sources named here.

I. Problem-clarifying work: Understanding the AI governance challenge 

Most object-level work in the field of advanced AI governance has sought to disambiguate and reduce uncertainties around relevant strategic parameters of the AI governance challenge.[ref 12]

Strategic parameters serve as highly decision-relevant or even crucial considerations, determining which interventions or solutions are appropriate, necessary, viable, or beneficial for addressing the advanced AI governance challenge. Different views of these parameters constitute underlying cruxes for different theories of action and approaches. This review discusses three types of strategic parameters:[ref 14]

Accordingly, research in this subfield includes:

1. Technical parameters 

An initial body of work focuses on mapping the relevant technical parameters of the challenge for advanced AI governance. This includes work on a range of topics relating to understanding the future technical landscape, understanding the likelihood of catastrophic risks given various specific threat models, and understanding the profile of the technical alignment problem and the prospects of it being solved by existing technical alignment research agendas.[ref 15]

1.1. Advanced AI technical landscape 

One subfield involves research to chart the future technical landscape of advanced AI systems.[ref 16] Work to map this landscape includes research on the future form, pathways, timelines, and trajectories of advanced AI.

Forms of advanced AI 

Work exploring distinct potential forms of advanced AI,[ref 17] including:

Developmental paths towards advanced AI

This includes research and debate on a range of domains. In particular, such work focuses on analyzing different hypothesized pathways towards achieving advanced AI based on different paradigms or theories.[ref 30] Note that many of these are controversial and contested, and there is pervasive disagreement over the feasibility of many (or even all) of these approaches for producing advanced AI. 

Nonetheless, some of these paradigms include programs to produce advanced AI based on:

Notably, of these approaches, recent years have seen the most sustained attention focused on the direct (scaling) approach and on whether current approaches to advanced AI, if scaled up with enough computing power or training data, will suffice to produce advanced or transformative AI capabilities. There have been various arguments both for and against this direct path. 

Advanced AI timelines: Approaches and lines of evidence

A core aim of the field is to chart the timelines for advanced AI development across the future technical development landscape.[ref 49] This research focuses on various lines of evidence,[ref 50] which are here listed in order from more abstract to more concrete and empirical, and from relying more on outside-view arguments to relying more on inside-view arguments,[ref 51] with no specific ranking on the basis of the strength of individual lines of evidence.

Outside-view analyses of timelines

Outside-view analyses of AI development timelines, including:

Judgment-based analyses of timelines

Judgment-based analyses of timelines, including:

Estimates based on (specialist) expert opinions:

Inside-view models on AI timelines

Inside-view model-based analyses of timelines, including:

Methodological debates on AI-timelines analysis

Various methodological debates around AI-timelines analysis:

Advanced AI trajectories and early warning signals

A third technical subfield aims at charting the trajectories of advanced AI development, especially the potential for rapid and sudden capability gains, and whether there will be advance warning signs:

1.2. Impact models for general social impacts from advanced AI 

Various significant societal impacts could result from advanced AI systems:[ref 95]

Potential for advanced AI systems to drive significant, even “explosive” economic growth[ref 96] but also risks of significant inequality or corrosive effects on political discourse;[ref 97]

This is an extensive field that spans a wide range of work, and the above is by no means exhaustive.

1.3. Threat models for extreme risks from advanced AI 

A second subcluster of work focuses on understanding the threat models of advanced AI risk,[ref 102] based on general arguments for risk, on specific threat models for direct catastrophe or takeover,[ref 103] or on specific threat models for indirect risks.[ref 104]

General arguments for risks from AI

Analyses that aim to explore general arguments (by analogy, on the basis of conceptual argument, or on the basis of empirical evidence from existing AI systems) over whether or why we might have grounds to be concerned about advanced AI.[ref 105]

Analogical arguments for risks

Analogies[ref 106] with historical cases or phenomena in other domains:

Analogies with known “control problems” observed in other domains:

Conceptual arguments for risks

Conceptual and theoretical arguments based on existing ML architectures: 

Conceptual and theoretical arguments based on the competitive environment that will shape the evolutionary development of AIs:

Empirical evidence for risks

Empirical evidence of unsolved alignment failures in existing ML systems, which are expected to persist or scale in more advanced AI systems:[ref 124]

Empirical examples of elements of AI threat models that have already occurred in other domains or with simpler AI systems:

Direct threat models for direct catastrophe from AI

Work focused on understanding direct existential threat models.[ref 139] This includes:

Scenarios for direct catastrophe caused by AI

Other lines of work move beyond general arguments for risk to sketch specific scenarios in and through which advanced AI systems could directly inflict existential catastrophe.

Scenario: Existential disaster because of misaligned superintelligence or power-seeking AI
Scenario: Gradual, irretrievable ceding of human power over the future to AI systems
Scenario: Extreme “suffering risks” because of a misaligned system
Scenario: Existential disaster because of conflict between AI systems and multi-system interactions 
Scenario: Dystopian trajectory lock-in because of misuse of advanced AI to establish and/or maintain totalitarian regimes
Scenario: Failures in or misuse of intermediary (non-AGI) AI systems, resulting in catastrophe
Other work: vignettes, surveys, methodologies, historiography, critiques

Threat models for indirect AI contributions to existential risk factors

Work focused on understanding indirect ways in which AI could contribute to existential threats, such as by shaping societal “turbulence”[ref 192] and other existential risk factors.[ref 193] This covers various long-term impacts on societal parameters such as science, cooperation, power, epistemics, and values:[ref 194] 

1.4. Profile of technical alignment problem

2. Deployment parameters

Another major part of the field aims to understand the parameters of the advanced AI deployment landscape by mapping the size and configuration of the “game board” of relevant advanced AI developers—the actors whose (ability to take) key decisions (e.g., around whether or how to deploy particular advanced AI systems, how much to invest in alignment research, etc.) may be key in determining risks and outcomes from advanced AI. 

As such, there is significant work on mapping the disposition of the AI development ecosystem and how this will determine who is (or will likely be) in the position to develop and deploy the most advanced AI systems. Some work in this space focuses on mapping the current state of these deployment parameters; other work focuses on the likely future trajectories of these deployment parameters over time.

2.1. Size, productivity, and geographic distribution of the AI research field 

2.2. Geographic distribution of key inputs in AI development

2.3. Organization of global AI supply chain

2.4. Dispositions and values of advanced AI developers

2.5. Developments in converging technologies

3. Governance parameters

Work on governance parameters aims to map (1) how AI systems are currently being governed, (2) how they are likely to be governed by default (given prevailing perceptions and regulatory initiatives), as well as (3) the conditions for developing and implementing productive governance interventions on advanced AI risk. 

Some work in this space focuses on mapping the current state of these governance parameters and how they affect AI governance efforts initiated today. Other work focuses on the likely future trajectories of these governance parameters.

3.1. Stakeholder perceptions of AI 

Surveys of current perceptions of AI among different relevant actors: 

Predicting future shifts in perceptions of AI among relevant actors given:

3.2. Stakeholder trust in AI developers 

3.3. Default landscape of regulations applied to AI

This work maps the prevailing (i.e., default, “business-as-usual”) landscape of regulations that will be applied to AI in the near term. These matter as they will directly affect the development landscape for advanced AI and indirectly bracket the space for any new (AI-specific) governance proposals.[ref 245] This work includes:

3.4. Prevailing barriers to effective AI governance

3.5. Effects of AI systems on tools of governance

Predicting the impact of future technologies on governance and the ways these could shift the frontier of which kinds of regimes will be politically viable and enforceable:

4. Other lenses on the advanced AI governance problem

Other work aims to derive key strategic lessons for advanced AI governance, not by aiming to empirically map or estimate first-order facts about the key (technical, deployment, or governance) strategic parameters, but rather by drawing indirect (empirical, strategic, and/or normative) lessons from abstract models, historical cases, and/or political theory.

4.1. Lessons derived from theory

Work characterizing the features of advanced AI technology and of its governance challenge, drawing on existing literatures or bodies of theory:

Mapping clusters and taxonomies of AI’s governance problems:

Mapping the political features of advanced AI technology:

Mapping the structural features of the advanced AI governance challenge:

Identifying design considerations for international institutions and regimes, from:

4.2. Lessons derived from models and wargames

Work to derive or construct abstract models of AI governance and to draw lessons from them for understanding AI systems’ proliferation and societal impacts. This includes models of:

4.3. Lessons derived from history

Work to identify and study relevant historical precedents, analogies, or cases and to derive lessons for (AI) governance.[ref 299] This includes studies where historical cases have been directly applied to advanced AI governance as well as studies where the link has not been drawn but which might nevertheless offer productive insights for the governance of advanced AI.

Lessons from the history of technology development and spread

Historical cases that (potentially) provide insights into when, why, and how new technologies are pursued and developed—and how they subsequently (fail to) spread.

Historical rationales for technology pursuit and development

Historical rationales for actors pursuing large-scale scientific or technology development programs:

Historical strategies of deliberate large-scale technology development projects

Historical strategies for unilateral large-scale technology project development:

Historical strategies for joint or collaborative large-scale technology development:

Historical instances of sudden, unexpected technological breakthroughs

Historical cases of rapid, historically discontinuous breakthroughs in technological performance on key metrics:

Historical patterns in technological proliferation and take-up

Historical cases of technological proliferation and take-up:[ref 321]

Lessons from the historical societal impacts of new technologies

Historical cases that (potentially) provide insights into when, why, and how new technologies can have (unusually) significant societal impacts or pose acute risks.

Historical cases of large-scale societal impacts from new technologies

Historical cases of large-scale societal impacts from new technologies:[ref 331]

Historical cases of particular dangers or risks from new technologies

Historical precedents for particular types of dangers or threat models from technologies:

Historical cases of value changes as a result of new technologies

Historical precedents for technologically induced value erosion or value shifts: 

Historical cases of the disruptive effects on law and governance from new technologies

Historical precedents for effects of new technology on governance tools: 

Lessons from the history of societal reactions to new technologies

Historical cases that (potentially) provide insights into how societies are likely to perceive, react to, or regulate new technologies.

Historical reactions to and regulations of new technologies 

Historical precedents for how key actors are likely to view, treat, or regulate AI:

Lessons from the history of attempts to initiate technology governance

Historical cases that (potentially) provide insights into when efforts to initiate governance intervention on emerging technologies are likely to be successful and into the efficacy of various pathways towards influencing key actors to deploy regulatory levers in response.

Historical failures to initiate or shape technology governance 

Historical cases where a fear of false positives slowed (plausibly warranted) regulatory attention or intervention: 

Historical cases of excessive hype leading to (possibly) premature regulatory attention or intervention: 

Historical successes for pathways in shaping technology governance 

Historical precedents for successful action towards understanding and responding to the risks of emerging technologies, influencing key actors to deploy regulatory levers:

Lessons from the historical efficacy of different governance levers

Historical cases that (potentially) provide insights into when different societal (legal, regulatory, and governance) levers have proven effective in shaping technology development and use in desired directions.

Historical failures of technology governance levers 

Historical precedents for failed or unsuccessful use of various (domestic and/or international) governance levers for shaping technology:

Historical successes of technology governance levers 

Historical precedents for successful use of various governance levers at shaping technology:

4.4. Lessons derived from ethics and political theory 

Mapping the space of principles or criteria for “ideal AI governance”:[ref 451]

II. Option-identifying work: Mapping actors and affordances

Strategic clarity requires an understanding not just of the features of the advanced AI governance problem, but also of the options in response. 

This entails mapping the range of possible levers that could be used in response to this problem. Critically, this is not just a matter of speculating about which governance tools we may want in place for future advanced AI systems mid-transition (after they have arrived). Rather, there may be actions we could take in the “pre-emergence” stage to prepare ourselves adequately.[ref 456] 

Within the field, there has been extensive work on options and areas of intervention. Yet there is no clear, integrated map of the advanced AI governance landscape and its gaps. Sam Clarke proposes that there are different ways of carving up the landscape, such as by types of interventions, by geographic hubs, or by “Theories of Victory.”[ref 457] To extend this, one might segment the advanced AI governance solution space along work which aims to identify and understand, in turn:[ref 458]

1. Potential key actors shaping advanced AI

In other words, whose decisions might especially affect the development and deployment of advanced AI, directly or indirectly, such that these decisions should be shaped to be as beneficial as possible?

Some work in this space explores the relative importance of (the decisions of) different types of key actors: 

Other work focuses more specifically on mapping particular key actors whose decisions may be particularly important in shaping advanced AI outcomes, depending on one’s view of strategic parameters. 

The following list should be read more as a “landscape” review than a literature review, since coverage of the different actors varies amongst papers. Moreover, while the list aims to be relatively inclusive, the (absolute and relative) importance of each of these actors differs hugely between worldviews and approaches.

1.1. AI developer (lab & tech company) actors 

Leading AI firms pursuing AGI: 

Chinese labs and institutions researching “general AI”:

Large tech companies[ref 472] that may take an increasingly significant role in AGI research:

Future frontier labs, not currently known, that may yet be established or achieve prominence (e.g., “Magma”[ref 473]).

1.2. AI services & compute hardware supply chains 

AI services supply chain actors:[ref 474] 

Hardware supply chain industry actors:[ref 476] 

1.3. AI industry and academic actors 

Industry bodies:

Standard-setting organizations:

Software tools & community service providers:

Academic communities:

Other active tech community actors:

1.4. State and governmental actors 

Various states, and their constituent (government) agencies or bodies, that are in, plausibly will be in, or could be moved into powerful positions to shape the development of advanced AI.

The United States

Key actors in the US:[ref 489] 

China

Key actors in China:[ref 497]

The EU

Key actors in the EU:[ref 500]

The UK

Key actors in the UK:[ref 503]

Other states with varying roles

Other states that may play key roles because of their general geopolitical influence, AI-relevant resources (e.g., compute supply chain and significant research talent), or track record as digital norm setters: 

1.5. Standard-setting organizations

International standard-setting institutions:[ref 513] 

1.6. International organizations

Various United Nations agencies:[ref 515]

Other international institutions already engaged on AI in some capacity[ref 522] (in no particular order):

Other international institutions not yet engaged on AI:

1.7. Public, civil society & media actors 

Civil society organizations:[ref 532] 


Media actors:

Cultural actors: 

2. Levers of governance (for each key actor)

That is, how might each key actor shape the development of advanced AI?

Research in this field includes analysis of different types of tools (key levers or interventions) available to different actors to shape advanced AI development and use.[ref 541]

2.1. AI developer levers

Developer (intra-lab)-level levers:[ref 542]

Developer external (unilateral) levers:

2.2. AI industry & academia levers 

Industry-level (coordinated inter-lab) levers:

Third-party industry actors levers:

Scientific community levers:

2.3. Compute supply chain industry levers

Global compute industry-level levers:[ref 584] 

2.4. Governmental levers

We can distinguish between general governmental levers and the specific levers available to particular key states.

General governmental levers[ref 586]

Legislatures’ levers:[ref 587]

Executive levers:


Judiciaries’ levers:


Expert agencies’ levers:


Ancillary institutions:

Foreign Ministries/State Department:


Specific key governments’ levers

Levers available to specific key governments:

US-specific levers:[ref 616] 


EU-specific levers: 

China-specific levers:


UK-specific levers:[ref 632] 


2.5. Public, civil society & media actor levers

Civil Society/activist movement levers:[ref 633]

2.6. International organizations and regime levers 

International standards bodies’ levers:

International regime levers:[ref 647]

2.7. Future, new types of institutions and levers

Novel governance institutions and innovations:

3. Pathways to influence (on each key actor)

That is, how might concerned stakeholders ensure that key actors use their levers to shape advanced AI development in appropriate ways?

This includes research on the different pathways by which the use of these above levers might be enabled, advocated for, and implemented (i.e., the tools available to affect the decisions by key actors).

This can draw on mappings and taxonomies such as “A Map to Navigate AI Governance”[ref 660] and “The Longtermist AI Governance Landscape.”[ref 661] 

3.1. Pathways to directly shaping advanced AI systems’ actions through law

Directly shaping advanced AI actions through law (i.e., legal systems and norms as an anchor or lodestar for technical alignment approaches):

3.2. Pathways to shaping governmental decisions

Shaping governmental decisions around AI levers at the level of:

3.3. Pathways to shaping court decisions

Shaping court decisions around AI systems that set critical precedent for the application of AI policy to advanced AI:

3.4. Pathways to shaping AI developers’ decisions

Shaping individual lab decisions around AI governance:

Shaping industry-wide decisions around AI governance:

3.5. Pathways to shaping AI research community decisions

Shaping AI research community decisions around AI governance:

Shaping civil society decisions around AI governance:

3.6. Pathways to shaping international institutions’ decisions

Shaping international institutional decisions around AI governance:

Shaping standards bodies’ decisions around AI governance:

3.7. Other pathways to shape various actors’ decisions

Shaping various actors’ decisions around AI governance:

III. Prescriptive work: Identifying priorities and proposing policies

Finally, a third category of work aims to go beyond analyzing the problem of AI governance (Part I) or surveying potential elements or options for governance solutions (Part II). This category is instead prescriptive: it aims to directly propose or advocate for specific policies or actions by key actors. This includes work focused on: 

  1. Articulating broad theories of change to identify priorities for AI governance (given a certain view of the problem and of the options available); 
  2. Articulating broad heuristics for crafting good AI regulation; 
  3. Putting forward policy proposals as well as assets that aim to help in their implementation.

1. Prioritization: Articulating theories of change

Achieving an understanding of the AI governance problem and the potential options in response is valuable. Yet this alone does not deliver strategic clarity about which actors should be approached or which levers should be utilized in what ways. For that, it is necessary to develop more systematic accounts of the different (currently held or possible) theories of change or impact. 

The idea of exploring and comparing such theories of action is not new. There have been various accounts that aim to articulate the linkages between near-term actions and longer-term goals. Some of these have focused primarily on theories of change (or “impact”) from the perspective of technical AI alignment.[ref 705] Others have articulated more specific theories of impact for the advanced AI governance space.[ref 706] These include:

In addition, some have articulated specific scenarios for what successful policy action on advanced AI might look like,[ref 711] especially in the relatively near-term future (“AI strategy nearcasting”).[ref 712] However, much further work is needed.

2. General heuristics for crafting advanced AI policy 

General heuristics for making policies relevant or actionable to advanced AI.

2.1. General heuristics for good regulation

Heuristics for crafting good AI regulation:

2.2. Heuristics for good institutional design 

Heuristics for good institutional design:

2.3. Heuristics for future-proofing governance 

Heuristics for future-proofing governance regimes and desiderata and systems for making existing regulations more adaptive, scalable, or resilient:[ref 721]

3. Policy proposals, assets and products

That is, what are specific proposals for policies to be implemented? How can these proposals serve as products or assets in persuading key actors to act upon them?

These are specific proposals for advanced-AI-relevant policies, presented without comparison or prioritization. The list is non-exhaustive, and many proposals combine several ideas and so fall into more than one category.

3.1. Overviews and collections of policies

3.2. Proposals to regulate AI using existing authorities, laws, or institutions

In particular, drawing on evaluations of the default landscape of regulations applied to AI (see Section I.3.3), and of the levers of governance for particular governments (see Section II.2.4).

Regulate AI using existing laws or policies

Proposals to set soft-law policy through existing international processes

3.3. Proposals for new policies, laws, or institutions 

A range of proposals for novel policies.

Impose (temporary) pauses on AI development

Establish licensing regimes

Establish lab-level safety practices

Establish governance regimes on AI inputs (compute, data)

Establish domestic institutions for AI governance

Establish international AI research consortia

Proposals to establish new international hubs or organizations aimed at AI research.[ref 770]

Establish bilateral agreements and dialogues

Establish multilateral international agreements 

Proposals to establish a new multilateral treaty on AI:[ref 784]

Establish international governance institutions

Proposals to establish a new international organization, along one or several models:[ref 791]

Conclusion

The recent advances in AI have turned global public attention to this technology’s capabilities, impacts, and risks. AI’s significant present-day impacts, and the prospect that they will spread and scale further as systems become increasingly advanced, have firmly established this technology as a preeminent challenge for law and global governance this century. 

In response, the disparate community of researchers that has explored aspects of these questions over the past years may increasingly be called upon to translate that research into rigorous, actionable, legitimate, and effective policies. These researchers have developed, and continue to produce, a remarkably far-flung body of research drawing on a diverse range of disciplines and methodologies. The urgency of action around advanced AI accordingly creates a need for the field to increase the clarity of its work and its assumptions, to identify gaps in its approaches and methodologies where it can learn from yet more disciplines and communities, to improve coordination amongst lines of research, and to make its arguments more legible to constructive scrutiny and evaluation of key claims and proposed policies. 

No single document or review can fully achieve these goals, and this one does not claim to. Yet by attempting to distill and disentangle key areas of scholarship, analysis, and policy advocacy, it aims to contribute to greater analytical and strategic clarity, more focused and productive research, and better-informed public debates and policymaker initiatives on the critical global challenges of advanced AI.


Also in this series

AI is like… A literature review of AI metaphors and why they matter for policy

Executive summary

This report provides an overview, taxonomy, and preliminary analysis of the role of basic metaphors and analogies in AI governance. 

Aim: The aim of this report is to contribute to improved analysis, debate, and policy for AI systems by providing greater clarity around the way that analogies and metaphors can affect technology governance generally, around how they may shape AI governance, and about how to improve the processes by which some analogies or metaphors for AI are considered, selected, deployed, and reviewed.

Summary: In sum, this report:

  1. Draws on technology law scholarship to review five ways in which metaphors or analogies exert influence throughout the entire cycle of technology policymaking by shaping:
    1. patterns of technological innovation; 
    2. the study of particular technologies’ sociotechnical impacts or risks; 
    3. which of those sociotechnical impacts make it onto the regulatory agenda; 
    4. how those technologies are framed within the policymaking process in ways that highlight some issues and policy levers over others; and 
    5. how these technologies are approached within legislative and judicial systems. 
  2. Illustrates these dynamics with brief case studies where foundational metaphors shaped policy for cyberspace, as well as for recent AI issues. 
  3. Provides an initial atlas of 55 analogies for AI, which have been used in expert, policymaker, and public debate to frame discussion of AI issues, and discusses their implications for regulation.
  4. Reflects on the risks of adopting unreflexive analogies and misspecified (legal) definitions.

Below, the reviewed analogies are summarized in Table 1.

Table 1: Overview of surveyed analogies for AI (brief, without policy implications)

Theme: Essence (terms focusing on what AI is)
Frames: Field of science; IT technology (just better algorithms, AI as a product); Information technology; Robots (cyber-physical systems, autonomous platforms); Software (AI as a service); Black box; Organism (artificial life); Brain; Mind (digital minds, idiot savant); Alien (shoggoth); Supernatural entity (god-like AI, demon); Intelligence technology (markets, bureaucracies, democracies); Trick (hype)

Theme: Operation (terms focusing on how AI works)
Frames: Autonomous system; Complex adaptive system; Evolutionary process; Optimization process; Generative system (generative AI); Technology base (foundation model); Agent; Pattern-matcher (autocomplete on steroids, stochastic parrot); Hidden human labor (fauxtomation)

Theme: Relation (terms focusing on how we relate to AI, as (possible) subject)
Frames: Tool (just technology); Animal; Moral patient; Moral agent; Slave; Legal entity (digital person, electronic person, algorithmic entity); Culturally revealing object (mirror to humanity, blurry JPEG of the web); Frontier (frontier model); Our creation (mind children); Next evolutionary stage or successor

Theme: Function (terms focusing on how AI is or can be used)
Frames: Companion (social robots, care robots, generative chatbots, cobot); Advisor (coach, recommender, therapist); Malicious actor tool (AI hacker); Misinformation amplifier (computational propaganda, deepfakes, neural fake news); Vulnerable attack surface; Judge; Weapon (killer robot, weapon of mass destruction); Critical strategic asset (nuclear weapons); Labor enhancer (steroids, intelligence forklift); Labor substitute; New economic paradigm (fourth industrial revolution); Generally enabling technology (the new electricity / fire / internal combustion engine); Tool of power concentration or control; Tool for empowerment or resistance (emancipatory assistant); Global priority for shared good

Theme: Impact (terms focusing on the unintended risks, benefits, or side-effects of AI)
Frames: Source of unanticipated risks (algorithmic black swan); Environmental pollutant; Societal pollutant (toxin); Usurper of human decision-making authority; Generator of legal uncertainty; Driver of societal value shifts; Driver of structural incentive shifts; Revolutionary technology; Driver of global catastrophic or existential risk

Introduction

Everyone loves a good analogy, much as they love a good internet meme: quick, relatable, shareable,[ref 1] memorable, and good for communicating complex topics to family.

Background: As AI systems have become increasingly capable and have had increasingly public impacts, there has been significant public and policymaker debate over the technology. Given the breadth of the technology’s application, many of these discussions have come to deploy, and contest, a dazzling range of analogies, metaphors, and comparisons for AI systems in order to understand, frame, or shape the technology’s impacts and their regulation.[ref 2] Yet the speed with which many jump to invoke particular metaphors, or to contest the accuracy of others, leads to frequent confusion over these analogies, how they are used, and how they are best evaluated or compared.[ref 3] 

Rationale: Such debates are not just about wordplay—metaphors matter. Framings, metaphors, analogies, and (at the most specific end) definitions can strongly affect many key stages of the world’s response to a new technology, from the initial developmental pathways for technology, to the shaping of policy agendas, to the efficacy of legal frameworks.[ref 4] They have done so consistently in the past, and we have reason to believe they will especially do so for (advanced) AI. Indeed, recent academic, expert, public, and legal contests around AI often already strongly turn on “battles of analogies.”[ref 5] 

Aim: Given this, there is a need for those speaking about AI to better understand (a) when they speak in analogies—that is, when the ways in which AI is described (inadvertently) import one or more foundational analogies; (b) what it does to utilize one or another metaphor for AI; (c) what different analogies could be used instead; (d) how the appropriateness of one or another metaphor is best evaluated; and (e) what, given this, might be the limits or risks of jumping at particular analogies. 

This report aims to respond to these questions and contribute to improved analysis, debate, and policy by providing greater clarity around the role of metaphors in AI governance, the range of possible (alternate) metaphors, and good practices in constructing and using metaphors. 

Caveats: The aim here is not to argue against the use of any analogies in AI policy debates, if that were even possible. Nor is it to prescribe (or dismiss) one or another metaphor for AI as “better” (or “worse”) per se. The point is not that one particular comparison is the best and should be adopted by all, or that another is “obviously” flawed. Indeed, in some sense, a metaphor or analogy cannot be “wrong,” only more or less tenuous, and more or less suitable when considered from the perspective of particular values or (regulatory) purposes. As such, different metaphors may work best in different contexts. Given this, this report highlights the diversity of analogies in current use and provides context for more informed future discourse and policymaking. 

Terminology: Strictly speaking, there is a difference between a metaphor—“an implied comparison between two things of unlike nature that yet have something in common”—and an analogy—“a non-identical or non-literal similarity comparison between two things, with a resulting predictive or explanatory effect.”[ref 6] However, while in legal contexts the two can be used in slightly different ways, cognitive science suggests that humans process information by metaphor and by analogy in similar ways.[ref 7] As a result, within this report, “analogy” and “metaphor” will be used relatively interchangeably to refer to (1) communicated framings of an (AI) issue that describe that issue (2) through terms, similes, or metaphors which rely on, invoke, or import references to a different phenomenon, technology, or historical event, which (3) is (assumed to be) comparable in one or more ways (e.g., technical, architectural, political, or moral) (4) that are relevant to evaluating or responding to the (AI) issue at hand. Furthermore, the report will use the term “foundational metaphor” for cases where a particular metaphor for the technology has become so deeply established and embedded within larger policy programs that its nature as a metaphor may become unclear.

Structure: Accordingly, this report now proceeds as follows. In Part I, it discusses why and how definitions matter to both the study and practice of AI governance. It reviews five ways in which analogies or definitions can shape technology policy generally. To illustrate this, Part II reviews a range of cases in which deeply ingrained foundational metaphors have shaped internet policy as well as legal responses to various AI uses. In Part III, this report provides an initial atlas of 55 different analogies that have been used for AI in recent years, along with some of their regulatory implications. Part IV briefly discusses the risks of using analogies in unreflexive ways.

I. How metaphors shape technology governance

Given the range of disciplinary backgrounds represented in debates over AI, we should not be surprised that the technology is perceived and understood in many different ways. 

Nonetheless, clarity matters: terminological and analogical framing effects occur at every stage of the cycle from technological development to societal response. They can shape the initial development processes for technologies as well as the academic fields and programs that study their impacts.[ref 8] Moreover, they can shape both the policymaking process and the downstream judicial interpretation and application of legislative texts.

1. Metaphors shape innovation

Metaphors and analogies are strongly rooted in human psychology.[ref 9] Even some nonhuman animals think analogically.[ref 10] Indeed, human creativity has even been defined as “the capacity to see or interpret a problematic phenomenon as an unexpected or unusual instance of a prototypical pattern already in one’s conceptual repertoire.”[ref 11]

Given this, metaphors and analogies can shape and constrain the ability of humans to collectively create new things.[ref 12] In this way, technology metaphors can affect the initial human processes of invention and investment that drive the development of AI and other technologies in the first place. It has been suggested that foundational metaphors can influence the organization and direction of scientific fields—and even that all scientific frameworks could to some extent be viewed as metaphors.[ref 13] For example, the fields of cell biology and biotechnology have for decades been shaped by the influential foundational metaphor that sees biological cells as “machines,” which has led to sustained debates over the scientific use and limits of that analogy in shaping research programs.[ref 14] 

More practically, at the development and marketing stage, metaphors can shape how consumers and investors assess proposed startup ideas[ref 15] and which innovation paths attract engineer, activist, and policymaking interest and support. In some such cases, metaphors can support and spur on innovation; for instance, it has been argued that through the early 2000s, the coining of specific IT metaphors for electric vehicles—as a “computer on wheels”—played a significant role in sustaining engineer support for and investment in this technology, especially during an industry downturn in the wake of General Motors’ sudden cancellation of its EV1 electric car.[ref 16] 

Conversely, metaphors can also hold back or inhibit certain pathways of innovation. For instance, in the Soviet Union in the early 1950s, the field of cybernetics (along with fields such as genetics and linguistics) fell victim to anti-American campaigns, which characterized it as “an ‘obscurantist’, ‘bourgeois pseudoscience’”.[ref 17] While this did not affect the early development of Soviet computer technology (which was highly prized by the state and the military), the resulting ideological rejection of the “man-machine” analogy by Marxist-Leninist philosophers led to an ultimately dominant view, in Soviet science, of computers as solely “tools to think with” rather than “thinking machines.” This held back the consolidation of the field (the label “AI” would not be recognized by the Soviet Academy of Sciences until 1987) and shifted research attention toward the “situational management” of large complex systems rather than the pursuit of human-like thinking machines.[ref 18] This stood in contrast to US research programs such as DARPA’s 1983–1993 Strategic Computing Initiative, an extensive $1 billion program to achieve “machines that think.”[ref 19]

2. Metaphors inform the study of technologies’ impacts

Particular definitions also shape and prime academic fields that study the impacts of these technologies (and which often may uncover or highlight particular developments as issues for regulation). Definitions affect which disciplines are drawn to work on a problem, what tools they bring to hand, and how different analyses and fields can build on one another. For instance, it has been argued that the analogy between software code and legal text has supported greater and more productive engagement by legal scholars and practitioners with such code at the level of its (social) meaning and effects (rather than narrowly on the level of the techniques used).[ref 20] Given this, terminology can affect how AI governance is organized as a field of analysis and study, what methodologies are applied, and what risks or challenges are raised or brought up.

3. Metaphors set the regulatory agenda 

More directly, particular definitions or frames for a technology can set and shape the policymaking agenda in various ways. 

For instance, terms and frames can raise (or suppress) policy attention for an issue, affecting whether policymakers or the public care (enough) about a complex and often highly technical topic in the first place to take it up for debate or regulation. It has been argued, for example, that framings focusing on the visceral nature of the injuries inflicted by a new weapon system boosted past international campaigns to ban blinding lasers and antipersonnel mines, yet proved less successful in spurring effective advocacy around “killer robots.”[ref 21]

Moreover, metaphors—and especially specific definitions—can shape (government) perceptions of the empirical situation or state of play around a given issue. For instance, the particular definition used for “AI” can directly affect which (industrial or academic) metrics are used to evaluate different states’ or labs’ relative achievements or competitiveness in developing the technology. In turn, that directly shapes downstream evaluations of which nation is “ahead” in AI.[ref 22] 

Finally, terms can frame the relevant legal actors and policy coalitions, enabling (or inhibiting) inclusion and agreement at the level of interest or advocacy groups that push for (or against) certain policy goals. For instance, the choice of particular terms or framings that meet with broad agreement or acceptance amongst many actors can make it easier for a diverse set of stakeholders to join together in pushing for regulatory actions. Such agreement may be fostered by definitional clarity (when terms or frames are transparent and meet with wider acceptance) or by definitional ambiguity, when a broad term (such as “ethical AI”) allows sufficient ambiguity that different actors can meet on an “incompletely theorized agreement”[ref 23] to pursue a shared policy program on AI.

4. Metaphors frame the policymaking process

Terms can have a strong overall effect on policy issue-framing, foregrounding different problem portfolios as well as regulatory levers. For instance, early societal debates around nanotechnology were significantly influenced by analogies with asbestos and genetically modified organisms.[ref 24]

Likewise, regulatory initiatives that frame AI systems as “products” imply that these fit easily within product safety frameworks—even though that may be a poor or insufficient model for AI governance, for instance because it fails to address risks arising at the developmental stage[ref 25] or because it cannot adequately capture fuzzier impacts on fundamental rights that are not easily classified as consumer harms.[ref 26]

This is not to say that the policy-shaping influence of terms (or explicit metaphors) is absolute and irrevocable. For instance, in a different policy domain, a 2011 study found that using metaphors that described crime as a “beast” led study participants to recommend law-and-order responses, whereas describing it as a “virus” led them to put more emphasis on public-health-style policies. However, even under the latter framing, law-and-order policy responses still prevailed, simply commanding a smaller majority than they would otherwise.[ref 27] 

Nonetheless, metaphors do exert sway throughout the policymaking process. For instance, they can shape perceptions of the feasibility of regulation by certain routes. As an example, framings of digital technologies that emphasize certain traits of technologies—such as the “materiality” or “seeming immateriality,” or the centralization or decentralization, of technologies like submarine cables, smart speakers, search engines, or the bitcoin protocol—can strongly affect perceptions of whether, or by what routes, it is most feasible to regulate that technology at the global level.[ref 28] 

Likewise, different analogies or historical comparisons for proposed international organizations for AI governance—ranging from the IAEA and IPCC to the WTO or CERN—often import tacit analogical comparisons (or rather constitute “reflected analogies”) between AI and those organizations’ subject matter or mandates in ways that shape the perceptions of policymakers and the public regarding which of AI’s challenges require global governance, whether or which new organizations are needed, and whether the establishment of such organizations will be feasible.[ref 29]

5. Metaphors and analogies shape the legislative & judicial response to tech

Finally, metaphors, broad analogies, and specific definitions can frame legal and judicial treatment of a technology in both the ex ante application of AI-focused regulations and the ex post subsequent judicial interpretation of either such AI-specific legislation or of general regulations in the context of cases involving AI. 

Indeed, much of legal reasoning, especially in court systems, and especially in common law jurisdictions, is deeply analogical.[ref 30] This is for various reasons.[ref 31] For one, legal actors are also human, and strong features of human psychology can skew these actors towards the use of analogies that refer to known and trusted categories. As Mandel has argued, “availability and representativeness heuristics lead people to view a new technology and new disputes through existing frames, and the status quo bias similarly makes people more comfortable with the current legal framework.”[ref 32] This is particularly the case because much of legal scholarship and work aims to be “problem-solving” rather than “problem-finding”[ref 33] and to respond to new problems by appealing to pre-existent (ethical or legal) principles, norms, values, codes, or laws.[ref 34] Moreover, from an administrative perspective, it is often easier and more cost-effective to extend existing laws by analogy.

Finally, and more fundamentally, the resort to analogy by legal actors can be a shortcut that aims to apply the law, and solve a problem, through an “incompletely theorized agreement” that does not require reopening contentious questions or debates over the first principles or ultimate purposes of the law,[ref 35] or renegotiating hard-struck legislative agreements. This is especially the case at the level of international law, where either negotiating new treaties or explicitly amending multilateral treaties to incorporate a new technology within an existing framework can be fraught, drawn-out processes,[ref 36] such that many actors may prefer ultimately addressing new issues (such as cyberwar) within existing norms or principles by analogizing them to well-established and well-regulated behaviors.[ref 37]

Given this, when confronted with situations of legal uncertainty—as often happens with a new technology[ref 38]—legal actors may favor the use of analogies to stretch existing law or to interpret new cases as falling within existing doctrine. That does not mean that courts must immediately settle or converge on one particular “right” analogy. Indeed, multiple analogies are always possible, and these can have significantly different implications for how the law is interpreted and applied. As a result, many legal cases involving technology turn on so-called “battles of analogies.”[ref 39] For example, in recent class action lawsuits accusing the providers of generative AI systems such as Stable Diffusion and Midjourney of copyright infringement, plaintiffs have argued that these generative AI models are “essentially sophisticated collage tools, with the output representing nothing more than a mash-up of the training data, which is itself stored in the models as compressed copies.”[ref 40] Some have countered that this analogy suffers from technical inaccuracies, since current generative AI models do not store compressed copies of the training data; a better analogy, they suggest, would be that of an “art inspector” that takes every measurement possible—implying that model training either is not governed by copyright law or constitutes fair use.[ref 41]

Finally, even if specific legislative texts adopt clear, specific statutory definitions for AI—avoiding (explicit) comparison or analogy with other technologies or behavior—this may not entirely avoid framing effects. Most obviously, legislative definitions for key terms such as “AI” affect the material scope of the regulations and policies that use them.[ref 42] Indeed, particular definitions have impacts on regulation not only ex ante but also ex post: in many jurisdictions, legal terms are interpreted and applied by courts based on their widely shared “ordinary meaning.”[ref 43] This means, for instance, that regulations that refer to terms such as “advanced AI,” “frontier AI,” or “transformative AI”[ref 44] might not necessarily be interpreted or applied in ways that are in line with how those terms are understood within expert communities.[ref 45]

All of this underscores the importance of our choice of terms and frames—whether broad and indirect metaphors or concrete and specific legislative definitions—when grappling with the impacts of this technology on society.

II. Foundational metaphors in technology law: Cases

Of course, these dynamics are not new and have been studied in depth in fields such as cyberlaw, law and technology, and technology law.[ref 46] For instance, we can see many of these framing dynamics within societal (and regulator) responses to other cornerstone digital technologies. 

1. Metaphors in internet policy: Three cases

For instance, for the complex sociotechnical system[ref 47] commonly called the internet, foundational metaphors have strongly shaped regulatory debates, at times as much as sober assessments of the nuanced technical details of the artifacts involved have.[ref 48] As noted by Rebecca Crootof: 

“A ‘World Wide Web’ suggests an organically created common structure of linked individual nodes, which is presumably beyond regulation. The ‘Information Superhighway’ emphasizes the import of speed and commerce and implies a nationally funded infrastructure subject to federal regulation. Meanwhile, ‘cyberspace’ could be understood as a completely new and separate frontier, or it could be viewed as yet one more kind of jurisdiction subject to property rules and State control.”[ref 49]

For example, different terms (and the foundational metaphors they entail) have come to shape internet policy in various ways and domains. Take for instance the following cases: 

Institutional effects of framing cyberwar policy within cyber-“space”: For over a decade, the US military framed the internet and related systems as a “cyberspace”—that is, just another “domain” of conflict along with land, sea, air, and space—leading to strong consequences institutionally (expanding the military’s role in cybersecurity and supporting the creation of US Cyber Command) as well as for how international law has subsequently been applied to cyber operations.[ref 50] 

Issue-framing effects of regulating data as “oil,” “sunlight,” “public utility,” or “labor”: Different metaphors for “data” have drastically different political and regulatory implications.[ref 51] The oil metaphor emphasizes data as a valuable traded commodity that is owned by whoever “extracts” it and that, as a key resource in the modern economy, can be a source of geopolitical contestation between states. However, the oil metaphor implies that the history of data prior to its collection is not relevant and so sidesteps questions of any “misappropriation or exploitation that might arise from data use and processing.”[ref 52] Moreover, even within a regulatory approach that emphasizes geopolitical competition over AI, one can still critique the “oil” metaphor as misleading, for instance because of the ways in which it skews debates over how to assess “data competitiveness” in military AI.[ref 53] By contrast, the sunlight metaphor emphasizes data as a ubiquitous public resource that ought to be widely pooled and shared for social good, de-emphasizing individual data privacy claims; the public utility metaphor sees data as an “infrastructure” that requires public investment and new institutions, such as data trusts or personal data stores, to guarantee “data stewardship”; and the labor frame asserts the ownership rights of the individuals generating data against what are perceived as extractive or exploitative practices of “surveillance capitalism.”[ref 54]

Judicial effects of treating search engines as “newspaper editorials” in censorship cases: In the mid-2000s, US court rulings involving censorship on search engines tended to analyze them by analogy to older technologies such as the newspaper editorial.[ref 55]

As these examples suggest, different terms and their metaphors matter. They serve as intuition pumps for key audiences (public, policy) that may otherwise have significant disinterest in, lack of expertise in, inferential distance to, or limited bandwidth for new technologies. Moreover, as seen in social media platforms’ and online content aggregators’ resistance to being described as “media companies” rather than “technology companies,”[ref 56] even seemingly innocuous terms can carry significant legal and policy implications. Such terms can serve as a legal “sorter,” determining whether a technology (or the company developing and marketing it) falls into one or another regulatory category.[ref 57]

2. Metaphors in AI law: Three cases

Given how strongly metaphors and definitions can shape the direction and efficacy of technology law, we should expect them to play a similarly strong role in framing the approach of AI regulation in the future, for better or worse. Indeed, in a range of domains, they already have:

Autonomous weapons systems under international law: International lawyers often aim to subsume new technologies under (more or less persuasive) analogies to existing technologies or entities that are already regulated.[ref 58] As such, analogies have been drawn between autonomous weapons systems and weapons, combatants, child soldiers, or animal combatants—each leading to very different consequences for their legality under international humanitarian law.[ref 59]

Release norms for AI models with potential for misuse: In debates over the potential misuse risks of emerging AI systems, efforts to restrict or slow the publication of new systems have been challenged by framings that pitch the field of AI as intrinsically an open science (where new findings should be shared whatever the risks) or that emphasize analogies to cybersecurity (where disclosure can help defenders protect against exploits). Critically, however, both of these analogies may misstate or underappreciate the dynamics that affect the offense-defense balance of new AI capabilities: while in information security the disclosure of software vulnerabilities has traditionally favored defense, this cannot be assumed for AI research, where, among other reasons, it can be much more costly or intractable to “patch” the social vulnerabilities exploited by AI capabilities.[ref 60]

Liability for inaccurate or unlawful speech produced by AI chatbots, large language models, and other generative AI: In the US, Section 230 of the 1996 Communications Decency Act protects online service providers from liability for user-generated content that they host and has accordingly been considered a cornerstone of the business model of major online platforms and social media companies.[ref 61] For instance, in Spring 2023, the US Supreme Court took up two lawsuits—Gonzalez v. Google and Twitter v. Taamneh—which could have shaped Section 230 protections for algorithmic recommendations.[ref 62] While the Court’s rulings on these cases avoided addressing the issue,[ref 63] similar court cases (or legislation) could have strong implications for whether digital platforms or social media companies will be held liable for unlawful speech produced by AI chatbots based on large language models.[ref 64] If such AI chatbots are analogized to existing search engines, they might be able to rely on a measure of protection from Section 230, greatly facilitating their deployment, even if they link to inaccurate information. Conversely, if these chatbot systems are considered so novel and creative that their output goes beyond the functions of a search engine, they might instead be considered “information content providers” within the remit of the law—or simply held to be beyond the law’s remit (and protection) entirely.[ref 65] Technology companies would then be held legally responsible for their AI’s outputs, a reading of the law that would significantly restrict the profitability of many AI chatbots, given the tendency of the underlying LLMs to “hallucinate” facts.[ref 66]

All this again highlights that different definitions or terms for AI will frame how policymakers and courts understand the technology. This creates a challenge for policy, which must address the transformative impact and potential risks of AI as they are (and as they may soon be), and not only as they can be easily analogized to other technologies and fields. What does that mean in the context of developing AI policy in the future?

III. An atlas of AI analogies

The development of policy must contend with the lack of settled definitions for the term “AI,” with the varied concepts and ideas projected onto it, and with the pace at which new terms—from “foundation models” to “generative AI”—are coined and adopted.[ref 67]

Indeed, this breadth of analogies coined around AI should not be surprising, given that even the term “artificial intelligence” itself has a number of aspects that support conceptual fluidity (or, alternately, confusion).[ref 68] In the first place, it invokes a word—“intelligence”—which is in widespread and everyday use, and which for many people carries strong (evaluative or normative) connotations. It is essentially a suitcase word that packages together many competing meanings,[ref 69] even while it hides deep and perhaps even intractable scientific and philosophical disagreement[ref 70] and significant historical and political baggage.[ref 71]

Secondly, and in contrast to, say, “blockchain ledgers,” AI technology comes with the baggage of decades of depictions in popular culture—and indeed centuries of preceding stories about intelligent machines[ref 72]—resulting in a whole genre of tropes and narratives that can color public perceptions and policymaker debates.

Thirdly, AI is an evocative general-purpose technology that sees use in a wide variety of domains and accordingly has provoked commentary from virtually every disciplinary angle, including neuroscience, philosophy, psychology, law, politics, and ethics. As a result of this, a persistent challenge in work on AI governance—and indeed, in the broader public debates around AI—has been that different people use the word “AI” to refer to widely different artifacts, practices, or systems, or operate on the basis of definitions or understandings which package together a range of implicit assumptions.[ref 73]

Thus, it is no surprise that AI has been subjected to a diverse range of analogies and frames. To understand potential implications of AI analogies, we can draw a taxonomy of common framings of AI (see Table 2), whereby we can distinguish between analogies that focus on: 

  1. the essence or nature of AI (what AI “is”), 
  2. AI’s operation (how AI works), 
  3. our relation to AI (how we relate to AI as subject), 
  4. AI’s societal function (how AI systems are or can be used), 
  5. AI’s impact (the unintended risks, benefits, and other side-effects of AI).

Table 2: Atlas of AI analogies, with framings and selected policy implications

Each entry below pairs a frame (with example terms) with what it emphasizes to policy actors.

Essence (terms focusing on what AI is)

- Field of science[ref 74]: Ensuring scientific best practices; improving methodologies, data sharing, and benchmark performance reporting to avoid replicability problems;[ref 75] ensuring scientific freedom and openness rather than control and secrecy.[ref 76]
- IT technology (just better algorithms, AI as a product[ref 77]): Business-as-usual; industrial applications; conventional IT sector regulation; product acquisition & procurement processes; product safety regulations.
- Information technology[ref 78]: Economic implications of increasing returns to scale and income distribution vs. distribution of consumer welfare; facilitation of communication and coordination; effects on power balances.
- Robots (cyber-physical systems,[ref 79] autonomous platforms): Physicality; embodiment; robotics; risks of physical harm;[ref 80] liability; anthropomorphism; embedment in public spaces.
- Software (AI as a service): Virtuality; digitality; cloud intelligence; open-source nature of development process; likelihood of software bugs.[ref 81]
- Black box[ref 82]: Opacity; limits to explainability of a system; risks of loss of human control and understanding; problematic lack of accountability. But also potentially de-emphasizes the human decisions and value judgments behind an algorithmic system, and presents the technology as monolithic, incomprehensible, and unalterable.[ref 83]
- Organism (artificial life): Ecological “messiness”; ethology of causes of “machine behavior” (development, evolution, mechanism, function).[ref 84]
- Brains: Applicability of terms and concepts from neuroscience; potential anthropomorphization of AI functionalities along human traits.[ref 85]
- Mind (digital minds,[ref 86] idiot savant[ref 87]): Philosophical implications; consciousness, sentience, psychology.
- Alien (shoggoth[ref 88]): Inhumanity; incomprehensibility; deception in interactions.
- Supernatural entity (god-like AI,[ref 89] demon[ref 90]): Force beyond human understanding or control.
- Intelligence technology[ref 91] (markets, bureaucracies, democracies[ref 92]): Questions of bias, principal-agent alignment, and control.
- Trick (hype): Potential of AI exaggerated; questions of unexpected or fundamental barriers to progress, friction in deployment; “hype” as smokescreen or distraction from social issues.

Operation (terms focusing on how AI works)

- Autonomous system: Different levels of autonomy; human-machine interactions; (potential) independence from “meaningful human control”; accountability & responsibility gaps.
- Complex adaptive system: Unpredictability; emergent effects; edge-case fragility; critical thresholds; “normal accidents.”[ref 93]
- Evolutionary process: Novelty, unpredictability, or creativity of outcomes;[ref 94] “perverse” solutions and reward hacking.
- Optimization process[ref 95]: Inapplicability of anthropomorphic intuitions about behavior;[ref 96] risks of the system optimizing for the wrong targets or metrics;[ref 97] Goodhart’s Law;[ref 98] risks from “reward hacking.”
- Generative system (generative AI): Potential “creativity” but also unpredictability of system; resulting “credit-blame asymmetry” whereby users are held responsible for misuses but can claim less credit for good uses, shifting workplace norms.[ref 99]
- Technology base (foundation model): Adaptability of system to different purposes; potential for downstream reuse and specialization, including for unanticipated or unintended uses; risk that errors or issues at the foundation level seep into later or more specialized (fine-tuned) models;[ref 100] questions of developer liability.
- Agent[ref 101]: Responsiveness to incentives and goals; incomplete-contracting and principal-agent problems;[ref 102] surprising, emergent, and harmful multi-agent interactions;[ref 103] systemic, delayed societal harms and diffusion of power away from humans.[ref 104]
- Pattern-matcher (autocomplete on steroids,[ref 105] stochastic parrot[ref 106]): Problems of bias; mimicry of intelligence; absence of “true understanding”; fundamental limits.
- Hidden human labor (fauxtomation[ref 107]): Potential of AI exaggerated; “hype” as a smokescreen or distraction from extractive underlying practices of human labor in AI development.

Relation (terms focusing on how we relate to AI, as (possible) subject)

- Tool (just technology, intelligent system[ref 108]): Lack of any special relation towards AI, as AI is not a subject; questions of reliability and engineering.
- Animal[ref 109]: Entities capable of some autonomous action, yet lacking the full competence or ability of humans; accordingly, potentially deserving of empathy and/or (some) rights[ref 110] or protections against abusive treatment, either on their own terms[ref 111] or in light of how abusive treatment might desensitize and affect social behavior amongst humans;[ref 112] questions of legal liability and assignment of responsibility to robots,[ref 113] especially when used in warfare.[ref 114]
- Moral patient[ref 115]: Potential moral (welfare) claims by AI, conditional on certain properties or behavior.
- Moral agent: Machine ethics; ability to encode morality or moral rules.
- Slave[ref 116]: AI systems or robots as fully owned, controlled, and directed by humans; not to be humanized or granted standing.
- Legal entity (digital person, electronic person,[ref 117] algorithmic entity[ref 118]): Potential of assigning (partial) legal personhood to AI for pragmatic reasons (e.g., economic, liability, or risks of avoiding “moral harm”), without necessarily implying deep moral claims or standing.
- Culturally revealing object (mirror to humanity,[ref 119] blurry JPEG of the web[ref 120]): Generally, implications of how AI is featured in fictional depictions and media culture.[ref 121] Directly, AI’s biases and flaws as a reflection of human or societal biases, flaws, or power relations; may also imply that any algorithmic bias derives from society rather than the technology per se.[ref 122]
- Frontier (frontier model[ref 123]): Novelty in terms of capabilities (increased capability and generality) and/or form (e.g., scale, design, or architectures) compared to other AI systems; as a result, new risks because of new opportunities for harm, and less well-established understanding by the research community. Broadly, implies danger and uncertainty but also opportunity; may imply operating within a wild, unregulated space with little organized oversight.
- Our creation (mind children[ref 124]): “Parental” or procreative duties of beneficence; humanity as good or bad “example.”
- Next evolutionary stage or successor: Macro-historical implications; transhumanist or posthumanist ethics & obligations.

Function (terms focusing on how AI is, or can be, used)

- Companion (social robots, care robots, generative chatbots, cobot[ref 125]): Human-machine interactions; questions of privacy, human over-trust, deception, and human dignity.
- Advisor (coach, recommender, therapist): Questions of predictive profiling, “algorithmic outsourcing” and autonomy, accuracy, privacy, and impact on our judgment and morals.[ref 126] Questions of patient-doctor confidentiality, as well as “AI loyalty” debates over fiduciary duties that can ensure AI advisors act in their users’ interests.[ref 127]
- Malicious actor tool (AI hacker[ref 128]): Possible misuse by criminals or terrorist actors; scaling up of attacks as well as enabling entirely new attacks or crimes.[ref 129]
- Misinformation amplifier (computational propaganda,[ref 130] deepfakes, neural fake news[ref 131]): Scaling up of online mis- and disinformation; effect on “epistemic security”;[ref 132] broader effects on democracy and electoral integrity.[ref 133]
- Vulnerable attack surface[ref 134]: Susceptibility to adversarial input, spoofing, or hacking.
- Judge[ref 135]: Questions of due process and rule of law; questions of bias and potential self-corrupting feedback loops based on data corruption.[ref 136]
- Weapon (killer robot,[ref 137] weapon of mass destruction[ref 138]): In military contexts, questions of human dignity,[ref 139] compliance with laws of war, tactical effects, strategic effects, geopolitical impacts, and proliferation rates; in civilian contexts, questions of proliferation, traceability, and risk of terror attacks.
- Critical strategic asset (nuclear weapons)[ref 140]: Geopolitical impacts; state development races; global proliferation.
- Labor enhancer (steroids,[ref 141] intelligence forklift[ref 142]): Complementarity with existing human labor and jobs; force multiplier on existing skills or jobs; possible unfair advantages & pressure on meritocratic systems.[ref 143]
- Labor substitute: Erosive to or threatening of human labor; questions of retraining, compensation, and/or economic disruption.
- New economic paradigm (fourth industrial revolution): Changes in industrial base; effects on political economy.
- Generally enabling technology (the new electricity / fire / internal combustion engine[ref 144]): Widespread usability; increasing returns to scale; ubiquity; application across sectors; industrial impacts; distributional implications; changing the value of capital vs. labor; impacting inequality.[ref 145]
- Tool of power concentration or control[ref 146]: Potential for widespread social control through surveillance, predictive profiling, and perception control.
- Tool for empowerment or resistance (emancipatory assistant[ref 147]): Potential for supporting emancipation and/or civil disobedience.[ref 148]
- Global priority for shared good: Global public good; opportunity; benefit & access sharing.

Impact (terms focusing on the unintended risks, benefits, or side-effects of AI)

- Source of unanticipated risks (algorithmic black swan[ref 149]): Prospects of diffuse societal-level harms or catastrophic tail-risk events, unlikely to be addressed by market forces; accordingly highlights paradigms of “algorithmic preparedness”[ref 150] and risk regulation more broadly.[ref 151]
- Environmental pollutant: Environmental impacts of AI supply chain;[ref 152] significant energy costs of AI training.
- Societal pollutant (toxin[ref 153]): Erosive effects of AI on quality and reliability of the online information landscape.
- Usurper of human decision-making authority: Gradual surrender of human autonomy and choice and/or control over the future.
- Generator of legal uncertainty: Driver of legal disruption to existing laws;[ref 154] driving new legal developments.
- Driver of societal value shifts: Driver of disruption to and shifts in public values;[ref 155] value erosion.
- Driver of structural incentive shifts: Driver of changes in our incentive landscape; lock-in effects; coordination problems.
- Revolutionary technology[ref 156]: Macro-historical effects; potential impact on par with agricultural or industrial revolution.
- Driver of global catastrophic or existential risk: Potential catastrophic risks from misaligned advanced AI systems or from nearer-term “prepotent” systems;[ref 157] questions of ensuring value-alignment; questions of whether to pause or halt progress towards advanced AI.[ref 158]

Different terms for AI can therefore invoke different frames of reference or analogies. Use of analogies—by policymakers, researchers, or the public—may be hard to avoid, and they can often serve as fertile intuition pumps. 

IV. The risks of unreflexive analogies 

However, while metaphors can be productive (and potentially irreducible) in technology law, they also come with many risks. Because analogies are shorthands or heuristics that compress or highlight salient features, distortions can creep in the further removed an analogy is from the specifics of the technology in question.

Indeed, as Crootof and Ard have noted, “[a]n analogy that accomplishes an immediate aim may gloss over critical distinctions in the architecture, social use, or second-order consequences of a particular technology, establishing an understanding with dangerous and long-lasting implications.”[ref 159]

Specifically: 

  1. The selection and foregrounding of a certain metaphor obscures the fact that multiple analogies are always possible for any new technology, each of which advances different “regulatory narratives.” 
  2. Analogies can be misleading by failing to capture a key trait of the technology or by alleging certain characteristics that do not actually exist. 
  3. Analogies limit our ability to understand the technology—in terms of its possibilities and limits—on its own terms.[ref 160]

The challenge is that unreflexive drawing of analogies in a legal context can lead to ineffective or even dangerous laws,[ref 161] especially once inappropriate analogies become entrenched.[ref 162]

However, even if one tries to avoid explicit analogies between AI and other technologies, apparently “neutral” definitions of AI that seek to focus solely on the technology’s “features” can and do still frame policymaking in ways that may not be neutral. For instance, Krafft and colleagues found that whereas definitions of AI that emphasize “technical functionality” are more widespread among AI researchers, definitions that emphasize “human-like performance” are more prevalent among policymakers, which they suggest might prime policymaking towards future threats.[ref 163] 

As such, it is not just loose analogies or comparisons that can affect policy but also (seemingly) specific technical or legislative terms. The framing effects of such terms occur not only at the level of broad policy debates but can also have strong legal implications. In particular, they can create challenges for law when narrowly specified regulatory definitions prove suboptimal.[ref 164]

This creates twin challenges. On the one hand, picking suitable concepts or categories can be difficult at an early stage of a technology’s development and deployment, when its impacts and limits are not always fully understood.[ref 165] At the same time, the costs of picking and locking in the wrong terms or framings within legislative texts can be significant. 

Specifically, beyond the opportunity costs of forgoing better concepts or terms, unreflexively establishing legal definitions for key terms can create the risk of downstream “governance misspecification.”[ref 166] Such misspecification can occur when regulation is originally targeted at a particular artifact or (technological) practice through a particular material scope and definition for those objects. The implicit assumption here is that the term in question is a meaningful proxy for the underlying societal or legal goals to be regulated. While that may be appropriate in many cases, there is a risk that the law becomes less efficient, ineffective, or even counterproductive if either initial misapprehension of the technology or subsequent technological developments lead to that proxy term coming apart from the legislative goals.[ref 167] Such misspecification can be seen in various cases of technology governance and regulation, including 1990s US export control thresholds for “high-performance computers” that treated the technology as far too static;[ref 168] the Outer Space Treaty’s inability to anticipate later Soviet Fractional Orbital Bombardment System (FOBS) capabilities, which could position nuclear weapons in space without, strictly speaking, putting them “in orbit”;[ref 169] or initial early-2010s regulatory responses to drones or self-driving cars, which ended up operating on under- and overinclusive definitions of these technologies.[ref 170]

Given this, the aim should not be to find the “correct” metaphor for AI systems. A better approach is to consider when and how different frames can be more useful for specific purposes, or for particular actors and/or (regulatory) agencies. Rather than aiming to come up with better analogies directly, this focuses regulatory debates on developing better processes for analogizing and for evaluating those analogies. Such processes can start from broad questions, such as: 

  1. What are the foundational metaphors used in this discussion of AI? What features do they focus on? Do these matter in the way they are presented?
  2. What other metaphors could have been chosen for these same features or aspects of AI? 
  3. What aspects or features of AI do these metaphors foreground? Do they capture these features well? 
  4. What features are occluded? What are the consequences of these being occluded?
  5. What are the regulatory implications of these different metaphors, in terms of the coalitions they enable or inhibit, the issue and solution portfolios they highlight, or how they position the technology within (or outside) the jurisdiction of existing institutions?

Improving the ways in which we analogize AI clearly needs significantly more work. It is critical that we do so, however, to ensure that—whether we are trying to understand AI itself, appreciate its impacts, or govern them effectively—our metaphors aid us rather than lead us astray.

Conclusion

As AI systems have received significant attention, many have invoked a range of diverse analogies and metaphors. This has created an urgent need to better understand (a) when we speak of AI in ways that (inadvertently) import one or more analogies, (b) what effects utilizing one or another metaphor for AI has, (c) what different analogies could be used instead for the same issue, (d) how the appropriateness of one or another metaphor is best evaluated, and (e) what, given this, might be the limits or risks of seizing on particular analogies. 

This report has aimed to contribute answers to these questions and to enable improved analysis, debate, and policymaking for AI by providing greater theoretical and empirical backing to how metaphors and analogies matter for policy. It has reviewed five pathways by which metaphors shape and affect policy and surveyed 55 analogies used to describe AI systems. This is not meant as an exhaustive overview but as a basis for future work. 

The aim here has not been to argue against the use of metaphors but for a more informed, reflexive, and careful use of them. Those who engage in debate within and beyond the field should at least have greater clarity about the ways in which these concepts are used and understood, and about the (regulatory) implications of different framings. 

The hope is that this report can contribute foundations for a more deliberate and reflexive choice over what comparisons, analogies, or metaphors we use in talking about AI—and for the ways we communicate and craft policy for these urgent questions.


Also in this series

Concepts in advanced AI governance: a literature review of key terms and definitions

Executive summary

This report provides an overview, taxonomy, and preliminary analysis of many cornerstone ideas and concepts in the emerging field of advanced AI governance. 

Aim: The aim of this report is to contribute to improved analysis, debate, and policy by providing greater clarity around core terms and concepts. Any field of study or regulation can be improved by such clarity. 

As such, this report reviews definitions for four categories of terms: the object of analysis (e.g., advanced AI), the tools for intervention (e.g., “governance” and “policy”), the reflexive definitions of the field of “advanced AI governance”, and its theories of change.

Summary: In sum, this report:

  1. Discusses three different purposes for seeking definitions for AI technology, discusses the importance of such terminology in shaping AI policy and law, and discusses potential criteria for evaluating and comparing such terms.
  2. Reviews concepts for advanced AI, covering a total of 101 definitions across 69 terms, including terms focused on:
    1. the forms of advanced AI, 
    2. the (hypothesized) pathways towards those advanced AI systems, 
    3. the technology’s large-scale societal impacts, and 
    4. particular critical capabilities that advanced AI systems are expected to achieve or enable.
  3. Reviews concepts within “AI governance”, such as nine analytical terms used to define the tools for intervention (e.g., AI strategy, policy, and governance), four terms used to characterize different approaches within the field of study, and five terms used to describe theories of change. 

The terms are summarized below in Table 1. Appendices provide detailed lists of definitions and sources for all the terms covered as well as a list of definitions for nine other auxiliary terms within the field.

Introduction

As AI systems have become increasingly capable and have had increasingly public impacts, the field that focuses on governing advanced AI systems has come into its own. 

While researchers come to this issue with many different motivations, concerns, or hopes about AI—and indeed with many different perspectives on or expectations about the technology’s future trajectory and impacts—an emerging field of researchers, policy practitioners, and activists has formed, concerned with and united by what they see as the increasingly significant and pivotal societal stakes of AI. Despite significant disagreements, many in this emerging community share the belief that shaping the transformative societal impacts of advanced AI systems is a top global priority.[ref 1] However, the field still lacks clarity regarding not only many key empirical and strategic questions but also many of the key terms it uses.

Background: This lack of clarity matters because the recent wave of progress in AI, driven especially but not exclusively by the dramatic success of large language models (LLMs), has led to an accumulation of a wide range of new terms to describe these AI systems. Yet many of these terms—such as “foundation model”,[ref 2] “generative AI”,[ref 3] or “frontier AI”[ref 4]—do not always have clear distinctions[ref 5] and are often used interchangeably.[ref 6] They moreover emerge on top of and alongside a wide range of terms, concepts, and words that have been used over past decades to refer to (potential) advanced AI systems, such as “strong AI”, “artificial general intelligence”, or “transformative AI”. What are we to make of all of these terms?

Rationale: Critically, debates over terminology in and for advanced AI are not just semantics—these terms matter. In a broad sense, framings, metaphors, analogies, and explicit definitions can strongly affect not just developmental pathways for technology but also policy agendas and the efficacy and enforceability of legal frameworks.[ref 7] Indeed, different terms have already become core to major AI governance initiatives—with “general-purpose AI” serving as one cornerstone category in the EU AI Act[ref 8] and “frontier AI models” anchoring the 2023 UK AI Safety Summit.[ref 9] The varying definitions and implications of such terms may lead to increasing contestation,[ref 10] as well they should: Extensive work over the past decade has shown how different terms for “AI” import different regulatory analogies[ref 11] and have implications for crafting legislation.[ref 12] We might expect the same to hold for the new generation of terms used to describe advanced AI and to center and focus its governance.[ref 13] 

Aim: The aim of this report is to contribute to improved analysis, debate, and policy by providing greater clarity around core terms and concepts. Any field of study or regulation can be improved by such clarity. Such literature reviews may not just contribute to a consolidation of academic work, but can also refine public and policy debates.[ref 14] Ideally, they provide foundations for a more deliberate and reflexive choice over what concepts and terms to use (and which to discard), as well as a more productive refinement of the definition and/or operationalization of cornerstone terms. 

Scope: In response, this report considers four types of terms, including potential concepts and definitions for each of the following:

  1. the core objects of analysis—and the targets for policy (i.e., what is the “advanced AI” to be governed?),
  2. the tools for intervention to be used in response (i.e., what is the range of terms such as “policy”, “governance”, or “law”?),
  3. the field or community (i.e., what are current and emerging accounts, projects, or approaches within the broader field of advanced AI governance?), and 
  4. the theories of change of this field (i.e., what is this field’s praxis?).

Disclaimers: This project comes with some important caveats for readers. 

First, this report aims to be relatively broad and inclusive of terms, framings, definitions, and analogies for (advanced) AI. In doing so, it draws from both older and recent work and from a range of sources from academic papers to white papers and technical reports to public fora. 

Second, this report is primarily concerned with mapping the conceptual landscape and with understanding the (regulatory) implications of particular terms. As such, it is less focused on policing the appropriateness or coherence of particular terms or concepts. Consequently, with regard to advanced AI it covers many terms that are still highly debated or contested or for which the meaning is unsettled. Not all the terms covered are equally widely recognized, used, or even accepted as useful in the field of AI research or within the diverse fields of the AI ethics, policy, law, and governance space. Nonetheless, this report will include many of these terms on the grounds that a broad and inclusive approach to these concepts serves best to illuminate productive future debate. After all, even if some terms are (considered to be) “outdated,” it is important to know where such terms and concepts have come from and how they have developed over time. If some terms are contested or considered “too vague,” that should precisely speak in favor of aiming to clarify their usage and relation to other terms. This will either allow the (long overdue) refinement of concepts or will at least enable an improved understanding of when certain terms are not usefully recoverable. In both cases, it will facilitate greater clarity of communication.

Third, this review is a snapshot of the state of debate at one moment. It reviews a wide range of terms, many of which have been coined recently and only some of which may have staying power. This debate has developed significantly in the last few years and will likely continue to do so. 

Fourth, this review will mostly focus on analytical definitions of or for advanced AI along four approaches.[ref 15] In so doing, it will on this occasion mostly omit detailed exploration of a fifth, normative dimension to defining AI, which would focus on reviewing especially desirable types of advanced AI systems that (in the view of some) ought to be pursued or created. Such a review would cover a range of terms such as “ethical AI”,[ref 16] “responsible AI”,[ref 17] “explainable AI”,[ref 18] “friendly AI”,[ref 19] “aligned AI”,[ref 20] “trustworthy AI”,[ref 21] “provably-safe AI”,[ref 22] “human-centered AI”,[ref 23] “green AI”,[ref 24] “cooperative AI”,[ref 25] “rights-respecting AI”,[ref 26] “predictable AI”,[ref 27] “collective intelligence”,[ref 28] and “digital plurality”,[ref 29] amongst many other terms and concepts. At present, this report will not focus in depth on surveying these terms, since only some of them were articulated in the context of or in consideration of especially advanced AI systems. However, many or all of these terms are capability-agnostic and so could clearly be extended to or reformulated for more capable, impactful, or dangerous systems. Indeed, undertaking such a deepening and extension of the taxonomy presented in this report in ways that engage more with the normative dimension of advanced AI would be very valuable future work.

Fifth, this report does not aim to definitively resolve debates—or to argue that all work should adopt one or another term over others. Different terms may work best in different contexts or for different purposes and for different actors. Indeed, given the range of actors interested in AI—whether from a technical engineering, sociotechnical, or regulatory perspective—it is not surprising that there are so many terms and such diversity in definitions even for single terms. Nonetheless, to be able to communicate effectively and learn from other fields, it helps to gain greater clarity and precision in the terms we use, whether these are terms referring to our objects of analysis, our own field and community, or our theory of action. Of course, achieving clarity on terminology is not itself sufficient. Few problems, technical or social or legal, may be solved exclusively by haggling over words. Nonetheless, a shared understanding facilitates problem solving. The point here is not to achieve full or definitive consensus but to understand disagreements and assumptions. As such, this report seeks to provide background on many terms, explore how they have been used, and consider the suitability of these terms for the field.[ref 30] In doing so, this report highlights the diversity of terms in current use and provides context for more informed future study and policymaking. 

Structure: Accordingly, this report now proceeds as follows. 

Part I provides a background to this review by discussing three purposes to defining key terms such as AI. It also discusses why the choice for one or another term matters significantly from the perspective of AI policy and regulation, and finally discusses some criteria by which to evaluate the suitability of various terms and definitions for the specific purpose of regulation. 

In Part II, this report reviews a wide range of terms for “advanced AI”, across different approaches which variably focus on (a) the anticipated forms or design of advanced AI systems, (b) the hypothesized scientific pathways towards these systems, (c) the technology’s broad societal impacts, or (d) the specific critical capabilities particular advanced AI systems are expected to achieve. 

Part III turns from the object of analysis to the field and epistemic community of advanced AI governance itself. It briefly reviews three categories of concepts of use for understanding this field. First, it surveys different terms used to describe AI “strategy”, “policy”, or “governance” as this community understands the available tools for intervention in shaping advanced AI development. It then reviews different paradigms within the field of advanced AI governance as ways in which different voices within it have defined that field. Finally, it briefly reviews recent definitions for theories of change that aim to compare and prioritize interventions into AI governance. 

Finally, three appendices list in detail all the terms and definitions offered, with sources, and offer a list of auxiliary definitions that can aid future work in this emerging field.[ref 31]

I. Defining ‘advanced AI (governance)’: Background

Any quest for clarifying definitions of “advanced AI” is complicated by the already long-running, undecided debates over how to even define the more basic terms “AI” or, indeed, “intelligence”.[ref 32] 

To properly evaluate and understand the relevance of different terms for AI, it is useful to first set out some background. In the first place, one should start by considering the purposes for which the definition is sought. Why or how do we seek definitions of “(advanced) AI”? 

1. Three purposes for definitions

Rather than trying to identify a universally best definition for AI, a more appropriate approach is to consider the implications of different definitions, or—to invert the question—to ask for what purpose we seek to define AI. We can consider (at least) three different rationales for defining a term like ‘AI’. 

  1. To build it (the technological research purpose): In the first place, AI researchers or scientists may pursue definitions of (advanced) AI by defining it from the “inside,” as a science.[ref 33] The aim of such technical definitions of AI[ref 34] is to clarify or create research-community consensus about (1) the range and disciplinary boundaries of the field—that is, what research programs and what computational techniques[ref 35] count as “AI research” (both internally and externally to research funders or users); (2) the long-range goals of the field (i.e., the technical forms of advanced AI); and/or (3) the intermediate steps the field should take or pursue (i.e., the likely pathways towards such AI). Accordingly, this definitional purpose aligns particularly closely with essence-based definitions (see Part II.1) and/or development-based definitions (see Part II.2) of advanced AI.
  2. To study it (the sociotechnical research purpose): In the second place, experts (in AI, but especially in other fields) may seek primarily to understand AI’s impacts on the world. In doing so, they may aim to define AI from the “outside,” as a sociotechnical system including its developers and maintainers.[ref 36] Such definitions or terms can aid researchers (or governments) who seek to understand the societal impacts and effects of this technology in order to diagnose or analyze the potential dynamics of AI development, diffusion, and application, as well as the long-term sociopolitical problems and opportunities. For instance, under this purpose researchers may aim to come to grips with issues such as (1) (the geopolitics or political economy of) key AI inputs (e.g., compute, data, and labor), (2) how different AI capabilities[ref 37] give rise to a spectrum of useful applications[ref 38] in diverse domains, and (3) how these applications in turn produce or support new behaviors and societal impacts.[ref 39] Accordingly, this purpose is generally better served by sociotechnical definitions of AI systems’ impacts (see Part II.3) or risk-based definitions (see Part II.4).
  3. To regulate it (the regulatory purpose): Finally, regulators or academics motivated by appropriately regulating AI—either to seize the benefits or to mitigate adverse impacts—can seek to pragmatically delineate and define (advanced) AI as a legislative and regulatory target. In this approach, definitions of AI are to serve as useful handles for law, regulation, or governance.[ref 40] In principle, this purpose can be well served by many of the definitional approaches: highly technology-specific regulations, for instance, can benefit from focusing on development-based definitions of (advanced) AI. However, in practice, regulation and governance are usually better served by focusing on the sociotechnical impacts or capabilities of AI systems.

Since it is focused on the field of “advanced AI governance,” this report will primarily focus on the second and third of these purposes. However, it is useful to keep all three in mind.

2. Why terminology matters to AI governance

Whether taking a sociotechnical perspective on the societal impacts of advanced AI or a regulatory perspective on adequately governing it, the need to pick suitable concepts and terms becomes acutely clear. Significantly, the implications and connotations of key terms matter greatly for law, policy, and governance. This is because, as reviewed in a companion report,[ref 41] distinct or competing terms for AI—with their meanings and connotations—can influence all stages of the cycle from a technology’s development to its regulation. They do so in both a broad and a narrow sense.

In the broad and preceding sense, the choice of term and definition can, explicitly or implicitly, import particular analogies or metaphors into policy debates that can strongly shape the direction—and efficacy—of the resulting policy efforts.[ref 42] These framing effects can occur even if one tries to avoid explicit analogies between AI and other technologies, since apparently “neutral” definitions of AI still focus on one or another of the technology’s “features” as the most relevant, framing policymaker perceptions and responses in ways that are not neutral, natural, or obvious. For instance, Murdick and others found that the particular definition one uses for what counts as “AI” research directly affects which (industrial or academic) metrics are used to evaluate different states’ or labs’ relative achievements or competitiveness in developing the technology—framing downstream evaluations of which nation is “ahead” in AI.[ref 43] Likewise, Krafft and colleagues found that whereas definitions of AI that emphasize “technical functionality” are more widespread among AI researchers, definitions that emphasize “human-like performance” are more prevalent among policymakers, which they suggest might prime policymaking towards future threats.[ref 44] 

Beyond the broad policy-framing impacts of technology metaphors and analogies, there is also a narrower sense in which terms matter. Specifically, within regulation, legislative and statutory definitions delineate the scope of a law and of the agency authorization to implement or enforce it[ref 45]—such that the choice for a particular term for (advanced) AI may make or break the resulting legal regime.

Generally, within legislative texts, the inclusion of particular statutory definitions can play both communicative roles (clarifying legislative intent) and performative roles (investing groups or individuals with rights or obligations).[ref 46] More practically, one can find different types of definitions that play distinct roles within regulation: (1) delimiting definitions establish the limits or boundaries on an otherwise ordinary meaning of a term, (2) extending definitions broaden a term’s meaning to expressly include elements or components that might not normally be included in its ordinary meaning, (3) narrowing definitions aim to set limits or expressly exclude particular understandings, and (4) mixed definitions use several of these approaches to clarify components.[ref 47] 

Likewise, in the context of AI law, legislative definitions for key terms such as “AI” obviously affect the material scope of the resulting regulations.[ref 48] Indeed, the effects of particular definitions have impacts on regulation not only ex ante, but also ex post: in many jurisdictions, legal terms are interpreted and applied by courts based on their widely shared “ordinary meaning.”[ref 49] This means, for instance, that regulations that refer to terms such as “advanced AI”, “frontier AI”, or “transformative AI” might not necessarily be interpreted or applied in ways that are in line with how the term is understood within expert communities. All of this underscores the importance of our choice of terms—from broad and indirect metaphors to concrete and specific legislative definitions—when grappling with the impacts of this technology on society.

Indeed, the strong legal effects of different terms mean that there can be challenges for a law when it depends on a poorly or suboptimally specified regulatory term for the forms, types, or risks from AI that the legislation means to address. This creates twin challenges. On the one hand, picking suitable concepts or categories can be difficult at an early stage of a technology’s development and deployment, when its impacts and limits are not always fully understood—the so-called Collingridge dilemma.[ref 50]

At the same time, the cost of picking and locking in the wrong terms within legislative texts can be significant. Beyond the opportunity costs, unreflexively establishing legal definitions for key terms can create the risk of downstream or later “governance misspecification.”[ref 51] 

Such governance misspecification may occur when regulation is originally targeted at a particular artifact or (technological) practice through a particular material scope and definition for those objects. The implicit assumption here is that the term in question is a meaningful proxy for the underlying societal or legal goals to be regulated. While that assumption may be appropriate and correct in many cases, there is a risk that if that assumption is wrong—either because of an initial misapprehension of the technology or because subsequent technological developments lead to that proxy term diverging from the legislative goals—the resulting technology law will be less efficient, ineffective, or even counterproductive to its purposes.[ref 52] 

Such cases of governance misspecification can be seen in various cases of technology governance and regulation. For instance: 1990s US export control thresholds for “high-performance computers” treated the technology as far too static; the Outer Space Treaty failed to anticipate later Soviet Fractional Orbital Bombardment System (FOBS) capabilities, which could position nuclear weapons in space without, strictly speaking, putting them “in orbit”; and early-2010s regulatory responses to drones and self-driving cars ended up operating on under- and overinclusive definitions of these technologies.

Thus, greater clarity in our concepts and terminology for advanced AI will be critical for crafting effective, resilient regulatory responses—and for avoiding brittle rules whose definitions are easily misspecified.

Given all the above, the aim in this report is not to find the “correct” definition or frame for advanced AI. Rather, it considers that different frames and definitions can be more useful for specific purposes or for particular actors and/or (regulatory) agencies. In that light, we can explore a series of broad starting questions, such as: 

  1. What different definitions have been proposed for advanced AI? What other terms could we choose? 
  2. What aspects of advanced AI (e.g., its form and design, the expected scientific principles of its development pathways, its societal impacts, or its critical capabilities) do these different terms focus on? 
  3. What are the regulatory implications of different definitions?

In sum, this report is premised on the idea that exploring definitions of AI (and related terms) matters, whether we are trying to understand AI, understand its impacts, or govern them effectively.

3. Criteria for definitions

Finally, we have the question of how to formulate relevant criteria for suitable terms and definitions for advanced AI. In the first place, as discussed above, this depends on one’s definitional purpose. 

Nonetheless, from the specific perspective of regulation and policymaking, what are some good criteria for evaluating suitable and operable definitions for advanced AI? Notably, Jonas Schuett has previously explored legal approaches to defining the basic term “AI”. He emphasizes that to be suitable for the purpose of governance, the choice of terms for AI should meet a series of requirements for all good legal definitions—namely that terms are neither (1) overinclusive nor (2) underinclusive and that they are (3) precise, (4) understandable, (5) practicable, and (6) flexible.[ref 59] Other criteria have been proposed: for instance, it has been suggested that an additional desideratum for a useful regulatory definition of advanced AI might be something like ex ante clarity—in the sense that the definition should allow one to assess, for a given AI model, whether it will meet the criteria for that definition (i.e., whether it will be regulated within some regime), and ideally allow this to be assessed in advance of deployment (or even development) of that model.[ref 60] Certainly, these criteria remain contested and are likely incomplete. In addition, there may be trade-offs between the criteria, such that even if they are individually acceptable, one must still strike a workable balance between them.[ref 61] 

II. Defining the object of analysis: Terms for advanced AI

Having briefly discussed the different definitional purposes, the relevance of terms for regulation, and potential criteria for evaluating definitions, this report now turns to survey the actual terminology for advanced AI. 

Within the literature and public debate, there are many terms used to refer to the conceptual cluster of AI systems that are advanced—i.e., that are sophisticated and/or are highly capable and/or could have transformative impacts on society.[ref 62] However, because of this diversity of terms, not all have featured equally strongly in governance or policy discussions. To understand and situate these terms, it is useful to compare their definitions with others and to review different approaches to defining advanced AI. 

In Schuett’s model for “legal” definitions for AI, he has distinguished four types of definitions, which focus variably on (1) the overarching term “AI”, (2) particular technical approaches in machine learning, (3) specific applications of AI, and (4) specific capabilities of AI systems (e.g., physical interaction, ability to make automated decisions, ability to make legally significant decisions).[ref 63] 

Drawing on Schuett’s framework, this report develops a similar taxonomy of common definitions of advanced AI. In doing so, it compares four approaches, each focusing on a different feature or aspect of advanced AI.

  1. The anticipated technical form or design of AI systems (essence-based approaches);
  2. The proposed scientific pathways and paradigms towards creating advanced AI (development-based approaches); 
  3. The broad societal impacts of AI systems, whatever their cognitive abilities (sociotechnical-change-based approaches);
  4. The specific critical capabilities[ref 64] that could potentially enable extreme impacts in particular domains (risk-based approaches).

Each of these approaches has a different focus, object, and motivating question (Table 2).

This report will now review these categories of approaches in turn. For each, it will broadly (1) discuss that approach’s core definitional focus and background, (2) list the terms and concepts that are characteristic of it, (3) provide some brief discussion of common themes and patterns in definitions given to these terms,[ref 65] and (4) then provide some preliminary reflections on the suitability of particular terms within this approach, as well as of the approach as a whole, to provide usable analytical or regulatory definitions for the field of advanced AI governance.[ref 66]

1. Essence-based definitions: Forms of advanced AI

Focus of approach: Classically, many definitions of advanced AI focus on the anticipated form, architecture, or design of future advanced AI systems.[ref 67] As such, these definitions focus on AI systems that instantiate particular forms of advanced intelligence,[ref 68] for instance by instantiating an “actual mind” (one that “really thinks”), by displaying a degree of autonomy, or by being human-like, general-purpose, or both in their ability to think, reason, or achieve goals across domains (see Table 3). 

Terms: The form-centric approach to defining advanced AI accordingly encompasses a variety of terms, including strong AI, autonomous machine (/ artificial) intelligence, general artificial intelligence, human-level AI, foundation model, general-purpose AI system, comprehensive AI services, artificial general intelligence, robust artificial intelligence, AI+, (machine/artificial) superintelligence, superhuman general-purpose AI, and highly-capable foundation models.[ref 69] 

Definitions and themes: While many of these terms are subject to a wide range of different definitions (see Appendix 1A), they share a range of common themes and patterns (see Table 3).

Suitability of overall definitional approach: In the context of analyzing advanced AI governance, there are both advantages and drawbacks to working with form-centric terms. First, we review five potential benefits. 

Benefit (1): Well-established and recognized terms: In the first place, using form-centric terms has the advantage that many of these terms are relatively well established and familiar.[ref 72] Out of all the terms surveyed in this report, many form-centric definitions for advanced AI, like strong AI, superintelligence, or AGI, have both the longest track record and the greatest visibility in academic and public debates around advanced AI. Moreover, while some of these terms are relatively niche to philosophical (“AI+”) or technical subcommunities (“CAIS”), many of these terms are in fact the ones used prominently by the main labs developing the most disruptive, cutting-edge AI systems.[ref 73] Prima facie, reusing these terms could avoid the need to reinvent the wheel and to build widespread awareness of, and buy-in for, newer, more niche terms. 

Benefit (2): Readily intuitive concepts: Secondly, form-centric terms evoke certain properties—such as autonomy, adaptability, and human-likeness—which, while certainly not uncontested, may be concepts that are more readily understood or intuited by the public or policymakers than would be more scientifically niche concepts. At the same time, this may also be a drawback, if the ambiguity of many of these terms opens up greater scope for misunderstanding or flawed assumptions to creep into governance debates. 

Benefit (3): Enables more forward-looking and anticipatory policymaking towards advanced AI systems and their impacts. Thirdly, because some (though not all) form-centric definitions of advanced AI relate to systems that are anticipated (or argued) to appear in the future, using these terms could help extend public attention, debate, and scrutiny to the future impacts of yet more general AI systems whose arrival, while uncertain, would likely be enormously impactful. This could make such debates and policies less reactive to each latest AI model release or incident and start laying the foundations for major policy initiatives. Indeed, centering governance analysis on form-centric terms, even if they are (seen as) futuristic or speculative, can help inform more forward-looking, anticipatory, and participatory policymaking towards the kind of AI systems (and the kind of capabilities and impacts) that may be on the horizon.[ref 74]

One caveat here is that to consider this a benefit, one must make the strong assumption that these futuristic forms of advanced AI systems are in fact feasible and likely near in development. At the same time, this approach need not presume absolute certainty over which of these forms of advanced AI can or will be developed, or on what timelines; rather, well-established risk management approaches[ref 75] can warrant some engagement with these scenarios even under uncertainty. To be clear, this need not (and should not) mean neglecting or diminishing policy attention to the impacts of existing AI systems,[ref 76] especially as these impacts are already severe and may continue to scale up as AI systems both become more widely implemented and create hazards for existing communities.

Benefit (4): Enables public debate and scrutiny of the overarching (professed) direction and destination of AI development. Fourthly, and relatedly, the above advantage of using form-centric terms could still hold even if one is very skeptical of these types of futuristic AI, because such terms afford the democratic value of allowing the public and policymakers to weigh in on the actual professed long-term goals and aspirations of many (though not all) leading AI labs.[ref 77] 

In this way, the cautious, clear, and reflexive use of terms such as AGI in policy debates could be useful even if one is very skeptical of the actual feasibility of these forms of AI (or believes they are possible but remains skeptical that they will be built anytime soon using extant approaches). This is because there is democratic and procedural value in the public and policymakers being able to hold labs to account for the goals that they in fact espouse and pursue—even if those labs turn out to be mistaken about their ability to execute on those plans (in the near term).[ref 78] This is especially the case when these are goals that the public might not (currently) agree with or condone.[ref 79] 

Using these “futuristic” terms could therefore help ground public debate over whether the development of these particular systems is even a societal goal the public condones, whether it might prefer that labs or society at large pursue different visions for society’s relation to AI technology,[ref 80] or (if these systems are indeed considered desirable and legitimate goals) what additional policies or guarantees the world should demand.[ref 81]

Benefit (5): Technology neutrality: Fifthly, the use of form-centric terms in debates can build in a degree of technology neutrality[ref 82] in policy responses, since debates need not focus on the specific engineering or scientific pathways by which one or another highly capable and impactful AI system is pursued or developed. This could make the resulting regulatory frameworks more scalable and future-proof.

At the same time, there are a range of general drawbacks to using (any of these) form-focused definitions in advanced AI governance. 

Drawback (1): Connotations and baggage around terms: In the first place, the greater familiarity of some of these terms means that many form-focused terms have become loaded with cultural baggage, associations, or connotations which may mislead, derail, or unduly politicize effective policymaking processes. In particular, many of these terms are contested and have become associated (whether justifiably or not) with particular views or agendas towards building these systems.[ref 83] This is a problem because, as discussed previously, the use of different metaphors, frames, and analogies may be irreducible in (and potentially even essential to) the ways that the public and policymakers make sense of regulatory responses. Yet different analogies—and especially the unreflexive use of terms—also have limits and drawbacks and create risks of inappropriate regulatory responses.[ref 84]

Drawback (2): Significant variance in prominence of terms and constant turnover: In the second place, while some of these terms have held currency at different times in recent decades, many do not see equally common use or recognition in modern debates. For instance, terms such as “strong AI”, which dominated early philosophical debates, appear to have fallen slightly out of favor in recent years[ref 85] as the emergence and impact of foundation models generally, and generative AI systems specifically, has revived significantly greater attention to terms such as “AGI”. This churn or turnover in definitions suggests that it may not be wise to attempt to pin down a single term or definition right now, since analyses that focus on one particular anticipated form of advanced AI may be more likely to be rendered obsolete. At the same time, this is likely to be a general problem with any concepts or terminology chosen.

Drawback (3): Contested terms, seen as speculative or futuristic: In the third place, while some form-centric terms (such as “GPAIS” or “foundation model”) have been well established in AI policy debates or processes, others, such as “AGI”, “strong AI”, or “superintelligence”, are more future-oriented, referring to advanced AI systems that do not (yet) exist.[ref 86] Consequently, many of these terms are contested and seen as futuristic and speculative. This perception may be a challenge, because even if it is incorrect (e.g., such that particular systems like “AGI” will in fact be developed within short timelines or are even in some sense “already here”[ref 87]), the mere perception that a technology or term is far-off or “speculative” can serve to inhibit and delay effective regulatory or policy action.[ref 88] 

A related but converse risk of using future-oriented terms for advanced AI policy is that it may inadvertently import a degree of technological determinism[ref 89] into public and policy discussions, as it could imply that particular forms or architectures of advanced AI (“AGI”, “strong AI”) are not just possible but inevitable—thereby shifting public and policy discussions away from the question of whether we should (or can safely) develop these systems (rather than other, more beneficial architectures)[ref 90] towards less ambitious questions over how we should best (safely) reckon with the arrival or development of these technologies.

In response, this drawback could be somewhat mitigated by relying on terms for the forms of advanced AI—such as GPAIS or highly-capable foundation models—that are (a) more present-focused, while (b) not putting any strong presumed ceilings on the capabilities of the systems.

Drawback (4): Definitional ambiguity: In the fourth place, many of these terms, and especially future-oriented terms such as “strong AI”, “AGI”, and “human-level AI”, suffer from definitional ambiguity in that they are used both inconsistently and interchangeably with one another.[ref 91] 

Of course, just because there is no settled or uncontested definition for a term such as “AGI” does not make it prima facie unsuitable for policy or public debate. By analogy, the fact that there can be definitional ambiguity over the content or boundaries of concepts such as “the environment” or “energy” does not render “environmental policy” or “energy policy” meaningless categories or irrelevant frameworks for regulation.[ref 92] Nor indeed does outstanding definitional debate mean that any given term, such as AGI, is “meaningless.”[ref 93] 

Nonetheless, the sheer range of contesting definitions for many of these concepts may reflect an underlying degree of disciplinary or philosophical confusion, or at least suggest that, barring greater conceptual clarification and operationalization,[ref 94] these terms will lead to continued disagreement. Accordingly, anchoring advanced AI governance to broad terms such as “AGI” may make it harder to articulate appropriately scoped legal obligations for specific actors that will not end up being over- or underinclusive.[ref 95] 

Drawback (5): Challenges in measurement and evaluation: In the fifth place, an underlying and related challenge for the form-centric approach is that (in part due to these definitional disagreements and in part due to deeper reasons) it faces challenges around how to measure or operationalize (progress towards) advanced AI systems. 

This matters because effective regulation or governance—especially at the international level[ref 96]—often requires (scientific and political) consensus around key empirical questions, such as when and how we can know that a certain AI system truly achieves some of the core features (e.g., autonomy, agency, generality, and human-likeness) that are crucial to a given term or concept. In practice, AI researchers often attempt to measure such traits by evaluating an AI system’s ability to pass one or more specific benchmark tests (e.g., the Turing test, the Employment test, the SAT, etc.).[ref 97] 

However, such testing approaches have many flaws or challenges.[ref 98] At the practical level, there have been problems with how tests are applied and scored[ref 99] and how their results are reported.[ref 100] Underlying this is the challenge that the way in which some common AI performance tests are constructed may emphasize nonlinear or discontinuous metrics, which can create an overly strong impression that some model skills are “suddenly” emergent properties (rather than smoothly improving capabilities).[ref 101] More fundamentally, there have been challenges to the meaningfulness of applying human-centric tests (such as the bar exam) to AI systems[ref 102] and indeed deeper critiques of the construct validity of leading benchmark tests in terms of whether they actually are indicative of progress towards flexible and generalizable AI systems.[ref 103] 
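
To make the point about discontinuous metrics concrete, consider a toy model (purely illustrative, not drawn from the cited studies): suppose a system’s per-token accuracy improves smoothly with scale, but the benchmark only awards credit for an exact multi-token match. The capability curve and answer length below are invented for illustration.

```python
# Toy illustration (hypothetical numbers): a smoothly improving per-token
# accuracy looks "emergent" under a discontinuous exact-match metric,
# because all tokens of an answer must be correct simultaneously.

def per_token_accuracy(scale: float) -> float:
    """Hypothetical smooth capability curve: rises gradually with scale."""
    return min(0.99, 0.5 + 0.05 * scale)

def exact_match_score(scale: float, answer_length: int = 20) -> float:
    """Exact-match metric: probability that all tokens are correct at once."""
    return per_token_accuracy(scale) ** answer_length

for scale in range(10):
    p = per_token_accuracy(scale)
    em = exact_match_score(scale)
    print(f"scale={scale}: per-token={p:.2f}, exact-match={em:.6f}")
```

The per-token curve rises by a steady 0.05 per step, yet the exact-match score stays near zero for most of the range and only climbs steeply at the end—an apparent “sudden” emergence produced entirely by the choice of metric.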

Of course, that does not mean that there may not be further scientific progress towards the operationalization of useful tests for understanding when particular forms of advanced AI such as AGI have been achieved.[ref 104] Nor is it to suggest that benchmark and evaluation challenges are unique to form-centric definitions of AI—indeed, they may also challenge many approaches focused on specific capabilities of advanced AIs.[ref 105] However, the extant challenges over the operationalization of useful tests mean that overreliance on these terms could muddle debates and inhibit consensus over whether a particular advanced system is within reach (or already being deployed). 

Drawback (6): Overt focus on technical achievement of particular forms may make this approach underinclusive of societal impacts or capabilities: In the sixth place, the focus of future-oriented form-centric approaches on the realization of one or another type of advanced AI system (“AGI”, “human-level AI”), might be adequate if the purpose for our definitions is for technical research.[ref 106] However, for those whose definitional purpose is to understand AI’s societal impacts (sociotechnical research) or to appropriately regulate AI (regulatory), many form-centric terms may miss the point. 

This is because what matters from the perspective of human and societal safety, welfare, and well-being—and from the perspective of law and regulation[ref 107]—is not the achievement of some fully general capacity in any individual system but rather overall sociotechnical impacts or the emergence of key dangerous capabilities—even if they derive from systems that are not yet (fully) general[ref 108] or that develop dangerous emergent capabilities that are not human-like.[ref 109] Given all this, there is a risk that taking a solely form-centric approach leaves advanced AI governance vulnerable to a version of the “AI effect,” whereby “real AGI” is always conceived of as being around the corner but rarely as a system already in production. 

Suitability of different terms within approach: Given the above, if one does aim to draw on this approach, it may be worth considering which terms capture the strengths of this approach while avoiding some of its pitfalls. In this view, the terms “GPAIS” or “foundation model” may be more suitable in many contexts, as they are recognized as categories of (increasingly) general and competent AI systems of which some versions already exist today. In particular, because (versions of) these terms are already used in ongoing policy debates, they could provide better regulatory handles for governing the development of advanced AI—for instance by their relation to the complex supply chain of modern AI development that contains both upstream and downstream developers and users.[ref 110] Moreover, these terms do not presume a ceiling on the system’s capabilities; accordingly, concepts such as “highly-capable foundation model”,[ref 111] “extremely capable foundation model”, or “threshold foundation model” could help policy debates be cognizant of the growing capabilities of these systems while still being more easily understandable for policymakers.[ref 112]

2. Development-based definitions: Pathways towards advanced AI

Focus of approach: A second cluster of terms focuses on the anticipated or hypothesized scientific pathways or paradigms that could be used to create advanced AI systems. Notably, the goal or target of these pathways is often to build “AGI”-like systems.[ref 113] 

Notes and caveats: Any discussion of proposed pathways towards advanced AI has a number of important caveats. In the first place, many of these proposed paradigms have long been controversial, with pervasive and ongoing disagreement about their scientific foundations and feasibility as paths towards advanced AI (or in particular as paths towards particular forms of advanced AI, such as AGI).[ref 114] Secondly, these approaches are not necessarily mutually exclusive, and indeed many labs combine elements from several in their research.[ref 115] Thirdly, because the relative and absolute prominence and popularity of many of these paradigms have fluctuated over time and because there are often, as in any scientific field, significant disciplinary gulfs between paradigms, there is highly unequal treatment of these pathways and terms. As such, whereas some paradigms (such as the scaling, reinforcement-learning, and, to some extent, brain-inspired approaches) are reasonably widely known, many of the other approaches and terms listed (such as “seed AI”) may be relatively unknown or even very obscure within the modern mainstream machine learning (ML) community.[ref 116] 

Other taxonomies: There have been various other such attempts to create taxonomies of the main theorized pathways that have been proposed to build or implement advanced AI. For instance, Goertzel and Pennachin have defined four different approaches to creating “AGI”, which to different degrees draw on lessons from the (human) brain or mind.[ref 117] More recently, Hannas and others have drawn on this framework and extended it to five theoretical pathways towards “general AI”.[ref 118] 

Further extending such frameworks, one can distinguish at least 11 proposed pathways towards advanced AI (see Table 4).

Terms: Many of these paradigms or proposed pathways towards advanced AI come with their own assorted terms and definitions (see Appendix 1B). These terms include amongst others de novo AGI, prosaic AGI, frontier (AI) model [compute threshold], [AGI] from evolution, [AGI] from powerful reinforcement learning agents, powerful deep learning models, seed AI, neuroAI, brain-like AGI, neuromorphic AGI, whole-brain emulation, brain-computer interface, [advanced AI based on] a sophisticated embodied agent, or hybrid AI (see Table 4).

Definitions: As noted, these terms can be mapped on 11 proposed pathways towards advanced AI, with their own terms for the resulting advanced AI systems. 

Notably, there are significant differences in the prominence of these approaches—and the resources dedicated to them—at different frontier AI labs today. For instance, while some early work on the governance of advanced AI systems focused on AI systems that would (presumably) be built from first principles, bootstrapping,[ref 121] or neuro-emulated approaches (see Table 4), much of such work has more recently shifted to focus on understanding the risks from and pathways to aligning and governing advanced AI systems created through computational scaling. 

This follows high-profile trends in leading AI labs. While (as discussed above) many research labs are not dedicated to a single paradigm, the last few years (and 2023 in particular) have seen a significant share of resources going towards computational scaling approaches, which have yielded remarkably robust (though not uncontested) performance improvements.[ref 122] As a result, the scaling approach has been prominent in informing the approaches of labs such as OpenAI,[ref 123] Anthropic,[ref 124] DeepMind,[ref 125] and Google Brain (now merged into Google DeepMind).[ref 126] This approach has also been prominent (though somewhat lagging) in some Chinese labs such as Baidu, Alibaba, Tencent, and the Beijing Institute for General Artificial Intelligence.[ref 127] Nonetheless, other approaches continue to be in use. For instance, neuro-inspired approaches have been prominent in DeepMind,[ref 128] Meta AI Research,[ref 129] and some Chinese[ref 130] and Japanese labs,[ref 131] and modular cognitive architecture approaches have informed the work by Goertzel’s OpenCog project,[ref 132] amongst others. 

Suitability of overall definitional approach: In the context of analyzing advanced AI governance, there are both advantages and drawbacks to using concepts that focus on pathways of development. 

Amongst the advantages of this approach are:

Benefit (1): Close(r) grounding in actual technical research agendas aimed at advanced AI: Defining advanced AI systems according to their (envisioned) development pathways has the benefit of keeping advanced AI governance debates more closely grounded in existing technical research agendas and programs, rather than the often more philosophical or ambiguous debates over the expected forms of advanced AI systems. 

Benefit (2): Technological specificity allowing scoping of regulation to approaches of concern: Relatedly, this also allows better regulatory scoping of the systems of concern. After all, the past decade has seen a huge variety amongst AI techniques and approaches, not just in terms of their efficacy but also in terms of the issues they raise, with particular technical approaches raising distinct (safety, interpretability, robustness) issues.[ref 133] At the same time, these correlations might be less relevant in the last few years given the success of scaling-based approaches at creating remarkably versatile and general-purpose systems. 

However, taking the pathways-focused approach to defining advanced AI has its own challenges:

Drawback (1): Brittleness as technological specificity imports assumptions about pathways towards advanced AI: The pathway-centric approach may import strong assumptions about what the relevant pathways towards advanced AI are. As such, governance on this basis may not be robust to ongoing changes or shifts in the field.

Drawback (2): Limited suitability for sociotechnical or regulatory purposes: Given this, development-based definitions of pathways towards advanced AI seem particularly valuable if the purpose of definition is technical research but may be less relevant if the purpose is sociotechnical analysis or regulation. Technical definitions of AI might therefore provide an important baseline or touchstone for analysis in many other disciplines, but they may not be fully sufficient or analytically enlightening for many fields of study dealing with the societal consequences of the technology’s application or with avenues for governing these. 

At any rate, one interesting feature of development-based definitions of advanced AI is that the choice of approach (and term) to focus on has significant and obvious downstream implications for framing the policy agendas for advanced AI—in terms of the policy issues to address, the regulatory “surface” of advanced AI (e.g., the necessary inputs or resources to pursue research along a certain pathway), and the most feasible or appropriate tools. For instance, a focus on neuro-integrationist-produced brain-computer interfaces suggests that policy issues for advanced AI will center less on questions of value alignment[ref 134] and more on (biomedical) questions of human consent, liability, privacy, (employer) neurosurveillance,[ref 135] and/or morphological freedom.[ref 136] A focus on embodiment-based approaches towards robotic agents raises more established debates from robot law.[ref 137] Conversely, if one expects that the pathway towards advanced AI still requires underlying scientific breakthroughs, either from first principles or through a hybrid approach, this would imply that very powerful AI systems could be developed suddenly, by small teams or labs that lack large compute budgets.

Similarly, focusing on scaling-based approaches—which seems most suitable given the prominence and success of this approach in driving the recent wave of AI progress—leads to a “compute-based” perspective on the impacts of advanced AI.[ref 138] This suggests that the key tools and levers for effective governance should focus on compute governance—provided we assume that large-scale compute will remain a relevant or feasible precondition for developing frontier AI. For instance, such an approach underpins the compute-threshold definition for frontier AI, which defines advanced AI with reference to particular technical elements or inputs (such as a compute usage or FLOP threshold, dataset size, or parameter count) used in its development.[ref 139] While a useful referent, this may be an unstable proxy, given that it may not reliably or stably correspond to the particular capabilities of concern.
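
As a sketch of how such an input-based definition could be operationalized (the specific numbers here are assumptions: the common ~6 × parameters × tokens approximation for dense-transformer training compute, and a hypothetical 10^26 FLOP cutoff chosen purely for illustration):

```python
# Illustrative sketch of a compute-threshold definition (not from this report).
# Training compute is estimated with the rough heuristic
#   FLOP ~ 6 * parameter_count * training_tokens (dense transformers),
# and compared against a hypothetical regulatory cutoff.

FLOP_THRESHOLD = 1e26  # hypothetical cutoff, chosen for illustration

def estimated_training_flop(parameters: float, training_tokens: float) -> float:
    """Rough estimate of total training compute for a dense transformer."""
    return 6 * parameters * training_tokens

def is_covered_model(parameters: float, training_tokens: float) -> bool:
    """Apply the compute-threshold definition: is this model in scope?"""
    return estimated_training_flop(parameters, training_tokens) >= FLOP_THRESHOLD

# A 70B-parameter model trained on 2T tokens: ~8.4e23 FLOP, below the cutoff.
print(is_covered_model(70e9, 2e12))  # prints: False
```

One attraction of such a definition is ex ante clarity: both inputs are known to the developer before deployment, so coverage can in principle be assessed in advance—though, as noted, compute remains an imperfect proxy for the capabilities of concern.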

3. Sociotechnical-change based definitions: Societal impacts of advanced AI

Focus of approach: A third cluster of definitions in advanced AI governance mostly brackets out philosophical questions about the precise form of AI systems and engineering questions about the scientific pathways towards their development. Rather, it aims to define advanced AI in terms of different levels of societal impact.

Many concepts in this approach have emerged from scholarship that aimed to abstract away from these architectural questions and rather explore the aggregate societal impacts of advanced AI. This includes work on AI technology’s international, geopolitical impacts[ref 140] as well as work on identifying relevant historical precedents for the technology’s societal impacts, strategic stakes, and political economy.[ref 141] Examples of this work are those that identified novel categories of unintended “structural” risks from AI as distinct from “misuse” or “accident” risks,[ref 142] or taxonomies of the different “problem logics” created by AI systems.[ref 143]

Terms: The societal-impact-centric approach to defining advanced AI includes a variety of terms, including: (strategic) general-purpose technology, general-purpose military transformation, transformative AI, radically transformative AI, AGI (economic competitiveness definition), and machine superintelligence.

Definitions and themes: While many of these terms are subject to a wide range of different definitions (see Appendix 1C), they again feature a range of common themes or patterns (see Table 5).

Suitability of approach: Concepts within the sociotechnical-change-based approach may be unsuitable in certain contexts.

4. Risk-based definitions: Critical capabilities of advanced AI

Focus of approach: Finally, a fourth cluster of terms follows a risk-based approach and focuses on the critical capabilities that certain types of advanced AI systems (whatever their underlying form or scientific architecture) might achieve or enable for human users. The development of such capabilities could then mark key thresholds or inflection points in the trajectory of society. 

Other taxonomies: Work focused on the significant potential impacts or risks of advanced AI systems is of course hardly new.[ref 150] Yet in the past years, as AI capabilities have progressed, there has been renewed and growing concern that these advances are beginning to create key threshold moments where sophisticated AI systems develop capabilities that allow them to achieve or enable highly disruptive impacts in particular domains, resulting in significant societal risks. These risks may be as diverse as the capabilities in question—and indeed discussions of these risks do not always or even mostly presume (as do many form-centric approaches) the development of general capabilities in AI.[ref 151] For instance, many argue that existing AI systems may already contribute to catastrophic risks in various domains:[ref 152] large language models (LLMs) and automated biological design tools (BDTs) may already be used to enable weaponization and misuse of biological agents,[ref 153] the military use of AI systems in diverse roles may inadvertently affect strategic stability and contribute to the risk of nuclear escalation,[ref 154] and existing AI systems’ use in enabling granular and at-scale monitoring and surveillance[ref 155] may already be sufficient to contribute to the rise of “digital authoritarianism”[ref 156] or “AI-tocracy”[ref 157], to give a few examples. 

As AI systems become increasingly advanced, they may steadily and increasingly achieve or enable further critical capabilities in different domains that could be of special significance. Indeed, as leading LLM-based AI systems have advanced in their general-purpose abilities, they have frequently demonstrated emergent abilities that are surprising even to their developers.[ref 158] This has led to growing concern that as these models continue to be scaled up[ref 159] some next generation of these systems could develop unexpected but highly dangerous capabilities if not cautiously evaluated.[ref 160] 

What are these critical capabilities?[ref 161] In some existing taxonomies, critical capabilities could include AI systems reaching key levels of performance in domains such as cyber-offense, deception, persuasion and manipulation, political strategy, building or gaining access to weapons, long-horizon planning, building new AI systems, situational awareness, self-proliferation, censorship, or surveillance,[ref 162] amongst others. Other experts have been concerned about cases where AI systems display increasing tendencies and aptitudes towards controlling or power-seeking behavior.[ref 163] Other overviews identify other sets of hazardous capabilities.[ref 164] In all these cases, the concern is that advanced AI systems that achieve these capabilities (regardless of whether they are fully general, autonomous, etc.) could enable catastrophic misuse by human owners, or could demonstrate unexpected extreme—even hazardous—behavior, even against the intentions of their human principals. 

Terms: Within the risk-based approach, there is a range of domains that critical capabilities could disrupt. A brief survey (see Table 6) identifies at least eight such capability domains—moral/philosophical, economic, legal, scientific, strategic or military, political, exponential, and (extremely) dangerous.[ref 165] Namely, these include:[ref 166]

Definitions and themes: As noted, many of these terms have different definitions (see Appendix 1D). Nonetheless, a range of common themes and patterns can be distilled (see Table 6).

Suitability of approach: There are a range of benefits and drawbacks to defining advanced AI systems by their (critical) capabilities. These include (in no particular order): 

Benefit (1): Focuses on the key capability development points of most concern: A first benefit of adopting the risk-based definitional approach is that these concepts can be used, alone or in combination, to focus on the key thresholds or transition points in AI development that we most care about—not the eventual, aggregate long-range societal outcomes, nor the (eventual) “final” form of advanced AI, but rather the key intermediate (technical) capabilities that would suffice to create (or enable actors to achieve) significant societal impacts: the points of no return.

Benefit (2): Highlighting risks and capabilities can more precisely inform public understanding: Ensuring that terms for advanced AI systems clearly center on particular risks or capabilities can help the public and policymakers understand the risks or challenges to be avoided, in a way that is far clearer than terms that focus on very general abilities or that are highly technical (i.e., terms within essence- or development-based approaches, respectively). Such terms may also help the public compare the risks of one model to those posed by another.[ref 169]

Benefit (3): Generally (but not universally) clearer or more concrete: While some terms within this approach are quite vague (e.g., “singleton”) or potentially difficult to operationalize or test for (e.g., “artificial consciousness”), some of the more specific and narrow terms within this approach could offer more clarity, and less definitional drift, to regulation. While many of them would need significant further clarification before they could be suitable for use in legislative texts (whether domestic laws or international treaties), they may offer the basis for more circumscribed, tightly defined professional cornerstone concepts for such regulation.[ref 170]

However, there are also a number of potential drawbacks to risk-based definitions.

Drawback (1): Epistemic challenges around “unknown unknown” critical capabilities: One general challenge to this risk-based approach for characterizing advanced AI is that, in the absence of more specific and empirical work, it can be hard to identify and enumerate all relevant risk capabilities in advance (or to know that we have done so). Indeed, aiming to exhaustively list all key capabilities to watch for may be a futile exercise.[ref 171] At the same time, this is a challenge arguably faced in any domain of (technology) risk mitigation, and it does not mean that conducting such analysis to the best of our abilities is without value. However, this challenge does create an additional hurdle for regulation, as it heightens the chance that if the risk profile of the technology rapidly changes, regulators or existing legal frameworks will be unsure of how or where to classify a given model.

Drawback (2): Challenges around comparing or prioritizing between risk capabilities: A related challenge lies in the difficulty of knowing which (potential) capabilities to prioritize for regulation and policy. However, that need not be a general argument against this approach. Instead, it may simply help us make explicit the normative and ethical debates over what challenges to avoid and prioritize.

Drawback (3): Utilizing many parallel terms focused on different risks can increase confusion: One risk for this approach is that while the use of many different terms for advanced AI systems, depending on their specific critical capabilities in particular domains, can make for more appropriate and context-sensitive discussions (and regulation) within those domains, at an aggregate level this may increase the range of terms that regulators and the public have to reckon with and compare between—with the risk that these actors simply drown in the proliferation of terms.

Drawback (4): Outstanding disagreements over appropriate operationalization of capabilities: One further challenge with these terms may lie in the way that some key terms remain contested or debated—and even clearer terms are not without challenge. For instance, in 2023 the concept of “frontier model” became subject to increasing debate over its potential adequacy for regulation.[ref 172] Notably, there are at least three ways of operationalizing this concept. The first, a computational threshold, has been discussed above.[ref 173]

However, a second operationalization of frontier AI focuses on some relative-capabilities threshold. This approach includes recent proposals to define “frontier AI models” in terms of capabilities relative to other AI systems,[ref 174] as models that “exceed the capabilities currently present in the most advanced existing models” or as “models that are both (a) close to, or exceeding, the average capabilities of the most capable existing models.”[ref 175] Taking such a comparative approach to defining advanced AI may be useful in combating the easy tendency of observers to normalize or become used to the rapid pace of AI capability progress.[ref 176] Yet a comparative approach carries risks, especially when tied to a moving wavefront of “the most capable” existing models. It could force regulators into constant regulatory updating, and it risks being underinclusive of foundation models that did not display hazardous capabilities in their initial evaluations but which, once deployed or shared, might be reused or recombined in ways that create or enable significant harms.[ref 177] The risk of embedding this definition of frontier AI in regulation would be to leave a regulatory gap around significantly harmful capabilities, especially those that are no longer at the technical “frontier” but remain unaddressed even so. Indeed, for similar reasons, Seger and others have advocated using the concept “highly-capable foundation models” instead.[ref 178]

A third approach to defining frontier AI models has instead focused on identifying a set of static, absolute criteria grounded in particular dangerous capabilities (i.e., a dangerous-capabilities threshold). Such definitions might be useful insofar as they help regulators or consumers better identify when a model crosses a safety threshold, in a way that is less susceptible to slippage or change over time. This could make such concepts more suitable (and resulting regulations less at risk of obsolescence or governance misspecification) than operationalizations of “frontier AI model” that rely on indirect technological metrics (such as compute thresholds) as proxies for these capabilities. Even so, as discussed above, anchoring the “frontier AI model” concept on particular dangerous capabilities leaves open questions around how best to operationalize and create evaluation suites that can identify or predict such capabilities ex ante.

Given this, while the risk-based approach may be the most promising ground for defining advanced AI systems from a regulatory perspective, it is clear that not all terms in use in this approach are equally suitable, and many require further operationalization and clarification.

III. Defining the advanced AI governance epistemic community

Beyond the object of concern of “advanced AI” (in all its diverse forms), researchers in the emerging field concerned with the impacts and risks of advanced AI systems have begun to specify a range of other terms and concepts: terms relating to the tools for intervening in and on the development of advanced AI systems in socially beneficial ways, terms by which this community’s members conceive of the overarching approach or constitution of their field, and theories of change.

1. Defining the tools for policy intervention

First, those writing about the risks and regulation of AI have proposed a range of terms describing the tools, practices, or nature of governance interventions that could be used in response (see Table 7).

Like the term “advanced AI,” these terms set out objects of study by scoping the practices or tools of AI governance. They matter insofar as they connect analysis to concrete tools for intervention.

Nonetheless, these terms do not capture the methodological dimension of how different approaches to advanced AI governance have approached these issues—nor the normative question of why different research communities have been driven to focus on the challenges from advanced AI in the first place.[ref 180]

2. Defining the field of practice: Paradigms

Thus, we can next consider different ways that practitioners have defined the field of advanced AI governance.[ref 181] Researchers have used a range of terms to describe the field of study that focuses on understanding the trajectory towards, forms of, and impacts of advanced AI, and how to shape these. While these terms have significant overlaps in practice, it is useful to distinguish some key terms or framings of the overall project (Table 8).

However, while these terms reflect differences in focus, emphasis, and normative commitment, this need not preclude an overall holistic approach. To be sure, work and researchers in this space often hold diverse expectations about the trajectory, form, or risks of future AI technologies; diverse normative commitments and motivations for studying these; and distinct research methodologies given their varied disciplinary backgrounds and epistemic precommitments.[ref 184] Even so, many of these communities remain united by a shared perception of the technology’s stakes—the shared view that shaping the impacts of AI is and should be a significant global priority.[ref 185]

As such, one takeaway here is not that scholars or researchers need pick any one of these approaches or conceptions of the field. Rather, there is a significant need for any advanced AI governance field to maintain a holistic approach, which includes many distinct motivations and methodologies. As suggested by Dafoe, 

“AI governance would do well to emphasize scalable governance: work and solutions to pressing challenges which will also be relevant to future extreme challenges. Given all this potential common interest, the field of AI governance should be inclusive to heterogenous motivations and perspectives. A holistic sensibility is more likely to appreciate that the missing puzzle pieces for any particular challenge could be found scattered throughout many disciplinary domains and policy areas.”[ref 186] 

In this light, one might consider and frame advanced AI governance as an inclusive and holistic field, concerned with, broadly, “the study and shaping of local and global governance systems—including norms, policies, laws, processes, and institutions—that affect the research, development, deployment, and use of existing and future AI systems, in ways that help the world choose the role of advanced AI systems in its future, and navigate the transition to that world.”

3. Defining theories of change

Finally, researchers in this field have been concerned not just with studying and understanding the strategic parameters of the development of advanced AI systems,[ref 187] but also with considering ways to intervene upon it, given particular assumptions or views about the form, trajectory, societal impacts, or risky capabilities of this technology.

Thus, various researchers have defined terms that aim to capture the connection between immediate interventions or policy proposals, and the eventual goals they are meant to secure (see Table 9).

Drawing on these terms, one might also articulate new terms that incorporate elements from the above.[ref 196] For instance, one could define a “strategic approach” as a cluster of correlated views on advanced AI governance, encompassing (1) broadly shared assumptions about the key technical and governance parameters of the challenge; (2) a broad theory of victory and impact story about what solving this problem would look like; (3) a broadly shared view of history, with historical analogies to provide comparison, grounding, inspiration, or guidance; and (4) a set of intermediate strategic goals to be pursued, giving rise to near-term interventions that would contribute to reaching these.

Conclusion

The community focused on governing advanced AI systems has developed a rich and growing body of work. However, it has often lacked clarity, not only regarding many key empirical and strategic questions, but also regarding many of its fundamental terms. This includes different definitions for the relevant object of analysis—that is, species of “advanced AI”—as well as different framings for the instruments of policy, different paradigms or approaches to the field itself, and distinct understandings of what it means to have a theory of change to guide action. 

This report has reviewed a range of terms for different analytical categories in the field. It has discussed three different purposes for seeking definitions for core terms, and why and how (under a “regulatory” purpose) the choice of terms matters to both the study and practice of AI governance. It then reviewed analytical definitions of advanced AI used across different clusters, which focus on the forms or design of advanced AI systems, the (hypothesized) scientific pathways towards developing these systems, the technology’s broad societal impacts, and the specific critical capabilities achieved by particular AI systems. The report then briefly reviewed analytical definitions of the tools for intervention, such as “policy” and “governance,” before discussing definitions of the field and community itself and definitions for theories of change by which to prioritize interventions.

The field of advanced AI governance has shown a penchant for generating many concepts, with many competing definitions. Of course, while any emerging field will necessarily engage in a struggle to define itself, this field has seen a particularly broad range of terms, perhaps reflecting its disciplinary range. Eventually, the community may need to more intentionally and deliberately commit to some terms. In the meantime, those who engage in debate within and beyond the field should at least have greater clarity about the ways that these concepts are used and understood, and about the (regulatory) implications of some of these terms. This report has aimed to provide that clarity, in order to give greater context for more informed discussions about questions in and around the field.

Appendix 1: Lists of definitions for advanced AI terms

This appendix provides a detailed list of definitions for advanced AI systems, with sources. These may be helpful for readers to explore work in this field in more detail; to understand the longer history and evolution of many terms; and to consider the strengths and drawbacks of particular terms, and of specific language, for use in public debate, policy formulation, or even in direct legislative texts.

1.A. Definitions focused on the form of advanced AI

Different definitional approaches emphasize distinct aspects or traits that would characterize the form of advanced AI systems—such as that they are ‘mind-like’, perform ‘autonomously’, are ‘general-purpose’, ‘perform like a human’, ‘perform general-purpose tasks like a human’, etc. However, it should be noted that there is significant overlap, and many of these terms are often (whether or not correctly) used interchangeably.

Advanced AI is mind-like & really thinks

Advanced AI is autonomous

General artificial intelligence: “broadly capable AI that functions autonomously in novel circumstances”.[ref 203]

Advanced AI is human-like

Advanced AI is general-purpose 

“asymptotically recursive improvement of AI technologies in distributed systems [which] contrasts sharply with the vision of self-improvement internal to opaque, unitary agents. […] asymptotically comprehensive, superintelligent-level AI services that—crucially—can include the service of developing new services, both narrow and broad, [yielding] a model of flexible, general intelligence in which agents are a class of service-providing products, rather than a natural or necessary engine of progress in themselves.”[ref 214]

Advanced AI is general-purpose & of human-level performance

Robust artificial intelligence: “intelligence that, while not necessarily superhuman or self-improving, can be counted on to apply what it knows to a wide range of problems in a systematic and reliable way, synthesizing knowledge from a variety of sources such that it can reason flexibly and dynamically about the world, transferring what it learns in one context to another, in the way that we would expect of an ordinary adult.”[ref 236]

Advanced AI is general-purpose & beyond-human-performance

1.B. Definitions focused on the pathways towards advanced AI

First-principles pathways: “De novo AGI”

Pathways based on new fundamental insights in computer science, mathematics, algorithms, or software, producing advanced AI systems that may, but need not mimic human cognition.[ref 248]

Scaling pathways: “Prosaic AGI”, “frontier (AI) model” [compute threshold]

Approaches based on “brute forcing” advanced AI,[ref 250] by running (one or more) existing AI approaches (such as transformer-based LLMs)[ref 251] with increasingly more computing power and/or training data, as per the “scaling hypothesis.”[ref 252]

Evolutionary pathways: “[AGI] from evolution”

Approaches based on algorithms competing to mimic the evolutionary brute search process that produced human intelligence.[ref 257]

Reward-based pathways: “[AGI] from powerful reinforcement learning agents”, “powerful deep learning models”

Approaches based on running reinforcement learning systems with simple rewards in rich environments.

Powerful deep learning models: “a powerful neural network model [trained] to simultaneously master a wide variety of challenging tasks (e.g. software development, novel-writing, game play, forecasting, etc.) by using reinforcement learning on human feedback and other metrics of performance.”[ref 260]

Bootstrapping pathways:[ref 261] “Seed AI”

Approaches that pursue a minimally intelligent core system capable of subsequent recursive (self)-improvement,[ref 262] potentially leveraging hardware or data “overhangs.”[ref 263]

Neuro-emulated pathways: “Whole-brain-emulation” (WBE)

Approaches that aim to digitally simulate or recreate the states of human brains at a fine-grained level.

Neuro-integrationist pathways: “Brain-computer-interfaces” (BCI)

Approaches to create advanced AI, based on merging components of human and digital cognition.

Embodiment pathways:[ref 276] “Embodied agent” 

Based on providing the AI system with a robotic physical “body” to ground cognition and enable it to learn from direct experience of the world.[ref 277]

Modular cognitive architecture pathways

Approaches used in various fields, including robotics, in which researchers integrate well-tested but distinct state-of-the-art modules (perception, reasoning, etc.) to improve agent performance without independent learning.[ref 279]

Hybrid pathways 

Approaches that combine deep neural networks with other paradigms (such as symbolic AI).

1.C. Definitions focused on the aggregate societal impacts of advanced AI

(Strategic) general-purpose technology (GPT)

General-purpose military transformation (GMT)

Transformative AI (TAI):[ref 285] 

Radically transformative AI (RTAI)

AGI [economic competitiveness definition]

Machine superintelligence [form & impact definition]

“general artificial intelligence greatly outstripping the cognitive capacities of humans, and capable of bringing about revolutionary technological and economic advances across a very wide range of sectors on timescales much shorter than those characteristic of contemporary civilization”[ref 296]

1.D. Definitions focused on critical capabilities of advanced AI systems

Systems with critical moral and/or philosophical capabilities

Systems with critical economic capabilities[ref 308]

Systems with critical legal capabilities 

Systems with critical scientific capabilities

Systems with critical strategic or military capabilities[ref 320]

Systems with critical political capabilities

Actually existing AI (AEAI): A paradigm holding that the broader ecosystem of AI development, on current trajectories, may produce harmful political outcomes, because “AI as currently funded, constructed, and concentrated in the economy—is misdirecting technological resources towards unproductive and dangerous outcomes. It is driven by a wasteful imitation of human comparative advantages and a confused vision of autonomous intelligence, leading it toward inefficient and harmful centralized architectures.”[ref 326]

Systems with critical exponential capabilities

Duplicator: [digital people or particular forms of advanced AI that would allow] “the ability to make instant copies of people (or of entities with similar capabilities) [leading to] explosive productivity.”[ref 332]

Systems with critical hazardous capabilities

Systems that pose or enable critical levels of (extreme or even existential) risk,[ref 333] regardless of whether they demonstrate a full range of human-level/like cognitive abilities.

Appendix 2: Lists of definitions for policy tools and field

2.A. Terms for tools for intervention

Strategy[ref 352]

Policy

Governance

2.B. Terms for the field of practice

AI governance

Transformative AI governance

Longterm(ist) AI governance

Appendix 3: Auxiliary definitions and terms

Beyond this, it is also useful to clarify a range of auxiliary definitions that can support analysis in the advanced AI governance field. These include, but are not limited to:[ref 375]


International AI institutions

Executive summary

This literature review examines a range of institutional models that have been proposed for the international governance of artificial intelligence (AI). The review specifically focuses on proposals that would involve creating new international institutions for AI. As such, it covers seven models for international AI institutions with distinct functions.

Part I consists of the literature review. For each model, we provide (a) a description of each model’s functions and types, (b) the most common examples of the model, (c) some underexplored examples that are not (often) mentioned in the AI governance literature but that show promise, (d) a review of proposals for applying that model to the international regulation of AI, and (e) critiques of the model both generally and in its potential application to AI. 

Part II briefly discusses some considerations for further research concerning the design of international institutions for AI, including the effectiveness of each model at accomplishing its aims, treaty-based regulatory frameworks, other institutional models not covered in this review, the compatibility of institutional functions, and institutional options to host a new international AI governance body.

Overall, the review covers seven models, as well as thirty-five common examples of those models, twenty-four additional examples, and forty-nine proposals of new AI institutions based on those models. Table 1 summarizes these findings.[ref 1]

Introduction

Recent and ongoing progress in artificial intelligence (AI) technology has highlighted that AI systems will have increasingly significant global impacts. In response, the past year has seen intense attention to the question of how to regulate these technologies, both at domestic and international levels. As part of this process, there have been renewed calls for establishing new international institutions to carry out much-needed governance functions and anchor international collaboration on managing the risks as well as realizing the benefits of this technology.

This literature review examines and categorizes a wide range of institutions that have been proposed to carry out the international governance of AI.[ref 2] Before reviewing these models, however, it is important to situate proposals to establish a new international institution on AI within the broader landscape of approaches to the global governance of AI. Not all approaches to AI governance focus on creating new institutions. Rather, the institutional approach is only one of several different approaches to international AI governance—each of them concentrating on different governance challenges posed by AI, and each of them providing different solutions.[ref 3] These approaches include: 

(1) Rely on unilateral extraterritorial regulation. The extraterritorial approach foregoes (or at least does not prioritize) the multilateral pursuit of international regimes, norms, or institutions. Rather, it aims to enact effective domestic regulations on AI developments and then rely on the direct or extraterritorial effects of such regulations to affect the conditions or standards for AI governance in other jurisdictions. As such, this approach includes proposals to first regulate AI within (key) countries, whether by existing laws,[ref 4] through new laws or standards developed by existing institutions, or through new domestic institutions (such as a US “AI Control Council”[ref 5] or a National Algorithms Safety Board[ref 6]). These national policy levers[ref 7] can unilaterally affect the global approach to AI, either directly—for instance, through the effect of export controls on chokepoints in the AI chip supply chains[ref 8]—or because of the way such regulations can spill over to other jurisdictions, as seen in discussions of a “Brussels Effect,” a “California Effect,” or even a “Beijing Effect.”[ref 9]

(2) Apply existing international institutions, regimes, or norms to AI. The norm-application-focused approach argues that because much of international law establishes broad, technology-neutral principles and obligations, and many domains are already subject to a wide set of overlapping institutional activities, AI technology is in fact already adequately regulated in international law.[ref 10] As such, AI governance does not need new institutions or novel institutional models; rather, the aim is to reassert, reapply, extend, and clarify long-existing international institutions and norms. This is one approach that has been taken (with greater and lesser success) to address the legal gaps initially created by some past technologies, such as submarine warfare,[ref 11] cyberwar,[ref 12] or data flows within the digital economy,[ref 13] amongst others. This also corresponds to the approach taken by many international legal scholars, who argue that states should simply recognize that AI is already covered and regulated by existing norms and doctrines in international law, such as the principles of International Human Rights Law,[ref 14] International Humanitarian Law, International Criminal Law,[ref 15] the doctrine of state responsibility,[ref 16] or other regimes.[ref 17]

(3) Adapt existing international institutions or norms to AI. This approach concedes that AI technology is not yet adequately or clearly governed under international law but holds that existing international institutions could still be adapted to take on this role and may already be doing so. This approach includes proposals that center on mapping, supporting, and extending the existing AI-focused activities of existing international regimes and institutions such as the IMO, ICAO, ITU,[ref 18] various UN agencies,[ref 19] or other international organizations.[ref 20] Others explore proposals for refitting existing institutions, such as expanding the G20 with a Coordinating Committee for the Governance of Artificial Intelligence[ref 21] or changing the mandate or composition of UNESCO’s International Research Centre of Artificial Intelligence (ICRAI) or the International Electrotechnical Commission (IEC),[ref 22] to take up a stronger role in AI governance. Finally, others explore how either states (through Explanatory Memoranda or treaty reservations) or treaty bodies (through Working Party Resolutions) could adapt existing treaty regimes to more clearly cover AI systems.[ref 23] The emphasis here is on a “decentralized but coordinated” approach that supports institutions to adapt to AI,[ref 24] rather than necessarily aiming to establish new institutions in an already-crowded existing international “regime complex.”[ref 25] 

(4) Create new international institutions to regulate AI based on the model of past or existing institutions. The institution-re-creating approach argues that AI technology does need new, distinct international institutions to be adequately governed. However, in developing designs or making the case for such institutions, this approach often points to the precedent of past or existing international institutions and regimes that have a similar model. 

(5) Create entirely novel international institutional models to regulate AI. This approach argues not only that AI technology needs new international institutions, but also that past or existing international institutions (mostly) do not provide adequate models to narrowly follow or mimic.[ref 26] This is potentially reflected in some especially ambitious proposals for comprehensive global AI regimes or in suggestions to introduce entirely new mechanisms (e.g., “regulatory markets”[ref 27]) to governance.

In this review, we specifically focus on proposals for international AI governance and regulation that involve creating new international institutions for AI. That is to say, our main focus is on approach 4 and, to a lesser extent, approach 5. 

We focus on new institutions because they might be better positioned to respond to the novelty, stakes, and technical features of advanced AI systems.[ref 28] Indeed, the current climate of global attention on AI seems potentially more supportive of establishing new landmark institutions for AI than has been the case in past years. As AI capabilities progress at an unexpectedly rapid rate, multiple government representatives and entities[ref 29] as well as international organizations[ref 30] have recently expressed support for a new international AI governance institution. Additionally, the idea of establishing such institutions has taken root among many of the leading actors in the AI industry.[ref 31]

With this, our review comes with two caveats. In the first place, our focus on this institutional approach above others does not mean that pursuing the creation of new institutions is necessarily an easy strategy or more feasible than the other approaches listed above. Indeed, proposals for new treaty regimes or international institutions for AI—especially when they draw analogies with organizations that were set up decades ago—may often underestimate how much the ground of global governance has changed in recent years. As such, they do not always reckon fully with the strong trends and forces in global governance which, for better or worse, have come to frequently push states towards relying on extending existing norms (approach 2) or adapting existing institutions (approach 3)[ref 32] rather than creating novel institutions. Likewise, there are further trends shifting US policy towards pursuing international cooperation through nonbinding international agreements rather than treaties,[ref 33] as well as concerns that international organizations may be playing a less central role in international relations today than they have in the past.[ref 34] All of these trends should temper, or at least inform, proposals to establish new institutions.

Furthermore, even if one is determined to pursue establishing a new international institution along one of the models discussed here, many key open questions remain about the optimal route to design and establish that organization, including (a) Given that many institutional functions might be required to adequately govern advanced AI systems, might there be a need for “hybrid” or combined institutions with a dual mandate, like the IAEA?[ref 35] (b) Should an institution be tightly centralized or could it be relatively decentralized, with one or more new institutions orchestrating the AI policy activities of a constellation of many other (existing or new) organizations?[ref 36] (c) Should such an organization be established formally, or are informal club approaches adequate in the first instance?[ref 37] (d) Should voting rules within such institutions work on the grounds of consensus or simple majority? (e) What rules should govern adapting or updating the institution’s mission and mandate to track ongoing developments in AI? This review will briefly flag and discuss some of these questions in Part II but will leave many of them open for future research.

Regarding terminology, we will use both “international institution” and “international organization” interchangeably and broadly to refer to any of (a) formal intergovernmental organizations (FIGOs) founded through a constituent document (e.g., WTO, WHO); (b) treaty bodies or secretariats that have a more limited mandate, primarily supporting the implementation of a treaty or regime (e.g., BWC Implementation Support Unit); and (c) “informal IGOs” (IIGOs) that consist of loose “task groups” and coalitions of states (e.g., the G7, BRICS, G20).[ref 38] We use “model” to refer to the general cluster of institutions under discussion; we use “function” to refer to a given institutional model’s purpose or role. We use “AI proposals” to refer to the precise institutional models that are proposed for international AI governance.

I. Review of institutional models 

Below, we review a range of institutional models that have been proposed for AI governance. For each model, we discuss its general functions, different variations or forms of the model, a range of examples that are frequently invoked, and explicit AI governance proposals that follow the model. In addition, we will highlight additional examples that have not received much attention but that we believe could be promising. Finally, where applicable, we will highlight existing critiques of a given model.

Model 1: Scientific consensus-building

1.1 Functions and types: The functions of the scientific consensus-building institutional model are to (a) increase general policymaker and public awareness of an issue, and especially to (b) establish a scientific consensus on an issue. The aim is to facilitate greater common knowledge or a shared perception of an issue amongst states, in order to motivate national action or enable international agreements. Overall, the goal of institutions following this model is not to establish an international consensus on how to respond or to hand down regulatory recommendations directly, but simply to provide a basic knowledge base to underpin the decisions of key actors. By design, these institutions are, or aim to be, non-political—as in the IPCC’s mantra to be “policy-relevant and yet policy-neutral, never policy-prescriptive.”[ref 39]

1.2 Common examples: Commonly cited examples of scientific consensus-building institutions include most notably the Intergovernmental Panel on Climate Change (IPCC),[ref 40] the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES),[ref 41] and the Scientific Assessment Panel (SAP) of the United Nations Environment Programme (UNEP).[ref 42] 

1.3 Underexplored examples: An example that has not yet been invoked in the literature but that could be promising to explore is the Antarctic Treaty’s Committee for Environmental Protection (CEP), which provides expert advice to the Antarctic Treaty Consultative Meetings and which combines scientific consensus-building models with risk-management functions, supporting the Protocol on Environmental Protection to the Antarctic Treaty.[ref 43] Another example could be the World Meteorological Organization (WMO), which monitors weather and climatic trends and makes information available.

1.4 Proposed AI institutions along this model: There have been a range of proposals for scientific consensus-building institutions for AI. Indeed, in 2018 the precursor initiative to what would become the Global Partnership on AI (GPAI) was initially envisaged by France and Canada as an Intergovernmental Panel on AI (IPAI) along the IPCC model.[ref 44] This proposal was supported by many researchers: Kemp and others suggest an IPAI that could measure, track, and forecast progress in AI, as well as its use and impacts, to “provide a legitimate, authoritative voice on the state and trends of AI technologies.”[ref 45] They argue that an IPAI could perform structural assessments every three years as well as take up quick-response special-issue assessments. In a contemporaneous paper, Mialhe proposes an IPAI model as an institution that would gather a large and global group of experts “to inform dialogue, coordination, and pave the way for efficient global governance of AI.”[ref 46]

More recently, Ho and others propose an intergovernmental Commission on Frontier AI to “establish a scientific position on opportunities and risks from advanced AI and how they may be managed,” to help increase public awareness and understanding, to “contribute to a scientifically informed account of AI use and risk mitigation [and to] be a source of expertise for policymakers.”[ref 47] Bremmer and Suleyman propose a global scientific body to objectively advise governments and international bodies on questions as basic as what AI is and what kinds of policy challenges it poses.[ref 48] They draw a direct link to the IPCC model, noting that “this body would have a global imprimatur and scientific (and geopolitical) independence […] [a]nd its reports could inform multilateral and multistakeholder negotiations on AI.”[ref 49] Bak-Coleman and others argue in favor of an Intergovernmental Panel on Information Technology, an independent, IPCC-like panel charged with studying the “impact of emerging information technologies on the world’s social, economic, political and natural systems.”[ref 50] In their view, this panel would focus on many “computational systems,” including “search engines, online banking, social-media platforms and large language models” and would have leverage to persuade companies to share key data.[ref 51] 

Finally, Mulgan and others, in a 2023 paper, propose a Global AI Observatory (GAIO) as an institution that “would provide the necessary facts and analysis to support decision-making [and] would synthesize the science and evidence needed to support a diversity of governance responses.”[ref 52] Again drawing a direct comparison to the IPCC, they anticipate that such a body could set the foundation for more serious regulation of AI through six activities: (a) a global standardized incident reporting database, (b) a registry of crucial AI systems, (c) a shared body of data and analysis of the key facts of the AI ecosystem, (d) working groups exploring global knowledge about the impacts of AI on critical areas, (e) the ability to offer legislative assistance and model laws, and (f) the ability to orchestrate global debate through an annual report on the state of AI.[ref 53] They have since incorporated this proposal within a larger “Framework for the International Governance of AI” by the Carnegie Council for Ethics in International Affairs’s Artificial Intelligence & Equality Initiative, alongside other components such as a neutral technical organization to analyze “which legal frameworks, best practices, and standards have risen to the highest level of global acceptance.”[ref 54]

1.5 Critiques of this model: One concern that has been expressed is that AI governance is currently too institutionally immature to support an IPCC-like model, since, as Roberts argues, “the IPCC […] was preceded by almost two decades of multilateral scientific assessments, before being formalised.”[ref 55] He considers that this may be a particular problem for replicating that model for AI, given that some AI risks are currently still subject to significantly less scientific consensus.[ref 56] Separately, Bak-Coleman and others argue that a scientific consensus-building organization for digital technologies would face a far more difficult research environment than the IPCC and IPBES because, as opposed to the rich data and scientifically well-understood mechanisms that characterize climate change and ecosystem degradation, research into the impacts of digital technologies often faces data access restrictions.[ref 57] Ho and others argue that a Commission on Frontier AI would face more general scientific challenges in adequately studying future risks “on the horizon,” as well as potential politicization, both of which might inhibit the ability of such a body to effectively build consensus.[ref 58] Indeed, in the absence of decisive and incontrovertible evidence about the trajectory and risks of AI, a scientific consensus-building institution may struggle to deliver on its core mission and might instead spark significant scientific contestation and disagreement amongst AI researchers.

Model 2: Political consensus-building and norm-setting 

2.1 Functions and types: The function of political consensus-building and norm-setting institutions is to help states come to greater political agreement and convergence about the way to respond to a (usually) clearly identified and (ideally) agreed-upon issue or phenomenon. These institutions aim to reach the political consensus necessary either to align national policymaking responses sufficiently well—achieving some level of harmonization that reduces trade restrictions or other impediments to progress on the issue—or to help begin negotiations on other institutions that establish more stringent regimes. Political consensus-building institutions do this by providing fora for discussion and debate that can aid the articulation of potential compromises between state interests and by exerting normative pressure on states towards certain goals. In a norm-setting capacity, institutions can also draw on (growing) political consensus to set and share informal norms, even if formal institutions have not yet been created. For instance, if negotiations for a regulatory or control institution are held up, slowed, or fail, political consensus-building institutions can also play a norm-setting function by establishing, as soft law, informal standards for behavior. While such norms are not as strictly specified or as enforceable as hard-law regulations are, they can still carry force and see take-up.

2.2 Common examples: There are a range of examples of political consensus-building institutions. Some of these are broad, such as conferences of parties to a treaty (also known as COPs, the most popular one being that of the United Nations Framework Convention on Climate Change [UNFCCC]).[ref 59] Many others, however, such as the Organization for Economic Co-operation and Development (OECD), the G20, and the G7, reflect smaller, at times more informal, governance “clubs,” which can often move ahead towards policy-setting more quickly because their membership is already somewhat aligned[ref 60] and because many of them have already begun to undertake activities or incorporate institutional units focused on AI developments.[ref 61]

Gutierrez and others have reviewed a range of historical cases of (domestic and global) soft-law governance that they argue could provide lessons for AI. These include a range of institutional activities, such as UNESCO’s 1997 Universal Declaration on the Human Genome and Human Rights, 2003 International Declaration on Human Genetic Data, and 2005 Universal Declaration on Bioethics and Human Rights,[ref 62] the Environmental Management System (ISO 14001), the Sustainable Forestry Practices by the Sustainable Forestry Initiative and Forest Stewardship Council, and the Leadership in Energy and Environmental Design initiative.[ref 63] Others, however, such as the Internet Corporation for Assigned Names and Numbers (ICANN), the Asilomar rDNA Guidelines, the International Gene Synthesis Consortium, the International Society for Stem Cell Research Guidelines, the BASF Code of Conduct, the Environmental Defense Fund, and the DuPont Risk Frameworks, offer stronger examples of success.[ref 64] Turner likewise argues that ICANN, which has managed to develop productive internet policy, offers a model for international AI governance.[ref 65] Elsewhere, Harding argues that the 1967 Outer Space Treaty offers a telling case of a treaty regime that quickly crystallized state expectations and policies around safe innovation in a then-novel area of science.[ref 66] Finally, Feijóo and others suggest that “new technology diplomacy” on AI could involve a series of meetings or global conferences on AI, which could draw lessons from experiences such as the World Summits on the Information Society (WSIS).[ref 67]

2.3 Underexplored examples: Examples of norm-setting institutions that formulate and share relevant soft-law guidelines on technology include the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), the International Telecommunication Union (ITU), and the United Nations Commission on International Trade Law (UNCITRAL)’s Working Group on Electronic Commerce.[ref 68] Another good example of a political consensus-building and norm-setting initiative can be found in the 1998 Lysøen Declaration,[ref 69] an initiative by Canada and Norway that expanded to 11 highly committed states along with several NGOs and which kicked off a “Human Security Network” that achieved a significant and outsized global impact, including the Ottawa Treaty ban on antipersonnel mines, the Rome Treaty establishing the International Criminal Court, the Kimberley Process aimed at inhibiting the flow of conflict diamonds, and landmark Security Council resolutions on Children and Armed Conflict and Women, Peace and Security. Another norm-setting institution that is not yet often invoked in AI discussions but that could be promising to explore is the Codex Alimentarius Commission (CAC), which develops and maintains the Food and Agriculture Organization (FAO)’s Codex Alimentarius, a collection of non-enforceable but internationally recognized standards and codes of practice about various aspects of food production, food labeling, and safety. Another “club” under this model that is not often mentioned but could be influential is the BRICS group, which recently expanded from 5 to 11 members.

2.4 Proposed AI institutions along this model: Many proposals for political consensus-building institutions on AI tend not to focus on establishing new institutions, arguing instead that it is best to put AI issues on the agenda of existing (established and recognized) consensus-building institutions (e.g., the G20) or of existing norm-setting institutions (e.g., ISO). Indeed, even recent proposals for new international institutions still emphasize that these should link up well with already-ongoing initiatives, such as the G7 Hiroshima Process on AI.[ref 70] 

However, there have been proposals for new political consensus-building institutions. Erdélyi and Goldsmith propose an International AI Organisation (IAIO), “to serve as an international forum for discussion and engage in standard setting activities.”[ref 71] They argue that “at least initially, the global AI governance framework should display a relatively low level of institutional formality and use soft-law instruments to support national policymakers in the design of AI policies.”[ref 72] Moreover, they emphasize that the IAIO “should be hosted by a neutral country to provide for a safe environment, limit avenues for political conflict, and build a climate of mutual tolerance and appreciation.”[ref 73] More recently, the final report of the US National Security Commission on Artificial Intelligence includes a proposal for an Emerging Technology Coalition, “to promote the design, development, and use of emerging technologies according to democratic norms and values; coordinate policies and investments to counter the malign use of these technologies by authoritarian regimes; and provide concrete, competitive alternatives to counter the adoption of digital infrastructure made in China.”[ref 74] Recently, Marcus and Reuel propose the creation of an International Agency for AI (IAAI) tasked with convening experts and developing tools to help find “governance and technical solutions to promote safe, secure and peaceful AI technologies.”[ref 75] 

At the looser organizational end, Feijóo and others propose a new technology diplomacy initiative as “a renewed kind of international engagement aimed at transcending narrow national interests and seeks to shape a global set of principles.” In their view, such a framework could “lead to an international constitutional charter for AI.”[ref 76] Finally, Jernite and others propose an international Data Governance Structure, a multi-party, distributed governance arrangement for improving the systematic and transparent management of language data at a global level, which includes a Data Stewardship Organization to develop “appropriate management plans, access restrictions, and legal scholarship.”[ref 77] Other proposed organizations are more focused on supporting states in implementing AI policy, such as through training. For instance, Turner proposes creating an International Academy for AI Law and Regulation.[ref 78]

2.5 Critiques of this model: There have not generally been many in-depth critiques of proposals for new political consensus-building or norm-setting institutions. However, some concerns focus on the difficult tradeoffs that consensus-building institutions face in deciding whether to prioritize breadth of membership and inclusion or depth of mission alignment. Institutions that aim to foster consensus across a very broad swath of actors may be very slow to reach such normative consensus, and even when they do, they may only achieve a “lowest-common-denominator” agreement.[ref 79] On the other hand, others counter that AI consensus-building institutions or fora will need to be sufficiently inclusive—in particular, and possibly controversially, with regard to China[ref 80]—if they do not want to run the risk of producing a fractured and ineffective regime, or even see negotiations implode over the political question of who was invited or excluded.[ref 81] Finally, a more foundational challenge is that while political consensus-building may produce (the appearance of) joint narratives, these may have little teeth if the resulting agreements are not binding.[ref 82]

Model 3: Coordination of policy and regulation

3.1 Functions and types: The functions of this institutional model are to help align and coordinate policies, standards, or norms[ref 83] in order to ensure a coherent international approach to a common problem. There is significant internal variation in the setup of institutions under this model, with various subsidiary functions. For instance, such institutions may (a) directly regulate the deployment of a technology in relative detail, requiring states to comply with and implement those regulations at the national level; (b) assist states in the national implementation of agreed AI policies; (c) focus on the harmonization and coordination of policies; (d) focus on the certification of industries or jurisdictions to ensure they comply with certain standards; or (e) in some cases, take up functions related to monitoring and enforcing norm compliance.

3.2 Common examples: Common examples of policy-setting institutions include the World Trade Organization (WTO) as an exemplar of an empowered, centralized regulatory institution.[ref 84] Other examples given of regulatory institutions include the International Civil Aviation Organization (ICAO), the International Maritime Organization (IMO), the International Atomic Energy Agency (IAEA), and the Financial Action Task Force (FATF).[ref 85] Examples of policy-coordinating institutions may include the United Nations Environment Programme (UNEP), which synchronized international agreements on the environment and facilitated new agreements, including the 1985 Vienna Convention for the Protection of the Ozone Layer.[ref 86] Nemitz points to the example of the institutions created under the United Nations Convention on the Law of the Sea (UNCLOS) as a model for an AI regime, including an international court to enforce the proposed treaty.[ref 87] Finally, Sepasspour proposes the establishment of an “AI Ethics and Safety Unit” within the existing International Electrotechnical Commission (IEC), under a model that is “inspired by the Food and Agriculture Organization’s (FAO) Food Safety and Quality Unit and Emergency Prevention System for Food Safety early warning system.”[ref 88]

3.3 Underexplored examples: Examples that are not yet often discussed but that could be useful or insightful include the European Monitoring and Evaluation Programme (EMEP), which implements the 1983 Convention on Long-Range Transboundary Air Pollution—a regime that has proven particularly adaptive.[ref 89] A more sui generis example is that of international financial institutions, like the World Bank or the International Monetary Fund (IMF), which tend to shape domestic policy indirectly through conditional access to loans or development funds.

3.4 Proposed AI institutions along this model: Specific to advanced AI, recent proposals for regulatory institutions include Ho and others’ Advanced AI Governance Organisation, which “could help internationalize and align efforts to address global risks from advanced AI systems by setting governance norms and standards, and assisting in their implementation.”[ref 90] 

Trager and others propose an International AI Organization (IAIO) to certify jurisdictions’ compliance with international oversight standards. These would be enforced through a system of conditional market access in which trade barriers would be imposed on jurisdictions which are not certified or whose supply chains integrate AI from non-IAIO certified jurisdictions. Among other advantages, the authors suggest that this system could be less vulnerable to proliferation of industry secrets by having states establish their own domestic regulatory entities rather than having international jurisdictional monitoring (as is the case with the IAEA). However, the authors also propose that the IAIO could provide monitoring services to governments that have not yet built their own monitoring capabilities. The authors argue that their model has several advantages over others, including agile standard-setting, monitoring, and enforcement.[ref 91] 

In a regional context, Stix proposes an EU AI Agency which, among other roles, could be an analyzer of gaps in AI policy and a developer of policies that could fill such gaps. For this agency to be effective, Stix suggests it should be independent from political agendas by, for instance, having a mandate that does not coincide with election cycles.[ref 92] Webb proposes a “Global Alliance on Intelligence Augmentation” (GAIA), which would bring together experts from different fields to set best practices for AI.[ref 93] 

Chowdhury proposes a generative AI global governance body as a “consolidated ongoing effort with expert advisory and collaborations [which] should receive advisory input and guidance from industry, but have the capacity to make independent binding decisions that companies must comply with.”[ref 94] In her analysis, this body should be funded via unrestricted and unconditional funds by all AI companies engaged in the creation or use of generative AI and it should “cover all aspects of generative AI models, including their development, deployment, and use as it relates to the public good. It should build upon tangible recommendations from civil society and academic organizations, and have the authority to enforce its decisions, including the power to require changes in the design or use of generative AI models, or even halt their use altogether if necessary.”[ref 95]

A proposal for a policy-coordinating institution is Kemp and others’ Coordinator and Catalyser of International AI Law, which would be “a coordinator for existing efforts to govern AI and catalyze multilateral treaties and arrangements for neglected issues.”[ref 96]

3.5 Critiques of this model: Castel and Castel critique international conventions on the grounds that they “are difficult to monitor and control.”[ref 97] More specifically, Ho and others argue that a model like an Advanced AI Governance Organization would face challenges around its ability to set and update standards sufficiently quickly, around incentivizing state participation in adopting the regulations, and in sufficiently scoping the challenges to focus on.[ref 98] Finally, reviewing general patterns in current state activities on AI standard-setting, von Ingersleben notes that “technical experts hailing from geopolitical rivals, such as the United States and China, readily collaborate on technical AI standards within transnational standard-setting organizations, whereas governments are much less willing to collaborate on global ethical AI standards within international organizations,”[ref 99] which suggests potential obstacles to securing state participation in any international institutions focused on more political and ethical standard-setting.

Model 4: Enforcement of standards or restrictions

4.1 Functions and types: The function of this institutional model is to prevent the production, proliferation, or irresponsible deployment of a dangerous or illegal technology, product, or activity. To fulfill that function, institutions under this model rely on, among other mechanisms, (a) bans and moratoria, (b) nonproliferation regimes, (c) export-control lists, (d) monitoring and verification mechanisms,[ref 100] (e) licensing regimes, and (f) registering and/or tracking of key resources, materials, or stocks. Other types of mechanisms, such as (g) confidence-building measures (CBMs), are generally transparency-enabling.[ref 101] While generally focused on managing tensions and preventing escalations,[ref 102] CBMs can also build trust amongst states in each other’s mutual compliance with standards or prohibitions, and can therefore also support or underwrite standard- and restriction-enforcing institutions.

4.2 Common examples: The most prominent example of this model, especially in discussions of institutions capable of carrying out monitoring and verification roles, is the International Atomic Energy Agency (IAEA)[ref 103]—in particular, its Department of Safeguards. Many other proposals refer to the monitoring and verification mechanisms of arms control treaties.[ref 104] For instance, Baker has studied the monitoring and verification mechanisms for different types of nuclear arms control regimes, reviewing the role of the IAEA system under Comprehensive Safeguards Agreements with Additional Protocols in monitoring nonproliferation treaties such as the Non-Proliferation Treaty (NPT) and the five Nuclear-Weapon-Free-Zone Treaties, the role of monitoring and verification arrangements in monitoring bilateral nuclear arms control limitation treaties, and the role of the International Monitoring System (IMS) in monitoring and enforcing (prospective) nuclear test bans under the Preparatory Commission for the Comprehensive Nuclear-Test-Ban Treaty Organization (CTBTO).[ref 105] Shavit likewise refers to the precedent of the NPT and IAEA in discussing a resource (compute) monitoring framework for AI.[ref 106] 

Examples given of export-control regimes include the Nuclear Suppliers Group, the Wassenaar Arrangement, and the Missile Technology Control Regime.[ref 107] As examples of CBMs, people have pointed to the Open Skies Treaty,[ref 108] which is implemented by the Open Skies Consultative Commission (OSCC). 

There are also examples of global technology control institutions that were not carried through but which are still discussed as inspirations for AI, such as the International Atomic Development Authority (ADA) proposed in the early nuclear age[ref 109] or early- to mid-20th-century proposals for the global regulation of military aviation.[ref 110]

4.3 Underexplored examples: Examples that are not yet often discussed in the context of AI but that could be promising are the Organisation for the Prohibition of Chemical Weapons (OPCW),[ref 111] the Biological Weapons Convention’s Implementation Support Unit, the International Maritime Organization (in its ship registration function), and the Convention on International Trade in Endangered Species of Wild Fauna and Flora’s (CITES) Secretariat, specifically, its database of national import and export reports.

4.4 Proposed AI institutions along this model: Proposals along this model are particularly widespread and prevalent. Indeed, as mentioned, a significant part of the literature on the international governance of AI has made reference to some sort of “IAEA for AI.” For instance, in relatively early proposals,[ref 112] Turchin and others propose a “UN-backed AI-control agency” which “would require much tighter and swifter control mechanisms, and would be functionally equivalent to a world government designed specifically to contain AI.”[ref 113] Ramamoorthy and Yampolskiy propose a “global watchdog agency” that would have the express purpose of tracking AGI programs and that would have the jurisdiction and the lawful authority to intercept and halt unlawful attempts at AGI development.[ref 114] Pointing to the precedent of both the IAEA and its inspection regime, and the Comprehensive Nuclear Test-Ban Treaty Organization (CTBTO)’s Preparatory Commission, Nindler proposes an International Enforcement Agency for safe AI research and development, which would support and implement the provisions of an international treaty on safe AI research and development, with the general mission “to accelerate and enlarge the contribution of artificial intelligence to peace, health and prosperity throughout the world [and … to ensure that its assistance] is not used in such a way as to further any military purpose.”[ref 115] Such a body would be charged with drafting safety protocols and measures, and he suggests that its enforcement could, in extreme cases, be backed up by the use of force under the UN Security Council’s Chapter VII powers.[ref 116] 

Whitfield draws on the example of the UN Framework Convention on Climate Change to propose a UN Framework Convention on AI (UNFCAI) along with a Protocol on AI that would subsequently deliver the first set of enforceable AI regulations. He proposes that these should be supported by three new bodies: an AI Global Authority (AIGA) to provide an inspection regime in particular for military AI, an associated “Parliamentary Assembly” supervisory body that would enhance democratic input into the treaty’s operations and play “a constructive monitoring role,” as well as a multistakeholder Intergovernmental Panel on AI to provide scientific, technical, and policy advice to the UNFCAI.[ref 117]

More recently,[ref 118] Ho and others propose an “Advanced AI Governance Organization” which, in addition to setting international standards for the development of advanced AI (as discussed above), could monitor compliance with these standards through, for example, self-reporting, monitoring practices within jurisdictions, or detection and inspection of large data centers.[ref 119] Altman and others propose an “IAEA for Superintelligence” consisting of “an international authority that can inspect systems, require audits, test for compliance with safety standards, place restrictions on degrees of deployment and levels of security.”[ref 120] In a very similar vein, Guest (based on an earlier proposal by Karnofsky)[ref 121] calls for an “International Agency for Artificial Intelligence (IAIA)” to conduct “extensive verification through on-chip mechanisms [and] on-site inspections” as part of his proposal for a “Collaborative Handling of Artificial Intelligence Risks with Training Standards (CHARTS).”[ref 122] Drawing together elements from several models—and referring to the examples of the IPCC, Interpol, and the WTO’s dispute settlement system—Gutierrez proposes a “multilateral AI governance initiative” to mitigate “the shared large-scale high-risk harms caused directly or indirectly by AI.”[ref 123] His proposal envisions an organizational structure consisting of (a) a forum for member state representation (which adopts decisions via supermajority); (b) technical bodies, such as an external board of experts, and a permanent technical and liaison secretariat that works from an information and enforcement network and which can issue “red notice” alerts; and (c) an arbitration board that can hear complaints by non-state AI developers who seek to contest these notices as well as by member states.[ref 124]

In a 2013 paper, Wilson proposes an “Emerging Technologies Treaty”[ref 125] that would address risks from many emerging technologies. In his view, this treaty could either be housed under an existing international organization or body or established separately, and it would establish a body of experts that would determine whether there were “reasonable grounds for concern” about AI or other dangerous research, after which states would be required to regulate or temporarily prohibit research.[ref 126] Likewise drawing on the IAEA model, Chesterman proposes an International Artificial Intelligence Agency (IAIA) as an institution with “a clear and limited normative agenda, with a graduated approach to enforcement,” arguing that “the main ‘red line’ proposed here would be the weaponization of AI—understood narrowly as the development of lethal autonomous weapon systems lacking ‘meaningful human control’ and more broadly as the development of AI systems posing a real risk of being uncontrollable or uncontainable.”[ref 127] In practice, this organization would draw up safety standards, develop a forensic capability to identify those responsible for “rogue” AI, serve as a clearinghouse to gather and share information about such systems, and provide early notification of emergencies.[ref 128] Chesterman argues that one organizational lesson this IAIA could draw from the IAEA is its governance structure, in which the Board of Governors (rather than the annual General Conference) exercises ongoing oversight of operations.

Other authors endorse an institution more directly aimed at preventing or limiting proliferation of dangerous AI systems. Jordan and others propose an “NPT+” model,[ref 129] and the Future of Life Institute (FLI) proposes “international agreements to limit particularly high-risk AI proliferation and mitigate the risks of advanced AI.”[ref 130] PauseAI proposes an international agreement that sets up an “International AI Safety Agency” that would be in charge of granting approvals for deployments of AI systems and new training runs above a certain size.[ref 131] The Elders, a group of independent former world leaders, have recently called on countries to request, via the UN General Assembly, that the International Law Commission draft an international treaty to establish a new “International AI Safety Agency,”[ref 132] drawing on the models of the NPT and the IAEA, “to manage these powerful technologies within robust safety protocols [and to …] ensure AI is used in ways consistent with international law and human rights treaties.”[ref 133] More specific monitoring provisions are also entertained; for instance, Balwit briefly discusses an advanced AI chips registry, potentially organized by an international agency.[ref 134]

At the level of transparency-supporting agreements, there are many proposals for confidence-building measures (CBMs) for (military) AI. Such proposals focus on bilateral arrangements that build confidence amongst states and contribute to stability (as under Model 5), but which lack distinct institutions. For instance, Shoker and others discuss an “international code of conduct for state behavior.”[ref 135] Scharre, Horowitz, Khan, and others discuss a range of other AI CBMs,[ref 136] including the marking of autonomous weapons systems, geographic limits, and limits on particular (e.g., nuclear) operations of AI.[ref 137] They propose to group these under an International Autonomous Incidents Agreement (IAIA) to “help reduce risks from accidental escalation by autonomous systems, as well as reduce ambiguity about the extent of human intention behind the behavior of autonomous systems.”[ref 138] In doing so, they point to the precedent of arrangements such as the 1972 Incidents at Sea Agreement[ref 139] as well as the 12th–19th century development of Maritime Prize Law.[ref 140] Imbrie and Kania propose an “Open Skies on AI” agreement.[ref 141] Bremmer and Suleyman propose a bilateral regime to foster US-China cooperation on AI, envisioning this “to create areas of commonality and even guardrails proposed and policed by a third party.”[ref 142]

4.5 Critiques of this model: Many critiques of the enforcement model have ended up focusing (whether fairly or not) on the appropriateness of the basic analogy between nuclear weapons and AI that is explicit or implicit in proposals for an IAEA- or NPT-like regime. For instance, Kaushik and Korda have opposed what they see as aspirations to a “wholesale ban” on dangerous AI and argue that “attempting to regulate artificial intelligence indiscriminately would be akin to regulating the concept of nuclear fission itself.”[ref 143] 

Others critique the appropriateness of an IAEA-modeled approach: Stewart suggests that the focus on the IAEA’s safeguards is inadequate since AI systems cannot be safeguarded in the same way, and he suggests that better lessons might instead be found in the IAEA’s International Physical Protection Advisory Service (IPPAS) missions, which allow it to serve as an independent third party to assess the regulatory preparedness of countries that aim to develop nuclear programs.[ref 144] Drexel and Depp argue that even if this IAEA model could work on a technical level, it will likely be prohibitively difficult to negotiate such an intense level of oversight.[ref 145] Further, Sepasspour as well as Law note that the IAEA’s authority was not established in one step: years of delay separated the IAEA’s establishment (1957), its adoption of the INFCIRC 26 safeguards document (1961), its taking of a leading role in nuclear nonproliferation upon the adoption of the NPT (1968), and the eventual further empowerment of its verification function through the Additional Protocol (1997).[ref 146] Such slow accretion of authority might not be adequate given the speed of advanced AI development. Finally, another issue is that the strength of an IAEA-like agency depends on the existence of supportive international treaties as well as specific incentives for participation.

Others question whether this model would be desirable, even if achievable. Howard generally critiques many governance proposals that would involve centralized control (whether domestic or global) over the proliferation of and access to frontier AI systems, arguing that such centralization would end up only advantaging currently powerful AI labs as well as malicious actors willing to steal models, with the concern that this would have significant illiberal effects.[ref 147]

Model 5: Stabilization and emergency response 

5.1 Functions and types: The function of this institutional model is to ensure that an emerging technology or an emergency does not have a negative impact on social stability and international peace.

Such institutions can serve various subsidiary functions, including (a) performing general stability management by assessing and mitigating systemic vulnerabilities that are susceptible to incidents or accidents; (b) providing timely early warning of—and coordinating responses to—incidents and emergencies, creating common knowledge of an emergency;[ref 148] and (c) generally stabilizing relations, behavior, and expectations around AI technology to encourage transparency and trust around state activities in a particular domain and to avoid inadvertent military conflict.

5.2 Common examples: Examples of institutions involved in stability management include the Financial Stability Board (FSB), an entity “composed of central bankers, ministries of finance, and supervisory and regulatory authorities from around the world.”[ref 149] Another example might be the United Nations Office for Disaster Risk Reduction (UNDRR), which focuses on responses to natural disasters.[ref 150] Gutierrez invokes Interpol’s “red notice” alert system as an example of a model by which an international institution could alert global stakeholders about the dangers of a particular AI system.[ref 151]

5.3 Underexplored examples: Examples not yet invoked in the literature, but that could be promising models of early warning functions, include the WHO’s “public health emergency of international concern” mechanism and the procedure established in the IAEA’s 1986 Convention on Early Notification of a Nuclear Accident.

5.4 Proposed AI institutions along this model: AI proposals along the early warning model include Pauwels’ paper describing a Global Foresight Observatory as a multistakeholder platform aimed at fostering greater cooperation in technological and political preparedness for the impacts of innovation in various fields, including AI.[ref 152] Bremmer and Suleyman propose a Geotechnology Stability Board which “could work to maintain geopolitical stability amid rapid AI-driven change” based on the coordination of national regulatory authorities and international standard-setting bodies. Such a body could also help prevent global technology actors from “engaging in regulatory arbitrage or hiding behind corporate domiciles.” Finally, it could take up responsibility for governing open-source AI and censoring uploads of highly dangerous models.[ref 153]

5.5 Critiques of this model: As there have been relatively few proposals for this model, there are not yet many critiques. However, possible critiques might focus on the adequacy of relying on international institutions to respond to (rather than prevent) situations where dangerous AI systems have already been deployed: coordinating, communicating, and implementing effective countermeasures in those situations might be very difficult, or far too slow to counter a misaligned AI system.

Model 6: International joint research

6.1 Functions and types: The function of this institutional model is to start a bilateral or multilateral collaboration between states or state entities to solve a common problem or achieve a common goal. Most institutions following this model would focus on accelerating the development of a technology or exploitation of a resource by particular actors in order to avoid races. Others would aim at speeding up the development of safety techniques. 

In some proposals, an institution like this aims not just to rally and organize a major research project, but simultaneously to include elements of an enforcing institution in order to exclude all other actors from conducting research and/or creating capabilities around that problem or goal, creating a de facto or an explicit international monopoly on an activity. 

6.2 Common examples: Examples that are pointed to as models of an international joint scientific program include the European Organization for Nuclear Research (CERN),[ref 154] ITER, the International Space Station (ISS), and the Human Genome Project.[ref 155] Example models of a (proposed) international monopoly include the Acheson-Lilienthal Proposal[ref 156] and the resulting Baruch Plan, which called for the creation of an Atomic Development Authority.[ref 157] 

6.3 Underexplored examples: Examples that are not yet discussed in the literature but that could be promising are the James Webb Space Telescope and the Laser Interferometer Gravitational-Wave Observatory (LIGO),[ref 158] which is organized internationally through the LIGO Scientific Collaboration (LSC).

6.4 Proposed AI institutions along this model: There are various explicit AI proposals along the joint scientific program model.[ref 159] Some focus primarily on accelerating safety. Lewis Ho and others suggest an “AI Safety Project” to “promote AI safety R&D by promoting its scale, resourcing and coordination.” To ensure AI systems are reliable and less vulnerable to misuse, this institution would have access to significant compute and engineering capacity as well as to AI models developed by AI companies. Contrary to other international joint scientific programs like CERN or ITER, which are strictly intergovernmental, Ho and others propose that the AI Safety Project comprise other actors as well (e.g., civil society and industry). The authors also suggest that, to prevent replication of models or diffusion of dangerous technologies, the AI Safety Project should incorporate information and security measures such as siloing information, structuring model access, and designing internal review processes.[ref 160] Neufville and Baum point out that “a clearinghouse for research into AI” could solve the collective problem of underinvestment in basic research, AI ethics, and safety research.[ref 161] More ambitiously, Ramamoorthy and Yampolskiy propose a “Benevolent AGI Treaty,” which involves “the development of AGI as a global, non-strategic humanitarian objective, under the aegis of a special agency within the United Nations.”[ref 162]

Other proposals suggest intergovernmental collaboration for the development of AI systems more generally. Daniel Zhang and others at Stanford University’s HAI recommend that the United States and like-minded allies create a “Multilateral Artificial Intelligence Research Institute (MAIRI)” to facilitate scientific exchanges and promote collaboration on AI research—including the risks, governance, and socio-economic impact of AI—based on a foundational agreement outlining agreed research practices. The authors suggest that MAIRI could also strengthen policy coordination around AI.[ref 163] Fischer and Wenger add that a “neutral hub for AI research” should have four functions: (a) fundamental research in the field of AI, (b) research and reflection on societal risks associated with AI, (c) development of norms and best practices regarding the application of AI, and (d) further education for AI researchers. This hub could be created by a conglomerate of like-minded states but should eventually be open to all states and possibly be linked to the United Nations through a cooperation agreement, according to the authors.[ref 164] Other authors posit that an international collaboration on AI research and development should include all members of the United Nations from the start, as similar projects like the ISS or the Human Genome Project have done. They suggest that this approach might reduce the possibility of an international conflict.[ref 165] In this vein, Kemp and others call for the foundation of a “UN AI Research Organization (UNAIRO),” which would focus on “building AI technologies in the public interest, including to help meet international targets […] [a] secondary goal could be to conduct basic research on improving AI techniques in the safest, careful and responsible environment possible.”[ref 166]

Philipp Slusallek, Scientific Director of the German Research Center for Artificial Intelligence, suggests a “CERN for AI”—“a collaborative, scientific effort to accelerate and consolidate the development and uptake of AI for the benefit of all humans and our environment.” Slusallek promotes a very open and transparent design for this institution, in which data and knowledge would flow freely between collaborators.[ref 167] Similarly, the Large-scale Artificial Intelligence Open Network (LAION) calls for a CERN-like open-source collaboration among the United States and allied countries to establish an international “supercomputing research facility” hosting “a diverse array of machines equipped with at least 100,000 high-performance state-of-the-art accelerators” that can be overseen by democratically elected institutions from participating countries.[ref 168] Daniel Dewey goes a step further and suggests a potential joint international AI project with a monopoly over hazardous AI development, in the same spirit as the 1946 Baruch Plan, which proposed an international Atomic Development Authority with a monopoly over nuclear activities. However, Dewey admits this proposal is possibly politically intractable.[ref 169] In another proposal for monopolized international development, Miotti suggests a “Multilateral AGI Consortium” (MAGIC), which would be an international organization mandated to run “the world’s only advanced and secure AI facility focused on safety-first research and development of advanced AI.”[ref 170] This organization would only share breakthroughs with the outside world once proven demonstrably safe and would therefore be coupled with a global moratorium on the creation of AI systems exceeding a set compute-governance threshold.

The proposals for an institution analogous to CERN discussed thus far envision a grand institution that draws talent and resources for research and development of AI projects in general. Other proposals have a narrower focus. Charlotte Stix, for example, suggests that a more decentralized version of this model could be more beneficial. Stix argues that a “European Artificial Intelligence megaproject could be composed of a centralized headquarters to overview collaborations and provide economies of scale for AI precursors within a network of affiliated AI laboratories that conduct most of the research.”[ref 171] Other authors argue that rather than focus on AI research in general, an international research collaboration could focus on the use of AI to solve problems in a specific field, such as climate change, health, privacy-enhancing technologies, economic measurement, or the sustainable development goals.[ref 172]

6.5 Critiques of this model: In general, there have been few sustained critiques of this institutional model. However, Ho and others suggest that an international collaboration to conduct technical AI-safety research might face challenges in that it might pull safety researchers away from frontier AI developers, reducing in-house safety expertise. In addition, there are concerns that any international project that would need access to advanced AI models would raise security concerns and risks of model leakage.[ref 173]

Moreover, more fundamental critiques do exist; for instance, Kaushik and Korda critique the feasibility of a “Manhattan Project-like undertaking to address the ‘alignment problem’,” arguing that massively accelerating AI-safety research through any large-scale governmental project is infeasible. They further argue that the analogy is inapt because the Manhattan Project offered a singular goal, whereas in AI safety “ten thousand researchers have ten thousand different ideas on what it means and how to achieve it.”[ref 174]

Model 7: Distribution of benefits and access 

7.1 Functions and types: The function of this institutional model is to provide access to the benefits of a technology or a global public good to those states or individuals who do not yet have it for geographic, economic, or other reasons. Very often, the aim of such an institution is to facilitate unrestricted access or even access schemes targeted to the most needy and deprived. When the information or goods being shared can potentially pose a risk or be misused, yet responsible access is still considered a legitimate, necessary, or beneficial goal, institutions under this model tend to create a system for conditional access.

7.2 Common examples: Examples of unrestricted benefit-distributor institutions include international public-private partnerships like Gavi, the Vaccine Alliance, and the Global Fund to Fight AIDS, Tuberculosis and Malaria.[ref 175] Examples of conditional benefit-distributor institutions might include the IAEA’s nuclear fuel bank.[ref 176]

7.3 Underexplored examples: Examples that are not yet invoked in the AI context but that could be promising include the Nagoya Protocol’s Access and Benefit-sharing Clearing-House (ABS Clearing-House),[ref 177] the UN Climate Technology Centre and Network,[ref 178] and the United Nations Industrial Development Organization (UNIDO), which is tasked with helping build up industrial capacities in developing countries. 

7.4 Proposed AI institutions along this model: Stafford and Trager draw an analogy between the NPT and a potential international regime to govern transformative AI. The basis for this comparison is that both technologies are dual-use, both present risks even in civilian applications, and there are significant gaps in the access different states have to these technologies. Just as in the case of nuclear energy, in a scenario where there is a clear leader in the race to develop AI while others are lagging, it is mutually beneficial for the actors to enter a technology-sharing bargain. This way, the leader can ensure it will continue to be at the front of the race, while the laggards secure access to the technology. Stafford and Trager call this the “Hopeless Laggard effect.” To enforce this technology-sharing bargain in the sphere of transformative AI, an international institution would have to be created to perform functions similar to those of the IAEA’s Global Nuclear Safety and Security Network, which transfers knowledge from countries with mature nuclear energy programs to those just starting to develop one. As an alternative, the authors suggest that the leader in AI could prevent the laggards from engaging in a race by sharing the wealth resulting from transformative AI.[ref 179]

The final report of the US National Security Commission on Artificial Intelligence included a proposal for an International Digital Democracy Initiative (IDDI) “with allies and partners to align international assistance efforts to develop, promote, and fund the adoption of AI and associated technologies that comports with democratic values and ethical norms around openness, privacy, security, and reliability.”[ref 180]

Ho and others envision a model that incorporates the private sector into the benefit-distribution dynamic. A “Frontier AI Collaborative” could spread the benefits of cutting-edge AI—including global resilience to misused or misaligned AI systems—by acquiring or developing AI systems with a pool of resources from member states and international development programs, or from AI laboratories. This form of benefit-sharing could have the additional advantage of incentivizing states to join an international AI governance regime in exchange for access to the benefits distributed by the collaborative.[ref 181] More broadly, the Elders suggest creating an institution analogous to the IAEA to guarantee that AI’s benefits are “shared with poorer countries.”[ref 182] In forthcoming work, Adan sketches key features for a Fair and Equitable Benefit Sharing Model, to “foster inclusive global collaboration in transformative AI development and ensure that the benefits of AI advancements are equitably shared among nations and communities.”[ref 183] 

7.5 Critiques of this model: One challenge faced by benefit-distributor institutions is how to balance the risk of proliferation against ensuring meaningful benefits and uptake from their technology-promotion and distribution work.[ref 184] For instance, Ho and others suggest that proposals such as their Frontier AI Collaborative could risk inadvertently diffusing dangerous dual-use technologies while simultaneously encountering barriers and obstacles to effectively empowering underserved populations with AI.[ref 185]

More fundamentally, global benefit- and access-providing institutions—especially those that involve some form of conditional access—will likely face critiques on the basis of how they organize participation. In recent years, several researchers have argued that the global governance of AI is seeing only limited participation by states from the Global South;[ref 186] Veale and others have recently critiqued many initiatives to secure “AI for Good” or “responsible AI,” arguing that these have fallen into a “paradox of participation,” one involving “the surface-level participation of Global South stakeholders without providing the accompanying resources and structural reforms to allow them to be involved meaningfully.”[ref 187] It is likely that similar critiques will be raised against benefit-distributing institutions.

II. Directions for further research

In light of the literature review conducted in Part I, we can consider a range of additional directions for further research. Without intending to be exhaustive, this section briefly discusses some of those directions, offering initial thoughts on gaps in the current literature and how each line of research might help inform key decisions around the international governance of AI—whether or when to create international institutions, what specific institutional models to prioritize, how to establish these institutions, and how to design them for effectiveness, amongst other questions.

Direction 1: Effectiveness of institutional models

In the above summary, we have outlined potential institutional models for AI without always assessing their weaknesses or their effectiveness in meeting their stated goals. We believe such further analysis could be critical, however, to identify which models would be apt to govern the risks from AI and would reduce such risks de facto (not just de jure).

There is, of course, a live debate on the “effectiveness” of international law and institutions, with an extensive literature that tries to assess patterns of state compliance with different regimes in international law[ref 188] as well as more specific patterns affecting the efficacy of international organizations[ref 189] or their responsiveness to shifts in the underlying problem.[ref 190]

Such work has highlighted the imperfect track record of many international treaties in meeting their stated purposes,[ref 191] the various ways in which states may aim to evade obligations even while complying with the letter of the law,[ref 192] the ways in which states may aim to use international organizations to promote narrow national interests rather than broader organizational objectives,[ref 193] and the situations under which states aim to exit, shift away from, or replace existing institutions with new alternatives.[ref 194] Against such work, other studies have explored the deep normative changes that international norms have historically achieved in topics such as the role of territorial war,[ref 195] the transnational and domestic mechanisms by which states are pushed to commit to and comply with different treaties,[ref 196] the more nuanced conditions that may induce greater or lesser state compliance with norms or treaties,[ref 197] the effective role that even nonbinding norms may play,[ref 198] as well as arguments that a narrow focus on state compliance with international rules understates the broader effects that those obligations may have on the way that states bargain in light of those norms (even when they aim to bend them).[ref 199] Likewise, there is a larger body of foundational work that considers whether traditional international law, based in state consent, might be an adequate tool to secure global public goods such as those around AI, even if states complied with their obligations.[ref 200] 

Work to evaluate the (prospective) effectiveness of international institutions on AI could draw on this widespread body of literature to learn lessons from the successes and failures of past regimes, as well as on scholarship on the appropriate design of different bodies[ref 201] and measures to improve the decision-making performance of such organizations,[ref 202] in order to understand when or how any given institutional model might be most appropriately designed for AI.

Direction 2: Multilateral AI treaties without institutions 

While our review has focused on international AI governance proposals that would involve establishing some form of international institution, there are of course other models of international cooperation. Indeed, some types of treaties do not establish distinct international organizations[ref 203] and function primarily by setting shared patterns of expectations and reciprocal behavior amongst states in order (ideally) to become self-enforcing. As noted, our literature review does not cover this type of regime. However, analyzing such regimes in combination with the models we have outlined could be useful for identifying international governance alternatives for AI, including whether or when state initiatives to establish multilateral normative regimes that lack treaty bodies would be effective or would fall short.

Such an analysis could draw from a rich vein of existing proposals for new international treaties on AI. There have of course been proposals for new treaties for autonomous weapons.[ref 204] There are also proposals for international conventions to mitigate extreme risks from technology. Some of these, such as Wilson’s “Emerging Technologies Treaty”[ref 205] or Verdirame’s Treaty on Risks to the Future of Humanity,[ref 206] would address many types of potential existential risks simultaneously, including potential risks from AI. 

Other treaty proposals are focused more specifically on regulating AI risk in particular. Dewey discusses a potential “AI development convention” that would set down “a ban or set of strict safety rules for certain kinds of AI development.”[ref 207] Yet others address different types of risks from AI, such as Metzinger’s proposal for a global moratorium on artificial suffering.[ref 208] Carayannis and Draper discuss a “Universal Global Peace Treaty” (UGPT), which would commit states “not to declare or engage in interstate war, especially via existential warfare, i.e., nuclear, biological, chemical, or cyber war, including AI- or ASI-enhanced war.” They would see this treaty supported by a separate Cyberweapons and AI Convention, whose main article would provide that “each State Party to this Convention undertakes never in any circumstances to develop, produce, stockpile or otherwise acquire or retain: (1) cyberweapons, including AI cyberweapons; and (2) AGI or artificial superintelligence weapons.”[ref 209]

Notwithstanding these proposals, there are significant gaps in the scholarship surrounding the design of an international treaty for AI regulation. Issues we believe should be explored include, but are not limited to, the effects of reciprocity on the behavior of state parties, the relationship between a treaty’s specificity and its pervasiveness, the success and adaptability of the framework convention model (a broad treaty plus protocols that specify the initial treaty’s obligations) in accomplishing its goals, and adjudicatory options for conflicts between state parties.

Direction 3: Additional institutional models not covered in detail in this review

There are many other institutional models that this literature review does not address, as they are (currently) rarely proposed in the specific context of international AI governance. These include, but are not limited to:[ref 210]

Accordingly, future lines of research could focus on exploring what such models could look like for AI and what they might contribute to international AI governance.

Direction 4: Compatibility of institutional functions 

There are multiple points of compatibility between the functions of the institutions proposed in the literature explored in this review. Getting a better sense of those areas of compatibility could be advantageous when designing an international institution for AI that borrows the best features from each model rather than copying a single model. Further research could explore hybrid institutions that combine functions from several models. Some potential combinations include, but are not limited to:

Direction 5: Potential fora for an international AI organization

This review has not sought to establish patterns among different proposals’ preferred fora for negotiating or hosting an international AI organization. While we do not expect there to be much commentary on this, it might be a useful additional element to take into consideration when designing an international AI institution. For example, some fora that have been proposed are:

This does not exhaust the available or feasible avenues, however. In many cases, significant additional work will have to be undertaken to evaluate these pathways in detail.

Conclusion

This literature review analyzed seven models for the international governance of AI, discussing common examples of those models, underexplored examples, specific proposals of their application to AI in existing scholarship, and critiques. We found that, while the literature covers a wide range of options for the international governance of AI, most proposals are vague and do not develop the specific attributes an international institution would need in order to garner the benefits and curb the risks associated with AI. We therefore proposed a series of pathways for further research that we expect will contribute to the design of such an institution.


Also in this series

International governance of civilian AI

Defining “frontier AI”

What are legislative and administrative definitions?

Congress usually defines key terms like “frontier AI” in legislation to establish the scope of agency authorization. The agency then implements the law through regulations that more precisely set forth what is regulated, in terms sufficiently concrete to give notice to those subject to the regulation. In doing so, the agency may provide administrative definitions of key terms and offer specific examples or mechanisms.

Who can update these definitions?

Congress can amend legislation and might do so to supersede regulatory or judicial interpretations of the legislation. The agency can amend regulations to update its own definitions and implementation of the legislative definition.

Congress can also expressly authorize an agency to further define a term. For example, the Federal Insecticide, Fungicide, and Rodenticide Act defines “pest” to include any organism “the Administrator declares to be a pest” pursuant to 7 U.S.C. § 136.

What is the process for updating administrative definitions?

For a definition to be legally binding, by default an agency must follow the rulemaking process in the Administrative Procedure Act (APA). Typically, this requires that the agency go through specific notice-and-comment proceedings (informal rulemaking). 

Congress can change the procedures an agency must follow to make rules, for example by dictating the frequency of updates or by authorizing interim final rulemaking, which permits the agency to accept comments after the rule is issued instead of before.

Can a technical standard be incorporated by reference into regulations and statutes?

Yes, but incorporation by reference in regulations is limited. The agency must specify what version of the standard is being incorporated, and regulations cannot dynamically update with a standard. Incorporation by reference in federal regulations is also subject to other requirements. When Congress codifies a standard in a statute, it may incorporate future versions directly, as it did in the Federal Food, Drug, and Cosmetic Act, defining “drug” with reference to the United States Pharmacopoeia. 21 U.S.C. § 321(g). Congress can instead require that an agency use a particular standard. For example, the U.S. Consumer Product Safety Improvement Act effectively adopted ASTM International Standards on toy safety as consumer product safety standards and required the Consumer Product Safety Commission to incorporate future revisions into consumer product safety rules. 15 U.S.C. § 2056b(a) & (g).

How frequently could the definition be updated?

By default, the rulemaking process is time-consuming. While the length of time needed to issue a rule varies, estimates from several agencies range from 6 months to over 4 years; the internal estimate of the average for the Food and Drug Administration (FDA) is 3.5 years and for the Department of Transportation is 1.5 years. Less significant updates, such as minor changes to a definition or list of regulated models, might take less time. However, legislation could impliedly or expressly allow updates to be made in a shorter time frame than permitted by the APA.

An agency may bypass some or all of the notice-and-comment process “for good cause” if to do otherwise would be “impracticable, unnecessary, or contrary to the public interest,” 5 U.S.C. § 553(b)(3)(B), such as to address an urgent national security issue or to prevent widespread disruption of flights. It may also bypass the process if the time required would harm the public or subvert the underlying statutory scheme, such as when an agency relied on the exemption for decades to issue weekly rules on volume restrictions for agricultural commodities because it could not reasonably “predict market and weather conditions more than a month in advance” as the 30-day advance notice would require (Riverbend Farms, 9th Cir. 1992).

Congress can also implicitly or explicitly waive the APA requirements. While the mere existence of a statutory deadline is not sufficient, a stringent deadline that makes compliance with notice-and-comment impracticable might constitute good cause. 

What existing regulatory regimes may offer some guidance?

  1. The Federal Select Agents Program (FSAP) regulates biological agents that threaten public health, maintains a database of such agents, and inspects entities using such agents. FSAP also works with the FBI to evaluate entity-specific security risks. Finally, FSAP investigates incidents of non-compliance. FSAP provides a model for regulating technology as well as labs. The Program has some drawbacks worthy of study, including risks of regulatory capture (entity investigations are often not done by an independent examiner), prioritization issues (high-risk activities are often not prioritized), and resource allocation (entity investigations are often slow and tedious).
  2. The FDA approves generic drugs by comparing their similarity in composition and risk to existing, approved drugs. Generic drug manufacturers attempt to show sufficient similarity to an approved drug so as to warrant a less rigorous review by the FDA. This framework has parallels with a relative, comparative definition of Frontier AI.

What are the potential legal challenges?

  1. Under the major questions doctrine, courts will not accept an agency interpretation of a statute that grants the agency authority over a matter of great “economic or political significance” unless there is a “clear congressional authorization” for the claimed authority. Defining “frontier AI” in certain regulatory contexts could plausibly qualify as a “major question”; an agency definition issued without clear congressional authorization could therefore be challenged under the doctrine.
  2. The regulation could face a challenge under the non-delegation doctrine, which limits Congress’s ability to delegate its legislative power. The doctrine requires that legislation include an “intelligible principle” guiding how the agency is to exercise the delegated authority. In practice, this is a lenient standard; however, some commentators believe that the Supreme Court may strengthen the doctrine in the near future. Legislation that provides more specific guidance regarding policy decisions is less problematic from a non-delegation perspective than legislation that confers a great deal of discretion on the agency and provides little or no guidance on how the agency should exercise it.

LawAI’s thoughts on proposed updates to U.S. federal benefit-cost analysis

This analysis is based on a comment submitted in response to the Request for Comment on proposed Circular A-4, “Regulatory Analysis”.

We support the many important and substantial reforms to the regulation review process in the proposed Circular A-4. The reforms, if adopted, would reduce the odds of regulations imposing undue costs on vulnerable, underrepresented, and disadvantaged communities both now and well into the future. In this piece, we outline a few additional changes that would further reduce those odds: expanding the scope of analysis to include catastrophic and existential risks, including those far in the future; including future generations in distributional analysis; providing more guidance regarding model uncertainty and regulations that involve irreversible outcomes; and lowering the discount rate to zero for irreversible effects in a narrow set of cases or, at a minimum, lowering the discount rate in proportion to the temporal scope of a regulation.

1. Circular A-4 contains many improvements, including consideration of global impacts, expanding the temporal scope of analysis, and recommendations on developing an analytical baseline.

Circular A-4 contains many improvements on the current approach to benefit-cost analysis (BCA). In particular, the proposed reforms would allow for a more comprehensive understanding of the myriad risks posed by any regulation. The guidance for analysis to include global impacts[ref 1] will more accurately account for the effects of a regulation on increasingly interconnected and interdependent economic, political, and environmental systems. Many global externalities, such as pandemics and climate change, require international regulatory cooperation; in these cases, efficient allocation of global resources, which benefits the United States and its citizens and residents, requires all countries to consider global costs and benefits.[ref 2]

The instruction to tailor the time scope of analysis to “encompass all the important benefits and costs likely to result from regulation” will likewise bolster the quality of a risk assessment[ref 3]—though, as mentioned below, a slight modification to this instruction could aid regulators in identifying and mitigating existential risks posed by regulations. 

The recommendations on developing an analytic baseline have the potential to increase the accuracy and comprehensiveness of BCA by ensuring that analysts integrate current and likely technological developments and the resulting harms of those developments into their baseline.[ref 4]

A number of other proposals would also qualify as improvements on the status quo. A litany of commenters have discussed those proposals, so the remainder of this piece is reserved for suggested amendments and topics worthy of additional consideration.

2. The footnote considering catastrophic risks is a welcome addition that could be further strengthened with a minimum time frame of analysis and clear inclusion of catastrophic and existential threats in “important” and “likely” benefits and costs.

The proposed language will lead to a more thorough review of the benefits and costs of a regulation by expanding the time horizon over which those effects are assessed.[ref 5] We particularly welcome the footnote encouraging analysts to consider whether a regulation that involves a catastrophic risk may impose costs on future generations.[ref 6]

We offer two suggestions to further strengthen this footnote’s purpose of encouraging the consideration of catastrophic and existential risks and the long-run effects of related regulation. First, we recommend mandating consideration of the long-run effects of a regulation.[ref 7] Given the economic significance of a regulation that triggers review under Executive Orders 12866 and 13563, as supplemented and reaffirmed by Executive Order 14094, the inevitable long-term impacts deserve consideration—especially because regulations of such size and scope could affect catastrophic and existential risks that imperil future generations. Thus, the Office should consider establishing a minimum time frame of analysis to ensure that long-run benefits and costs are adequately considered, even if they are sometimes found to be negligible or highly uncertain.

Second, the final draft should clarify what constitutes an “important” benefit and cost as well as when those effects will be considered “likely”.[ref 8] We recommend that those concepts clearly encompass potential catastrophic or existential threats, even those that have very low likelihood.[ref 9] An expansive definition of both qualifiers would allow the BCA to provide stakeholders with a more complete picture of the regulation’s short- and long-term impact.

3. Distributional analysis should become the default of regulatory review and include future generations as a group under consideration.

The potential for disparate effects of regulations on vulnerable, underrepresented, and disadvantaged groups merits analysis in all cases. Along with several other commenters, we recommend that distributional analysis become the default in any regulatory review. When possible, we further recommend that such analysis include future generations among the demographic categories.[ref 10] Future generations have no formal representation and will bear the costs imposed by any regulation for longer than other groups.[ref 11]

The Office should also consider making this analysis mandatory, with no exceptions. Such a mandate would reduce the odds of any group unexpectedly bearing a disproportionate and unjust share of the costs of a regulation. The information generated by this analysis would also give groups a more meaningfully informed opportunity to engage in the review of regulations. 

4. Treatment of uncertainty is crucial for evaluating long-term impacts and should include more guidance regarding models, model uncertainty, and regulations that involve irreversible outcomes.

Circular A-4 directs agencies to seek out and respond to several different types of uncertainty from the outset of their analysis.[ref 12] This direction will allow for a more complete understanding of the impacts of a regulation in both the short and long term. Greater direction would accentuate those benefits. 

The current model uncertainty guidance, largely confined to a footnote, nudges agencies to “consider multiple models to establish robustness and reduce model uncertainty.”[ref 13] The brevity of this instruction conflicts with the complexity of this process. Absent more guidance, agencies may be poorly equipped to assess and treat uncertainty, which will frustrate the provision of “useful information to decision makers and the public about the effects and the uncertainties of alternative regulatory actions.”[ref 14] A more participatory, equitable, and robust regulation review process hinges on that information. 

We encourage the Office to provide further examples and guidance on how to prepare models and address model uncertainty, in particular regarding catastrophic and existential risks, as well as significant benefits and costs in the far future.[ref 15] A more robust approach to responding to uncertainty would include explicit instructions on how to identify, evaluate, and report uncertainty regarding the future. Several commenters highlighted that estimates of costs and benefits become more uncertain over time. We echo and amplify concerns that regulations with forecasted effects on future generations will require more rigorous treatment of uncertainty.

We similarly recommend that more guidance be offered with respect to regulations that involve irreversible outcomes, such as exhaustion of resources or extinction of a species.[ref 16] The Circular notes that such regulations may benefit from a “real options” analysis; however, this brief guidance is inadequate given the significance of the topic. The Circular acknowledges that “[t]he costs of shifting the timing of regulatory effects further into the future may be especially high when regulating to protect against irreversible harms.” We agree that preserving option value for future generations is of immense value. How to value those options, how to identify irreversible outcomes, and how to conduct real options analysis all merit more attention in subsequent drafts.

We recommend similar caution for regulations involving harms that are persistent and challenging to reverse, but not irreversible.

5. A lower discount rate and declining discount rate are necessary to account for the impact of regulations with significant and long-term effects on future generations.

The discount rate in a BCA is one signal of how much a society values the future. We join a chorus of commenters in applauding both the overall lowering of the discount rate and the idea of a declining discount rate schedule. 

The diversity of perspectives in those comments, however, indicates that this topic merits further consideration. In particular, we would welcome further discussion on the merits of a zero discount rate. Though sometimes characterized as a blunt tool to attempt to assist future generations,[ref 17] zero discount rates may become necessary when evaluating regulations that involve irreversible harm.[ref 18] In cases involving irreversibility, a fundamental assumption about discounting breaks down—specifically, that the discounted resource has more value in the present because it can be invested and, as a result, generate more resources in subsequent periods.[ref 19] If the regulation involves the elimination of certain resources, such as nonrenewable resources, rather than their preservation or investment, then the value of the resources remains constant across time periods.[ref 20] Several commenters indicated that they share our concern about such harms, suggesting that they would welcome this narrow use case for zero discount rates.[ref 21]
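The arithmetic behind this narrow use case can be made concrete with a minimal sketch (illustrative values only; nothing here is drawn from the Circular): a positive discount rate shrinks a far-future irreversible harm toward zero, while a zero rate holds its value constant.

```python
# Present value of a fixed harm occurring `years` from now,
# discounted at annual rate `rate`: PV = value / (1 + rate)**years.
def present_value(value: float, rate: float, years: int) -> float:
    return value / (1 + rate) ** years

# Hypothetical: a $100M irreversible loss 100 years from now.
harm = 100.0  # millions of dollars

# At a 2% discount rate, the loss nearly disappears from the analysis.
print(round(present_value(harm, 0.02, 100), 1))  # ~13.8

# If the lost resource cannot be invested or regenerated, the premise
# of discounting fails; a zero rate keeps its value constant.
print(present_value(harm, 0.0, 100))  # 100.0
```

On this framing, the zero rate is not a general normative judgment about the welfare of future generations; it simply reflects that the investment rationale for discounting does not apply to a resource that ceases to exist.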

We likewise support the general concept of declining discount rates and further conversations regarding the declining discount rate (DDR) schedule,[ref 22] given the importance of such schedules in accounting for the impact of regulations with significant and long-term effects on future generations.[ref 23] U.S. adoption of a DDR schedule would bring us into alignment with two peers, the UK and France.[ref 24] The UK’s approach deserves particular attention: it is based on the Ramsey formula rather than a fixed DDR schedule, and it estimates the rate of time preference (ρ) as the sum of “pure time preference (δ, delta) and catastrophic risk (L)”,[ref 25] with catastrophic risk defined in the previous Green Book as the “likelihood that there will be some event so devastating that all returns from policies, programmes or projects are eliminated”.[ref 26] This approach to a declining discount schedule demonstrates the sort of risk aversion, attentive to catastrophic and existential risk, that is necessary for regulations that present significant uncertainty.
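As a sketch of the structure referenced above (the decomposition of ρ is as quoted in the text; μ and g are the standard remaining Ramsey terms, supplied here for completeness rather than drawn from the review):

```latex
% Ramsey-style social time preference rate r:
%   rho = pure time preference (delta) + catastrophic risk (L)
%   mu  = elasticity of marginal utility of consumption
%   g   = expected growth rate of per-capita consumption
r = \rho + \mu \cdot g, \qquad \rho = \delta + L
```

Because L enters ρ additively, a higher assessed catastrophic risk raises the discount rate itself—one way a discounting framework can internalize the risk aversion discussed above.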

6. Regulations that relate to irreversible outcomes, catastrophic risk, or existential risk warrant review as being significant under Section 3(f)(1).

In establishing thresholds for which regulations will undergo regulatory analysis, Section 3(f)(1) of Executive Order 12866 includes a number of sufficient criteria in addition to the increased monetary threshold. We note that regulations that might increase or reduce catastrophic or existential risk should be reviewed as having the potential to “adversely affect in a material way the economy, a sector of the economy, productivity, competition, jobs, the environment, public health or safety, or State, local, territorial, or tribal governments or communities.”[ref 27] Even “minor” regulations can have unintended consequences with major ramifications on our institutions, systems, and norms—those that might influence such grave risks are of particular import. For similar reasons, the Office should also review any regulation that has a reasonable chance of causing irreversible harm to future generations.[ref 28]

7. Conclusion

Circular A-4 contains important and substantial reforms to the regulation review process. The reforms, if adopted, would reduce the odds of regulations imposing undue costs on vulnerable, underrepresented, and disadvantaged communities both now and well into the future. A few additional changes would further reduce those odds—specifically, expanding the scope of analysis to include catastrophic and existential risks, including those far in the future; including future generations in distributional analysis; providing more guidance regarding model uncertainty and regulations that involve irreversible outcomes; and lowering the discount rate to zero for irreversible effects in a narrow set of cases or, at a minimum, lowering the discount rate in proportion to the temporal scope of a regulation.