Advanced AI governance
A literature review of problems, options, and proposals
Abstract
As the capabilities of AI systems have continued to improve, the technology’s global stakes have become increasingly clear. In response, an “advanced AI governance” community has come into its own, drawing on diverse bodies of research to analyze the potential problems this technology poses, map the options available for its governance, and articulate and advance concrete policy proposals. However, this field still faces a lack of internal and external clarity over its different research programmes. In response, this literature review provides an updated overview and taxonomy of research in advanced AI governance. After briefly setting out the aims, scope, and limits of this project, this review covers three major lines of work: (I) problem-clarifying research aimed at understanding the challenges advanced AI poses for governance, by mapping the strategic parameters (technical, deployment, governance) around its development and by deriving indirect guidance from history, models, or theory; (II) option-identifying work aimed at understanding affordances for governing these problems, by mapping potential key actors, their levers of governance over AI, and pathways to influence whether or how these are utilized; (III) prescriptive work aimed at identifying priorities and articulating concrete proposals for advanced AI policy, on the basis of certain views of the problem and governance options. The aim is that, by collecting and organizing the existing literature, this review will contribute to greater analytical and strategic clarity, enabling more focused and productive research, public debate, and policymaking on the critical challenges of advanced AI.
Executive Summary
This literature review provides an overview and taxonomy of past and recent research in the emerging field of advanced AI governance.
Aim: The aim of this review is to help disentangle and consolidate the field, improve its accessibility, enable clearer conversations and better evaluations, and contribute to overall strategic clarity or coherence in public and policy debates.
Summary: Accordingly, this review is organized as follows:
The introduction discusses the aims, scope, selection criteria, and limits of this review and provides a brief reading guide.
Part I reviews problem-clarifying work aimed at mapping the parameters of the AI governance challenge, including lines of research to map and understand:
- Key technical parameters constituting the technical characteristics of advanced AI technology and its resulting (sociotechnical) impacts and risks. These include evaluations of the technical landscape of advanced AI (its forms, possible developmental pathways, timelines, trajectories), models for its general social impacts, threat models for potential extreme risks (based on general arguments and direct and indirect threat models), and the profile of the technical alignment problem and its dedicated research field.
- Key deployment parameters constituting the conditions (present and future) of the AI development ecosystem and how these affect the distribution and disposition of the actors that will (first) deploy such systems. These include the size, productivity, and geographic distribution of the AI research field; key AI inputs; and the global AI supply chain.
- Key governance parameters affecting the conditions (present and future) for governance interventions. These include stakeholder perceptions of AI and trust in its developers, the default regulatory landscape affecting AI, prevailing barriers to effective AI governance, and effects of AI systems on the tools of law and governance themselves.
- Other lenses on characterizing the advanced AI governance problem. These include lessons derived from theory, from abstract models and wargames, from historical case studies (of technology development and proliferation, of its societal impacts and societal reactions, of successes and failures in historical attempts to initiate technology governance, and of successes and failures in the efficacy of different governance levers at regulating technology), and lessons derived from ethics and political theory.
Part II reviews option-identifying work aimed at mapping potential affordances and avenues for governance, including lines of research to map and understand:
- Potential key actors shaping advanced AI, including actors such as or within AI labs and companies, the digital AI services and compute hardware supply chains, AI industry and academia, state and governmental actors (including the US, China, the EU, the UK, and other states), standard-setting organizations, international organizations, and public, civil society, and media actors.
- Levers of governance available to each of these actors to shape AI directly or indirectly.
- Pathways to influence over each of these key actors, which (some) other actors may be able to use to inform or shape the key actors’ decisions around whether or how to utilize these levers of governance to improve the governance of advanced AI.
Part III reviews prescriptive work aimed at putting this research into practice in order to improve the governance of advanced AI (for some view of the problem and of the options). This includes lines of research or advocacy to map, articulate, and advance:
- Priorities for policy given theories of change based on some view of the problem and of the options.
- Good heuristics for crafting AI policy. These include general heuristics for good regulation, for (international) institutional design, and for future-proofing governance.
- Concrete policy proposals for the regulation of advanced AI, and the assets or products that can help these be realized and implemented. This includes proposals to regulate advanced AI using existing authorities, laws, or institutions; proposals to establish new policies, laws, or institutions (e.g., temporary or permanent pauses on AI development; the establishment of licensing regimes, lab-level safety practices, or governance regimes on AI inputs; new domestic governance institutions; new international AI research hubs; new bilateral agreements; new multilateral agreements; and new international governance institutions).
Introduction
This document aims to review, structure, and organize existing work in the field of advanced AI governance.
Background: Despite being a fairly young and interdisciplinary field, advanced AI governance offers a wealth of productive work to draw on and is increasingly structured through various research agendas1 and syllabi.2 However, while technical research on the possibility, impacts, and risks of advanced AI has been mapped in various literature reviews and distillations,3 few attempts have been made to comprehensively map and integrate existing research on the governance of advanced AI.4 This document aims to provide an overview and taxonomy of work in this field.
Aims: The aims of this review are several:
- Disentangle and consolidate the field to promote greater clarity and legibility regarding the range of research, connections between different research streams and directions, and open gaps or underexplored questions. Literature reviews can contribute to such a consolidation of academic work;5
- Improve the field’s accessibility and reduce some of its “research debt”6 to help those new to the field understand the existing literature, facilitating a more cohesive and coordinated research field with lower barriers to entry and less duplication of effort;
- Enable clearer conversations between researchers exploring different questions or lines of research, discussing how and where their insights intersect or complement one another;
- Enable better comparison between different approaches and policy proposals; and
- Contribute to greater strategic clarity or coherence,7 improving the quality of interventions, and refining public and policy debates.
Scope: While there are many ways of framing the field, one approach is to define advanced AI governance as:
Advanced AI governance: “the study and shaping of local and global governance systems—including norms, policies, laws, processes, and institutions—that affect the research, development, deployment, and use of existing and future AI systems, in ways that help the world choose the role of advanced AI systems in its future, and navigate the transition to that world.”8
However, the aim of this document is not to engage in restrictive boundary policing of which research is part of this emerging field, let alone the “core” of it. The guiding heuristic here is not whether a given piece of research is directly, explicitly, and exclusively focused on certain “right” problems (e.g., extreme risks from advanced AI), nor whether it is motivated by certain political orientations or normative frameworks, nor even whether it explicitly uses certain terminology (e.g., “Transformative AI,” “AGI,” “General-Purpose AI System,” or “Frontier AI”).9 Rather, the broad heuristic is simply whether the research helps answer a part of the advanced AI governance puzzle.
Accordingly, this review aims to cast a fairly broad net to cover work that meets any of the following criteria:
- Explicitly focuses on the governance of future advanced, potentially transformative AI systems, in particular with regard to their potential significant impacts or extreme risks;
- Focuses on the governance of today’s AI systems, where (at least some of) the authors are interested in the implications of the analysis for the governance of future AI systems;
- Focuses on today’s AI systems, where the original work is (likely) not directly motivated by a concern over (risks from) advanced AI but nonetheless offers lessons that are or could be drawn upon by the advanced AI governance community to inform insights for the governance of advanced AI systems; and
- Focuses on (the impacts or governance of) non-AI technologies or issues (such as historical case studies of technology governance), where the original work is not directly motivated by questions around AI but nonetheless offers lessons that are or could be drawn upon by the advanced AI governance community to inform insights for the governance of advanced AI systems.
Limitations: With this in mind, there are also a range of limitations or shortcomings for this review:
- Preliminary survey: A literature review of this attempted breadth will inevitably fall short of covering all relevant work and sub-literatures in sufficient depth. In particular, given the speed of development in this field, a project like this will inevitably miss key work, so it should not be considered exhaustive. Indeed, because of the breadth of this report, I do not aim to go into the details of each topic, but rather to organize and list sources by topic. Likewise, there is some imbalance: to date there has been more organized (technical) literature on characterizing the problem of advanced AI governance (Part I) than on drafting concrete proposals (Part III). As such, I invite others to produce “spin-offs” of this report which go into the detail of the content for each topic or sub-section in order to produce more in-depth literature reviews.10
- Broad scope: In accordance with the above goal to cast a “broad net,” this review covers both work that is core to and well established in the existing advanced AI governance field, and adjacent work that could be or has been considered by some to be of significant value, even if it has not been as widely recognized yet. It also casts a broad net in terms of the type of sources surveyed, covering peer-reviewed academic articles, reports, books, and more informal digital resources such as web fora.
- Incomplete in scope: By and large, this review focuses on public and published analyses and mostly omits currently in-progress, unpublished, or draft work.11 Given that a significant portion of relevant and key work in this field is unpublished, this means that this review likely will not capture all research directions in this field. Indeed, I estimate that this review captures at best ~70% of the work and research undertaken on many of these questions and subfields, and likely less. I therefore welcome further, focused literature reviews.
- A snapshot: While this review covers a range of work, the field is highly dynamic and fast-moving, which means that this project will become outdated before long. Attempts will be made to update and reissue the report occasionally.
Finally, a few remaining disclaimers: (1) inclusion does not imply endorsement of a given article’s conclusions; (2) this review aims also to highlight promising directions (such as issues or actors) that are not yet discussed in depth in the literature. As such, whenever I list certain issues (e.g., “actors” or “levers”) without sources, this is because I have not yet found (or have missed out on) much work on that issue, suggesting there is a gap in the literature—and room for future work. Overall, this review should be seen as a living document that will be occasionally updated as the field develops. To that end, I welcome feedback, criticism, and suggestions for improvement.
Reading guide: In general, I recommend that rather than aiming to read this from the top, readers instead identify a theme or area of interest and jump to that section. In particular, this review may be most useful to readers (a) that already have a specific research question and want to see what work has been done and how a particular line of work would fit into the larger landscape; (b) that aim to generate or distill syllabi for reading groups or courses; or (c) that aim to explore the broader landscape or build familiarity with fields or lines of research they have not previously explored. All the research presented here is collected from prior work, and I encourage readers to consult and directly cite those original sources named here.
I. Problem-clarifying work: Understanding the AI governance challenge
Most object-level work in the field of advanced AI governance has sought to disambiguate and reduce uncertainties around relevant strategic parameters of the AI governance challenge.12
AI governance strategic parameters can be defined as “features of the world, such as the future AI development trajectory, the prevailing deployment landscape, and applicable policy conditions, which significantly determine the strategic nature of the advanced AI governance challenge.”13
Strategic parameters serve as highly decision-relevant or even crucial considerations, determining which interventions or solutions are appropriate, necessary, viable, or beneficial for addressing the advanced AI governance challenge. Different views of these parameters constitute underlying cruxes for different theories of action and approaches. This review discusses three types of strategic parameters:14
- Technical parameters of the advanced AI challenge (i.e., what are the future technical developments in AI, on what timelines and on what trajectory will progress occur, why or how might such systems pose risks, and how difficult is the alignment challenge);
- Deployment parameters of who is most likely to develop advanced AI systems and how they are likely to develop and use them (i.e., whose development decisions are to be governed); and
- Governance parameters of how, when, and why governance interventions to shape advanced AI development and deployment are most likely to be viable, effective, or productive.
Accordingly, research in this subfield includes:
- Empirical and theoretical work aiming to identify or get better estimates of each of these parameters as they apply to advanced AI (Sections 1, 2, 3).
- Work applying other lenses to the advanced AI governance problem, drawing on other fields (existing theories, models, historical case studies, political and ethical theory) in order to derive crucial insights or actionable lessons (Section 4).
1. Technical parameters
An initial body of work focuses on mapping the relevant technical parameters of the challenge for advanced AI governance. This includes work on a range of topics relating to understanding the future technical landscape, understanding the likelihood of catastrophic risks given various specific threat models, and understanding the profile of the technical alignment problem and the prospects of it being solved by existing technical alignment research agendas.15
1.1. Advanced AI technical landscape
One subfield involves research to chart the future technical landscape of advanced AI systems.16 Work to map this landscape includes research on the future form, pathways, timelines, and trajectories of advanced AI.
Forms of advanced AI
Work exploring distinct potential forms of advanced AI,17 including:
- strong AI,18 autonomous machine intelligence,19 general artificial intelligence,20 human-level AI (HLAI),21 general-purpose AI system (GPAIS),22 comprehensive AI services (CAIS),23 highly capable foundation models,24 artificial general intelligence (AGI),25 robust artificial intelligence,26 AI+,27 (machine/artificial) superintelligence,28 and superhuman general purpose AI,29 amongst others.
Developmental paths towards advanced AI
This work spans a range of domains; in particular, it analyzes different hypothesized pathways towards achieving advanced AI, based on different paradigms or theories.30 Note that many of these are controversial and contested, and there is pervasive disagreement over the feasibility of many (or even all) of these approaches for producing advanced AI.
Nonetheless, some of these paradigms include programs to produce advanced AI based on:
- First principles: Approaches that aim to create advanced AI based on new fundamental insights in computer science, mathematics, algorithms, or software, producing AI systems that may, but need not, mimic human cognition.31
- Direct/Scaling: Approaches that aim to “brute force” advanced AI32 by running (one or more) existing AI approaches with increasingly greater computing power and/or training data to exploit observed “scaling laws” in system performance.33
- Evolutionary: Approaches that aim to create advanced AI based on algorithms that compete to mimic the evolutionary brute search process that produced human intelligence.34
- Reward-based: Approaches that aim to create advanced AI by running reinforcement learning systems with simple rewards in rich environments.35
- Bootstrapping: Approaches that aim to create some minimally intelligent core system capable of subsequent recursive (self)-improvement as a “seed AI.”36
- Neuro-inspired: Various forms of biologically-inspired, brain-inspired, or brain-imitative approaches that aim to draw on neuroscience and/or “connectomics” to reproduce general intelligence.37
- Neuro-emulated: Approaches that aim to digitally simulate or recreate the states of human brains at a fine-grained level, possibly producing whole-brain-emulation.38
- Neuro-integrationist: Approaches that aim to create advanced AI based on merging components of human and digital cognition.
- Embodiment: Approaches that aim to create advanced AI by providing the AI system with a robotic physical “body” to ground cognition and enable it to learn from direct experience of the world.39
- Hybrid: Approaches that rely on combining deep neural network-based approaches to AI with other paradigms (such as symbolic AI).40
Notably, of these approaches, recent years have seen the most sustained attention focused on the direct (scaling) approach and on whether current approaches to advanced AI, if scaled up with enough computing power or training data, will suffice to produce advanced or transformative AI capabilities. There have been various arguments both in favor of and against this direct path (see also the illustrative sketch after the list below).
- Arguments in favor of a direct path: “scaling hypothesis,”41 “prosaic AGI,”42 and “Human feedback on diverse tasks (HFDT)”;43
- Arguments against a direct path, highlighting various limits and barriers: “deep limitations,”44 “the limits of machine intelligence,”45 “why AI is harder than we think,”46 and other skeptical arguments;47
- Discussion of the possible features of “engineering roadmaps” for AGI-like systems.48
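To illustrate what the “scaling laws” invoked by the direct (scaling) path look like in practice, the sketch below evaluates a simple power-law relationship between training compute and model loss. The functional form and all constants are hypothetical placeholders chosen purely for illustration; they are not empirical estimates from the sources cited above.

```python
# Illustrative power-law scaling: loss(C) = A * C**(-ALPHA) + L_FLOOR.
# All constants are hypothetical placeholders, not empirical estimates.
A = 25.0        # scale coefficient (assumed)
ALPHA = 0.05    # scaling exponent (assumed)
L_FLOOR = 1.0   # "irreducible" loss that no amount of compute removes (assumed)

def predicted_loss(training_compute_flop: float) -> float:
    """Predicted loss for a model trained with the given compute budget (in FLOP)."""
    return A * training_compute_flop ** (-ALPHA) + L_FLOOR

# Extrapolating the same curve across several orders of magnitude of compute.
for flop in [1e21, 1e23, 1e25, 1e27]:
    print(f"{flop:.0e} FLOP -> predicted loss {predicted_loss(flop):.2f}")
```

Debates over the direct path largely turn on whether smooth curves of this kind continue to hold at larger scales, and on whether falling loss translates into the capabilities that matter.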
Advanced AI timelines: Approaches and lines of evidence
A core aim of the field is to chart the timelines for advanced AI development across the future technical development landscape.49 This research draws on various lines of evidence,50 listed here in order from more abstract to more concrete and empirical, and from relying more on outside-view to relying more on inside-view arguments,51 with no ranking implied as to the strength of individual lines of evidence.
Outside-view analyses of timelines
Outside-view analyses of AI development timelines, including:
- Estimates based on philosophical arguments and anthropic reasoning:
- Prima facie likelihood that we (of all generations) are the ones to find ourselves living in the “most important” century, one that we can expect to contain things such as transformative technologies.52
- Estimates based on extrapolating historical (growth) trends:
- Insights from endogenous growth theory on AI development dynamics;53
- Likelihood of explosive economic growth occurring this century, for some reason (plausibly technological, plausibly AI54), given analyses of long-run economic history;55
- The accelerating historical rate of development of new technologies56 as well as potential changes over history in the economy’s rate of growth;57
- The historical patterns of barriers to technology development,58 including unexpected barriers or delays in innovation,59 as well as lags in subsequent deployment or diffusion.60
- Estimates based on extrapolating from historical trends in efforts dedicated to creating advanced AI:
- External “semi-informative priors” (i.e., using only basic information about how long people have attempted to build advanced, transformative AI and what resources they have used, and comparing this to how long other comparable research fields have taken to achieve their goals given certain levels of funding and effort);
- Arguments extrapolating from “significantly increased near-future investments in AI progress” given that (comparatively) moderate past investments already yielded significant progress.
- Estimates based on meta-induction from the track record of past predictions:
- The general historical track record of past technological predictions, especially those made by futurists63 as well as those made in professional long-range forecasting exercises,64 to understand the frequency of over- or underconfidence and of periods of excessive optimism (hype) or excessive pessimism (counterhype);65
- The specific historical track record of past predictions around AI development66 and the frequency of past periods’ excessive optimism (hype) or excessive pessimism (counterhype or “underclaiming”67).68
Judgment-based analyses of timelines
Judgment-based analyses of timelines, including:
- Estimates based on (specialist) expert opinions:
- Expert opinion surveys of anticipated rates of progress;69
- Expert elicitation techniques (e.g., Delphi method).70
- Estimates based on (generalist) information aggregation mechanisms (financial markets, forecaster prediction markets):71
- Forecasters’ predictions of further AI progress on prediction platforms72 or forecasting competitions;73
- Current financial markets’ real interest rates, which, assuming the efficient market hypothesis, suggest that markets reject short timelines.74
Inside-view models on AI timelines
Inside-view, model-based analyses of timelines, including:
- Estimates based on first-principles estimates of the minimum resources (compute, investment) required for a “transformative” AI system, compared against projected trends in those resources (see the illustrative sketch after this list):
- The “biological anchors” approach:75 comparing projected trends in the falling cost of training AI models against the estimated minimum amount of computation needed to train an AI model as large as the human brain;76
- The “direct approach”:77 Analysis of empirical neural scaling laws in current AI systems to upper bound the compute needed to train a transformative model. To estimate when such a system might be developed, this analysis can be combined with estimates of future investment in model training, hardware price-performance, and algorithmic progress78 as well as with potential barriers in the (future) availability of the data and compute needed to train these models.79
- Estimates based on direct evaluation of outputs (progress in AI systems’ capabilities):
- Debates over the significance and implications of specific ongoing AI breakthroughs for further development;80
- Operationalizing and measuring the generality of existing AI systems.81
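As a minimal sketch of the trend-extrapolation arithmetic underlying such compute-anchored estimates, the snippet below projects a hypothetical growth rate in frontier training compute and reports when the projection first crosses a hypothetical “transformative” threshold. The starting point, growth rates, and threshold are illustrative assumptions, not figures taken from the literature reviewed here.

```python
import math

# All numbers below are illustrative assumptions, not estimates from the literature.
current_frontier_flop = 1e25          # assumed compute of today's largest training run
transformative_threshold_flop = 1e30  # assumed compute needed for a "transformative" model

# threshold = current * growth**years  =>  years = log(threshold / current) / log(growth)
for annual_growth_factor in [2.0, 4.0, 8.0]:
    years = math.log(transformative_threshold_flop / current_frontier_flop) / math.log(annual_growth_factor)
    print(f"growth x{annual_growth_factor:.0f}/year -> threshold crossed in ~{years:.1f} years")
```

As the loop makes explicit, the bottom-line estimate is highly sensitive to the assumed growth rate, which is why the cited analyses devote most of their effort to estimating that trend and the threshold itself.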
Methodological debates on AI-timelines analysis
Various methodological debates around AI-timelines analysis:
- On the potential pitfalls in many of the common methods (forecasting methods,82 extrapolation, expert predictions83) in forecasting AI;
- On the risk of misinterpreting forecasters’ predictions when the underlying questions are poorly operationalized;84
- On the risk of deference cycles in debates over AI timelines,85 in which the opinions and analyses of a small number of people end up tacitly informing the evaluations of a wide range of others in ways that create the impression of many people independently reaching similar conclusions;86
- On the (potentially) limited utility of further discourse over and research into AGI timelines: arguments that all low-hanging fruit may already have been plucked87 and counterarguments that specific timelines remain relevant to prioritizing strategies.88
Advanced AI trajectories and early warning signals
A further technical subfield aims to chart the trajectories of advanced AI development, especially the potential for rapid and sudden capability gains, and whether there will be advance warning signs:
- Exploring likely AGI “takeoff speeds”:89
- From first principles: arguments in favor of “fast takeoff”90 vs. arguments for slow(er), more continuous development;91
- By analogy: exploring historical precedents for sudden disjunctive leaps in technological capabilities.92
- Mapping the epistemic texture of the AI development trajectory in terms of possible advance warning signs of capability breakthroughs93 or the lack of any such fire alarms.94
1.2. Impact models for general social impacts from advanced AI
Various significant societal impacts that could result from advanced AI systems:95
- Potential for advanced AI systems to drive significant, even “explosive” economic growth96 but also risks of significant inequality or corrosive effects on political discourse;97
- Significant impacts on scientific progress and innovation;98
- Significant impacts on democracy;99
- Lock-in of harmful socio-political structures as a result of the increasing role of centralization and optimization;100
- Impacts on geopolitics and international stability.101
This is an extensive field that spans a wide range of work, and the above is by no means exhaustive.
1.3. Threat models for extreme risks from advanced AI
A second subcluster of work focuses on understanding the threat models of advanced AI risk,102 based on general (indirect) arguments for risk, on specific threat models for direct catastrophe or takeover,103 or on specific threat models for indirect risks.104
General arguments for risks from AI
Analyses that aim to explore general arguments (by analogy, on the basis of conceptual argument, or on the basis of empirical evidence from existing AI systems) over whether or why we might have grounds to be concerned about advanced AI.105
Analogical arguments for risks
Analogies106 with historical cases or phenomena in other domains:
- Historical cases of intelligence enabling control, such as the emergence of human dominion over the natural world: the “second species argument”107 and “the human precedent as indirect evidence of danger”;108
- Historical cases where actors were able to achieve large shifts in power despite only wielding relatively minor technological advantages: conquistadors;109
- Historical cases of “lock-in” of suboptimal or bad societal trajectories based on earlier choices and exacerbated by various mechanisms for lock-in: climate change, the agricultural revolution, and colonial projects.110
Analogies with known “control problems” observed in other domains:
- Analogies with economics principal-agent problems;111
- Analogies with constitutional law “incomplete contracting” theorems;112 in particular, the difficulty of specifying adequate legal responses to all situations or behaviors in advance, because concrete rules are hard to draft for every situation (or in ways that cannot be gamed), whereas vague standards (such as the “reasonable person” test) may rely on intuitions that are widely shared but difficult to specify, and need to be adjudicated ex post;113
- Analogies to economic systems114 and to bureaucratic systems and markets, and their accordant failure modes and externalities;115
- Analogies to “Goodhart’s Law,” where a proxy target metric is used to improve a system so far that further optimization becomes ineffective or harmful;116
- Analogies to the “political control problem”—the problem of the alignment and control of powerful social entities (corporations, militaries, political parties) with (the interests of) their societies, a problem that remains somewhat unsolved, with societal solutions relying on patchwork and fallible responses that cannot always prevent misalignment (e.g., corporate malfeasance, military coups, or unaccountable political corruption);117
- Analogies with animal behavior, such as cases of animals responding to incentives in ways that demonstrate specification gaming;118
- Illustration with thought experiments and well-established narrative tropes: “sorcerer’s apprentice,”119 “King Midas problem,”120 and “paperclip maximizer.”121
Conceptual arguments for risks
Conceptual and theoretical arguments based on existing ML architectures:
- Arguments based on the workings of modern deep learning systems.122
Conceptual and theoretical arguments based on the competitive environment that will shape the evolutionary development of AIs:
- Arguments suggesting that competitive pressures amongst AI developers may lead the most successful AI agents to likely have (or be given) undesirable traits, which creates risks.123
Empirical evidence for risks
Empirical evidence of unsolved alignment failures in existing ML systems, which are expected to persist or scale in more advanced AI systems:124
- “Faulty reward functions in the wild,”125 “specification gaming,”126 and reward model overoptimization;127
- “Instrumental convergence,”128 goal misgeneralization, and “inner misalignment” in reinforcement learning;129
- Language model misalignment130 and other unsolved safety problems in modern ML,131 and the harms from increasingly agentic algorithmic systems.132
Empirical examples of elements of AI threat models that have already occurred in other domains or with simpler AI systems:
- Situational awareness: cases where a large language model displays awareness that it is a model and can recognize whether it is currently in testing or deployment;133
- Acquisition of a goal to harm society: cases of AI systems being given the outright goal of harming humanity (ChaosGPT);
- Acquisition of goals to seek power and control: cases where AI systems converge on optimal policies of seeking power over their environment;134
- Self-improvement: examples of cases where AI systems improve AI systems;135
- Autonomous replication: the ability of simple software to autonomously spread around the internet in spite of countermeasures (various software worms and computer viruses);136
- Anonymous resource acquisition: the demonstrated ability of anonymous actors to accumulate resources online (e.g., Satoshi Nakamoto as an anonymous crypto billionaire);137
- Deception: cases of AI systems deceiving humans to carry out tasks or meet goals.138
Direct threat models for direct catastrophe from AI
Work focused on understanding direct existential threat models.139 This includes:
- Various overviews and taxonomies of different accounts of AI risk: Barrett & Baum’s “model of pathways to risk,”140 Clarke et al.’s Modelling Transformative AI Risks (MTAIR),141 Clarke & Martin on “Distinguishing AI Takeover Scenarios,”142 Clarke & Martin’s “Investigating AI Takeover Scenarios,”143 Clarke’s “Classifying Sources of AI X-Risk,”144 Vold & Harris “How Does Artificial Intelligence Pose an Existential Risk?,”145 Ngo “Disentangling Arguments for the Importance of AI Safety,”146 Grace’s overview of arguments for existential risk from AI,147 Nanda’s “threat models,”148 and Kenton et al.;149
- Analysis of potential dangerous capabilities that may be developed by general-purpose AI models, such as cyber-offense, deception, persuasion and manipulation, political strategy, weapons acquisition, long-horizon planning, AI development, situational awareness, and self-proliferation.150
Scenarios for direct catastrophe caused by AI
Other lines of work move beyond indirect arguments for risk to sketch specific scenarios in and through which advanced AI systems could directly inflict existential catastrophe.
Scenario: Existential disaster because of misaligned superintelligence or power-seeking AI
- Older accounts, including by Yudkowsky,151 Bostrom,152 Sotala,153 Sotala and Yampolskiy,154 and Alexander;155
- Newer accounts, such as Cotra & Karnofsky’s “AI takeover analysis,”156 Christiano’s account of “What Failure Looks Like,”157 Carlsmith on existential risks from power-seeking AI,158 Ngo on “AGI Safety From First Principles,”159 and “Minimal accounts” of AI takeover scenarios;160
- Skeptical accounts: various recent critiques of AI takeover scenarios.161
Scenario: Gradual, irretrievable ceding of human power over the future to AI systems
- Christiano’s account of “What Failure Looks Like” (Part 1).162
Scenario: Extreme “suffering risks” because of a misaligned system
- Various accounts of “worst-case AI safety”;163
- Potential for a “suffering explosion” experienced by AI systems.164
Scenario: Existential disaster because of conflict between AI systems and multi-system interactions
- Disasters because of “cooperation failure”165 or “multipolar failure.”166
Scenario: Dystopian trajectory lock-in because of misuse of advanced AI to establish and/or maintain totalitarian regimes
- Use of advanced AI to establish robust totalitarianism;167
- Use of advanced AI to establish lock-in of future values.168
Scenario: Failures in or misuse of intermediary (non-AGI) AI systems, resulting in catastrophe
- Deployment of “prepotent” AI systems that are non-general but capable of outperforming human collective efforts on various key dimensions;169
- Militarization of AI enabling mass attacks using swarms of lethal autonomous weapons systems;170
- Military use of AI leading to (intentional or unintentional) nuclear escalation, either because machine learning systems are directly integrated in nuclear command and control systems in ways that result in escalation171 or because conventional AI-enabled systems (e.g., autonomous ships) are deployed in ways that result in provocation and escalation;172
- Nuclear arsenals serving as an “overhang” for advanced AI systems;173
- Use of AI to accelerate research into catastrophically dangerous weapons (e.g., bioweapons);174
- Use of AI to lower the threshold of access to dual-use biotechnology, creating risks of actors misusing it to create bioweapons.175
Other work: vignettes, surveys, methodologies, historiography, critiques
- Work to sketch vignettes reflecting on potential threat models:
- AI Impacts’ AI Vignettes project;176
- FLI Worldbuilding competition;177
- Wargaming exercises;178
- Other vignettes or risk scenarios.179
- Surveys of how researchers rate the relative probability of different existential risk scenarios from AI;180
- Developing methodologies for anticipating future AI developments and identifying risks,181 such as red-teaming,182 wargaming exercises,183 and participatory technology assessment,184 as well as established risk identification techniques (scenario analysis, fishbone method, and risk typologies and taxonomies), risk analysis techniques (causal mapping, Delphi technique, cross-impact analysis, bow tie analysis, and system-theoretic process analysis), and risk evaluation techniques (checklists and risk matrices);185
- Historiographic accounts of changes in AI risk arguments and debates over time:
- General history of concerns around AI risk (1950s–present);186
- Early history of the rationalist and AI risk communities (1990s–2010);187
- Recent shifts in arguments (e.g., 2014–present);188
- Development and emergence of AI risk “epistemic community.”189
- Critical investigations of and counterarguments to the case for extreme AI risks, including object-level critiques of the arguments for risk190 as well as epistemic arguments, arguments about community dynamics, and argument selection effects.191
Threat models for indirect AI contributions to existential risk factors
Work focused on understanding indirect ways in which AI could contribute to existential threats, such as by shaping societal “turbulence”192 and other existential risk factors.193 This covers various long-term impacts on societal parameters such as science, cooperation, power, epistemics, and values:194
- Destabilizing political impacts from AI systems in areas such as domestic politics (e.g., polarization, legitimacy of elections), international political economy, or international security195 in terms of the balance of power, technology races and international stability, and the speed and character of war;
- Hazardous malicious uses;196
- Impacts on “epistemic security” and the information environment;197
- Erosion of international law and global governance architectures;198
- Other diffuse societal harms.199
1.4. Profile of technical alignment problem
- Work mapping different geographical or institutional hubs active on AI alignment: overview of the AI safety community and problem,200 and databases of active research institutions201 and of research;202
- Work mapping current technical alignment approaches;203
- Work aiming to assess the (relative) efficacy or promise of different approaches to alignment, insofar as possible:204 Cotra,205 Soares,206 and Leike.207
- Mapping the relative contributions to technical AI safety by different communities208 and the chance that AI safety problems get “solved by default”;209
- Work mapping other features of AI safety research, such as the need for minimally sufficient access to AI models under API-based “structured access” arrangements.210
2. Deployment parameters
Another major part of the field aims to understand the parameters of the advanced AI deployment landscape by mapping the size and configuration of the “game board” of relevant advanced AI developers—the actors whose key decisions (e.g., around whether or how to deploy particular advanced AI systems, or how much to invest in alignment research), and whose ability to take them, may be decisive in determining risks and outcomes from advanced AI.
As such, there is significant work on mapping the disposition of the AI development ecosystem and how this will determine who is (or will likely be) in the position to develop and deploy the most advanced AI systems. Some work in this space focuses on mapping the current state of these deployment parameters; other work focuses on the likely future trajectories of these deployment parameters over time.
2.1. Size, productivity, and geographic distribution of the AI research field
- Mapping the current size, activity, and productivity of the AI research field;211
- Mapping the global geographic distribution of active AGI programs,212 including across key players such as the US or China.213
2.2. Geographic distribution of key inputs in AI development
- Mapping the current distribution of relevant inputs in AI development, such as the distribution of computation,214 semiconductor manufacturing,215 AI talent,216 open-source machine learning software,217 etc.
- Mapping and forecasting trends in relevant inputs for AI,218 such as:
- Trends in compute inputs scaling219 and in the training costs and GPU price-performance of machine learning systems over time220 (see the illustrative sketch after this list);
- Trends in dataset scaling and potential ceilings;221
- Trends in algorithmic progress, including their effect on the ability to leverage other inputs, e.g., the relative importance of CPUs versus specialized hardware;222
- Mapping and forecasting trends in input criticality for AI, such as trends in data efficiency223 and the degree to which data becomes the operative constraint on language model performance.224
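As a small illustration of how such input trends are quantified, the sketch below derives an implied doubling time from two hypothetical observations of frontier training compute. Both data points are placeholders for illustration, not measurements from the cited trend analyses.

```python
import math
from datetime import date

# Two hypothetical (date, training FLOP) observations; placeholders, not real measurements.
obs_early = (date(2020, 6, 1), 3e23)
obs_late = (date(2023, 6, 1), 2e25)

years_elapsed = (obs_late[0] - obs_early[0]).days / 365.25
growth_factor = obs_late[1] / obs_early[1]
doubling_time_years = years_elapsed * math.log(2) / math.log(growth_factor)

print(f"Implied doubling time: ~{doubling_time_years:.2f} years")
```

The cited work applies this kind of fit to much larger datasets of training runs, hardware prices, and dataset sizes in order to estimate how quickly each input is scaling.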
2.3. Organization of global AI supply chain
- Mapping the current shape of the AI supply chain;225
- Mapping and forecasting dominant actors in the future AI ecosystem, in terms of:
- different actors’ control of and access to key inputs and/or chokepoints;226
- future shape of the AI supply chain (e.g., level of integration and monopoly structure);227
- shape of AI deployment landscape (e.g., dominance of key operators of generative models vs. copycat models).
2.4. Dispositions and values of advanced AI developers
- Anticipating the likely behavior or attitude of key advanced AI actors with regard to their caution about and investment in safety research, such as expecting AI companies to “race forward” and dedicate “naive safety effort.”228
2.5. Developments in converging technologies
- Mapping converging developments in adjacent, potentially intersecting or relevant technologies, such as cryptography,229 nanotechnology,230 and others.
3. Governance parameters
Work on governance parameters aims to map (1) how AI systems are currently being governed, (2) how they are likely to be governed by default (given prevailing perceptions and regulatory initiatives), as well as (3) the conditions for developing and implementing productive governance interventions on advanced AI risk.
Some work in this space focuses on mapping the current state of these governance parameters and how they affect AI governance efforts initiated today. Other work focuses on the likely future trajectories of these governance parameters.
3.1. Stakeholder perceptions of AI
Surveys of current perceptions of AI among different relevant actors:
- Public perceptions of the future of AI,231 of AI’s societal impacts,232 of the need for caution and/or regulation of AI,233 and of the rights or standing of AI entities;234
- Policymaker perceptions of AI235 and the prominence of different memes, rhetorical frames, or narratives around AI;236
- Expert views on best practices in AGI lab safety and governance.237
Predicting future shifts in perceptions of AI among relevant actors given:
- The spread of ongoing academic conversations concerned about advanced AI risk;238
- The effects of “warning shots,”239 or other “risk awareness moments”;240
- The effect of motivated misinformation or politicized AI risk skepticism.241
3.2. Stakeholder trust in AI developers
- Public trust in different actors to responsibly develop AI;242
- AI-practitioner trust in different actors to responsibly develop AI243 and Chinese AI researchers’ views on the development of “strong AI.”244
3.3. Default landscape of regulations applied to AI
This work maps the prevailing (i.e., default, “business-as-usual”) landscape of regulations that will be applied to AI in the near term. These matter as they will directly affect the development landscape for advanced AI and indirectly bracket the space for any new (AI-specific) governance proposals.245 This work includes:
- Existing industry norms and practices applied to AI in areas such as release practices around generative AI systems;246
- General existing laws and governance regimes which may be extended to or affect AI development, such as competition (antitrust) law;247 national and international standards;248 international law norms, treaties, and regimes;249 and existing global governance institutions.250
- AI-specific governance regimes currently under development, such as:
- EU: the EU AI Act251 and the AI Liability Directive,252 amongst others;
- US: the US AI policy agenda,253 such as various federal legislative proposals relating to generative AI,254 or President Biden’s executive order,255 amongst others;
- International: such as the 2019 OECD AI Principles (nonbinding);256 the 2021 UNESCO Recommendation on the Ethics of Artificial Intelligence (nonbinding);257 the 2023 G7 Hiroshima guidelines (nonbinding);258 and the Council of Europe’s draft (framework) Convention on Artificial Intelligence, Human Rights, Democracy and the Rule of Law (potentially binding),259 amongst others.
3.4. Prevailing barriers to effective AI governance
- Definitional complexities of AI as a target for regulation;260
- Potential difficulties around building global consensus given geopolitical stakes and tensions;261
- Potential difficulty around building civil society consensus given outstanding disagreements and tensions between different expert communities;262
- Potential challenges around cultivating sufficient state capacity to effectively implement and enforce AI legislation.263
3.5. Effects of AI systems on tools of governance
Predicting the impact of future technologies on governance, and the ways these could shift the possibility frontier of which kinds of regimes will be politically viable and enforceable:
- Effects of AI on general cooperative capabilities;264
- Effects of AI on international law creation and enforcement;265
- Effects of AI on arms control monitoring.266
4. Other lenses on the advanced AI governance problem
Other work aims to derive key strategic lessons for advanced AI governance, not by aiming to empirically map or estimate first-order facts about the key (technical, deployment, or governance) strategic parameters, but rather by drawing indirect (empirical, strategic, and/or normative) lessons from abstract models, historical cases, and/or political theory.
4.1. Lessons derived from theory
Work characterizing the features of advanced AI technology and of its governance challenge, drawing on existing literatures or bodies of theory:
Mapping clusters and taxonomies of AI’s governance problems:
- AI creating distinct types of risk deriving from (1) accidents, (2) misuse, and (3) structure;267
- AI creating distinct problem logics across domains: (1) ethical challenges, (2) safety risks, (3) security threats, (4) structural shifts, (5) common goods, and (6) governance disruption;268
- AI driving four risk clusters: (1) inequality, turbulence, and authoritarianism; (2) great-power war; (3) the problems of control, alignment, and political order; and (4) value erosion from competition.269
Mapping the political features of advanced AI technology:
- AI as general-purpose technology, highlighting radical impacts on economic growth, disruption to existing socio-political relations, and potential for backlash and social conflict;270
- AI as industry-configured general-purpose tech (low fixed costs and private sector dominance), highlighting challenges of rapid proliferation (compared to “prestige,” “public,” or “strategic” technologies);271
- AI as information technology, highlighting challenges of increasing returns to scale driving greater income inequality, impacts on broad collective identities as well as community fragmentation, and increased centralization of (cybernetic) control;272
- AI as intelligence technology, highlighting challenges of bias, alignment, and control of the principal over the agent;273
- AI as regulation-resistant technology, rendering coordinated global regulation difficult.274
Mapping the structural features of the advanced AI governance challenge:
- In terms of its intrinsic coordination challenges: as a global public good,275 as a collective action problem,276 and as a matter of “existential security”;277
- In terms of its difficulty of successful resolution: as a wicked problem278 and as a challenge akin to “racing through a minefield”;279
- In terms of its strategic dynamics: as a technology race,280 whether motivated by security concerns or by prestige motivations,281 or as an arms race282 (but see also critiques of the arms race framing on definitional grounds,283 on empirical grounds,284 and on grounds of rhetorical or framing risks285);
- In terms of its politics and power dynamics: as a political economy problem.286
Identifying design considerations for international institutions and regimes, from:
- General theory on the rational design of international institutions;287
- Theoretical work on the orchestration and organization of regime complexes of many institutions, norms, conventions, etc.288
4.2. Lessons derived from models and wargames
Work to derive or construct abstract models for AI governance in order to gather lessons from these for understanding AI systems’ proliferation and societal impacts. This includes models of:
- International strategic dynamics in risky technology races,289 and theoretical models of the role of information sharing,290 agreement, or incentive modeling;291
- AI competition and whether and how AI safety insights will be applied under different AI safety-performance tradeoffs,292 including collaboration on safety as a social dilemma293 and models of how compute pricing factors affect agents’ spending on safety (“safety tax”) meant to reduce the danger from the new technology294 (see the illustrative sketch at the end of this list);
- The offense-defense balance of increasing investments in technologies;295
- The offense-defense balance of scientific knowledge in AI with potential for misuse;296
- Lessons from the “epistemic communities” lens, on how coordinated expert networks can shape policy;297
- Lessons from wargames and role-playing exercises.298
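As a toy illustration of the safety-performance tradeoff models referenced above (see the item on AI competition), the sketch below implements a two-lab race in which each lab chooses how much effort to divert to safety: diverting effort lowers its chance of deploying first but raises the chance that the winning system is safe. The payoff structure and all parameter values are illustrative assumptions, not a reproduction of any specific model from the cited literature.

```python
# Toy two-lab race model; all functional forms and numbers are illustrative assumptions.
SAFETY_LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]  # fraction of effort diverted to safety

def win_probability(own_safety: float, rival_safety: float) -> float:
    """Chance of deploying first: effort not spent on safety goes to capabilities."""
    own_cap, rival_cap = 1.0 - own_safety, 1.0 - rival_safety
    total = own_cap + rival_cap
    return 0.5 if total == 0 else own_cap / total

def expected_payoff(own_safety: float, rival_safety: float,
                    prize: float = 1.0, disaster_cost: float = 5.0) -> float:
    """Winner takes the prize; a disaster (probability 1 - winner's safety) hurts everyone."""
    p_win = win_probability(own_safety, rival_safety)
    p_disaster = p_win * (1 - own_safety) + (1 - p_win) * (1 - rival_safety)
    return p_win * prize - p_disaster * disaster_cost

# How does a lab's best response shift with the rival's safety choice?
for rival in SAFETY_LEVELS:
    best = max(SAFETY_LEVELS, key=lambda s: expected_payoff(s, rival))
    print(f"rival safety {rival:.2f} -> best-response safety {best:.2f}")
```

Models of this kind are used in the cited work to probe when competitive pressure erodes safety investment, for example by varying the number of competitors, the perceived disaster cost, or the information each lab has about its rivals.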
4.3. Lessons derived from history
Work to identify and study relevant historical precedents, analogies, or cases and to derive lessons for (AI) governance.299 This includes studies where historical cases have been directly applied to advanced AI governance as well as studies where the link has not been drawn but which might nevertheless offer productive insights for the governance of advanced AI.
Lessons from the history of technology development and spread
Historical cases that (potentially) provide insights into when, why, and how new technologies are pursued and developed—and how they subsequently (fail to) spread.
Historical rationales for technology pursuit and development
Historical rationales for actors pursuing large-scale scientific or technology development programs:
- Development of major transformative technologies during wartime: US development of the atom bomb;300
- Pursuit of strategically valuable megaprojects: the Apollo Program and the Manhattan Project;301
- Technologies pursued for prestige reasons: Ming Dynasty treasure fleets,302 the US/USSR space race,303 and the French nuclear weapons program;304
- Risk of races being started by possibly incorrect perceptions that a rival is actively pursuing a technology: the Manhattan Project (1939–1945), spurred by the Einstein Letter; the “missile gap” project to build up a US ICBM capability (1957–1962).305
Historical strategies of deliberate large-scale technology development projects
Historical strategies for unilateral large-scale technology project development:
- Crash recruitment and resource allocation for a large strategic program: “Operation Paperclip,” the post-WWII effort to recruit 1,600 German scientists and engineers, fast-tracking the US space program as well as several programs aimed at other Cold War weapons of mass destruction;306
- Different potential strategies for pursuing advanced strategic technologies: the distinct nuclear proliferation strategies (“hedging, sprinting, sheltered pursuit, hiding”) taken by different countries in pursuing nuclear weapons;307
- Government-industry collaborations to boost development of strategic technologies: the 1980s SEMATECH collaborative research consortium to boost the US semiconductor industry;308
- Nations achieving early and sustained unilateral leads in developing key strategic technologies: the US program to develop stealth aircraft;309
- Surprisingly rapid leaps from the political decision to run a big technology program to the achievement: Apollo 8 (134 days between the NASA decision to go to the moon and launch),310 the UAE’s “Hope” Mars mission (the UAE set up its space agency UAESA in 2014, designed its own satellite (KhalifaSat) only in 2018, and launched the “Hope” Mars mission in July 2020, less than six years after the agency’s establishment),311 and various other examples including BankAmericard (90 days), P-80 Shooting Star (first USAF jet fighter) (143 days), Marinship (197 days), The Spirit of St. Louis (60 days), the Eiffel Tower (2 years and 2 months), Treasure Island, San Francisco (~2 years), the Alaska Highway (234 days), Disneyland (366 days), the Empire State Building (410 days), Tegel Airport and the Berlin Airlift (92 days),312 the Pentagon (491 days), Boeing 747 (930 days), the New York Subway (4.7 years), TGV (1,975 days), USS Nautilus (first nuclear submarine) (1,173 days), JavaScript (10 days), Unix (21 days), Xerox Alto (first GUI-oriented computer) (4 months), iPod (290 days), Amazon Prime (6 weeks), Git (17 days), and COVID-19 vaccines (3-45 days).313
Historical strategies for joint or collaborative large-scale technology development:
- International “big science” collaborations: CERN, ITER, International Space Station, Human Genome Project,314 and attempted collaborations on Apollo-Soyuz between the US and Soviet space programs.315
Historical instances of sudden, unexpected technological breakthroughs
Historical cases of rapid, historically discontinuous breakthroughs in technological performance on key metrics:
- “Large robust discontinuities” in historical technology performance trends:316
- the Pyramid of Djoser (2650 BC—structure height trends);
- the SS Great Eastern (1858—ship size trends);
- the first and second telegraphs (1858, 1866—speed of sending a message across the Atlantic Ocean);
- the first nonstop transatlantic flight (1919—speed of passenger or military payload travel);
- first nuclear weapons (1945—relative effectiveness of explosives);
- first ICBM (1958—average speed of military payload);
- the discovery of YBa2Cu3O7 as a superconductor (1987—warmest temperature of superconductivity).317
- “Bolt-from-the-blue” technology breakthroughs that were held to be unlikely or impossible even shortly before they happened: Invention of flight;318 of penicillin, nuclear fission, nuclear bombs, or space flight;319 of internet hyperlinks and effective internet search.320
Historical patterns in technological proliferation and take-up
Historical cases of technological proliferation and take-up:321
- Patterns in the development, dissemination and impacts of major technological advancements: flight, the telegraph, nuclear weapons, the laser, penicillin, the transistor, and others;322
- Proliferation and penetration rates of other technologies in terms of time between invention and widespread use: steam engine (80 years), electricity (40 years), IT (20 years),323 and mobile phones;
- Role of state “diffusion capacity” in supporting the diffusion or wide adoption of new innovations: the US in the Second Industrial Revolution and the Soviet Union in the early postwar period;324
- Role of espionage in facilitating critical technology diffusion: early nuclear proliferation325 and numerous information leaks in modern IT systems;326
- Constrained proliferation of technological insights (even under compromised information security conditions): the surprisingly limited track record of bioweapon proliferation: the American, Soviet, Iraqi, South African, and Aum Shinrikyo bioweapon programs all ran into a range of problems, such that the programs failed, if not totally, then at least in making effective steps towards weaponization. This suggests that tacit knowledge and organizational conditions can be severely limiting and prevent proliferation even when some techniques are available in the public scientific literature.327 Likewise, China had only limited success (1991–2018) in re-engineering US fifth-generation stealth fighters in spite of extensive espionage that included access to blueprints, recruitment of former engineers, and even access to the wreck of an F-117 aircraft that had crashed in Serbia;328
- Various factors contributing to technological delay or restraint, with many examples of technologies (weapon systems, nuclear power, geoengineering, and genetically modified (GM) crops) being slowed, abandoned, or having their uptake inhibited as a result of (indirect) regulation, public opposition, and historical contingency;329
- Supply chain evolution of previous general-purpose technologies: studies of railroads, electricity, and cloud computing industries, where supply chains were initially vertically integrated but then evolved into a fully disintegrated natural monopoly structure with a handful of primary “upstream” firms selling services to many “downstream” application sectors.330
Lessons from the historical societal impacts of new technologies
Historical cases that (potentially) provide insights into when, why, and how new technologies can have (unusually) significant societal impacts or pose acute risks.
Historical cases of large-scale societal impacts from new technologies
Historical cases of large-scale societal impacts from new technologies:331
- Impacts of previous narrowly transformative technologies: the impact of nuclear weapons on warfare, and the electrification of militaries as a driver of “general-purpose military transformation”;332
- Impacts of previous general-purpose technologies: general electrification,333 printing, steam engines, rail transport, motor vehicles, aviation, and computing;334
- Impacts of previous “revolutionary” or “radically transformative”335 technologies: domesticated crops and the steam engine;336
- Impacts of previous information technologies: speech and culture, writing, and the printing press; digital services; and communications technologies;337
- Impacts of previous intelligence technologies: price mechanisms in a free market, language, bureaucracy, peer review in science, and evolved institutions like the justice system and law;338
- Impacts of previous labor-substitution technologies as they compare to the possible societal impacts of large language models.339
Historical cases of particular dangers or risks from new technologies
Historical precedents for particular types of dangers or threat models from technologies:
- Human-machine interface risks and failures around complex technologies: various “normal accidents” in diverse industries and domains, most notably nuclear power;340
- Technology misuse risks: the proliferation of easily available hacking tools, such as the “Blackshades Remote Access Tool,”341 but see also the counterexample of non-use of an (apparent) decisive strategic advantage: the brief US nuclear monopoly;342
- Technological “structural risks”: the role of technologies in lowering the threshold for war initiation such as the alleged role of railways in inducing swift, all-or-none military mobilization schedules and precipitating escalation to World War I.343
Historical cases of value changes as a result of new technologies
Historical precedents for technologically induced value erosion or value shifts:
- Shared values eroded by pressures of global economic competition: “sustainability, decentralized technological development, privacy, and equality”;344
- Technological progress biasing the development of states towards welfare-degrading (inegalitarian and autocratic) forms: agriculture, bronze working, chariots, and cavalry;345
- Technological progress biasing the development of states towards welfare-promoting forms: ironworking, ramming warships, and the industrial revolution;346
- Technological progress leading to gradual shifts in societal values: changes in the prevailing technology of energy capture driving changes in societal views on violence, equality, and fairness;347 demise of dueling and honor culture after (low-skill) pistols replaced (high-skill) swords; changes in sexual morality after the appearance of contraceptive technology; changes in attitudes towards farm animals after the rise of meat replacements; and the rise of the plough as a driver of diverging gender norms.348
Historical cases of the disruptive effects on law and governance from new technologies
Historical precedents for effects of new technology on governance tools:
- Technological changes disrupting or eroding the legal integrity of earlier (treaty) regimes: submarine warfare;349 implications of cyberwarfare for international humanitarian law;350 the Soviet Fractional Orbital Bombardment System (FOBS) evading the 1967 Outer Space Treaty’s ban on stationing WMDs “in orbit”;351 the mid-2010s US “superfuze” upgrades to its W76 nuclear warheads, which massively increased their counterforce lethality against missile silos without adding a new warhead, missile, or submarine, thereby formally complying with arms control regimes like New START;352 and various other cases;353
- Technologies strengthening international law: satellites improving the monitoring of treaty compliance,354 and communications technology strengthening the role of non-state and civil-society actors.355
Lessons from the history of societal reactions to new technologies
Historical cases that (potentially) provide insights into how societies are likely to perceive, react to, or regulate new technologies.
Historical reactions to and regulations of new technologies
Historical precedents for how key actors are likely to view, treat, or regulate AI:
- The relative roles of various US actors in shaping the development of past strategic general-purpose technologies: biotech, aerospace tech, and cryptography;356
- Overall US government policy towards perceived “strategic assets”: oil357 and early development of US nuclear power regulation;358
- The historical use of US antitrust law motivated by national security considerations: various cases over the last century;359
- Early regulation of an emerging general-purpose technology: electricity regulation in the US;360
- Previous instances of AI development becoming framed as an “arms race” or competition: the 1980s “race” between the US and Japan’s Fifth Generation Computer Systems (FGCS) project;361
- Regulation of the “safety” of foundational technology industries, public infrastructures, and sectors: UK regulation of sectors such as medicines and medical devices, food, financial services, transport (aviation & road and rail), energy, and communications;362
- High-level state actor buy-in to ambitious early-stage proposals for world control and development of powerful technology: the initial “Baruch Plan” for world control of nuclear weapons (eventually failed);363 extensive early proposals for world control of airplane technology (eventually failed);364 and repeated (private and public) US offers to the Soviet Union for a joint US-USSR moon mission, including a 1963 UN General Assembly offer by John F. Kennedy to convert the Apollo lunar landing program into a joint US-Soviet moon expedition (initially on track, with Nikita Khrushchev eager to accept the offer; however, Kennedy was assassinated shortly after the offer, the Soviets were too suspicious of similar offers by the Johnson administration, and Khrushchev was removed from office by coup in 1964);365
- Sustained failure of increasingly powerful technologies to deliver their anticipated social outcomes: the recurring failure of the “Superweapon Peace” idea—the notion that certain weapons of radical destructiveness (nuclear and non-nuclear) may force an end to war by rendering it too destructive to contemplate;366
- Strong public and policy reactions to “warning shots” of a technology being deployed: Sputnik launch and Hiroshima bombing;367
- Strong public and policy reactions to publicly visible accidents involving a new technology: Three Mile Island meltdown,368 COVID-19 pandemic,369 and automotive and aviation industries;370
- Regulatory backlash and path dependency: case of genetically modified organism (GMO) regulations in the US vs. the EU;371
- “Regulatory capture” and/or influence of industry actors on tech policy: the role of the US military-industrial complex in perpetuating the “bomber gap” and “missile gap” myths,372 and undue corporate influence in the World Health Organisation during the 2009 H1N1 pandemic;373
- State norm “antipreneurship” (actions aiming to preserve the prevailing normative status quo at the global level against proposals for new regulation or norm-setting): US resistance to proposed global restraints on space weapons from 2000 to the present, utilizing a range of diplomatic strategies and tactics to preserve a permissive international legal framework governing outer space.374
Lessons from the history of attempts to initiate technology governance
Historical cases that (potentially) provide insights into when efforts to initiate governance intervention on emerging technologies are likely to be successful and into the efficacy of various pathways towards influencing key actors to deploy regulatory levers in response.
Historical failures to initiate or shape technology governance
Historical cases where a fear of false positives slowed (plausibly warranted) regulatory attention or intervention:
- Failure to act in spite of growing evidence: a review of nearly 100 cases of environmental issues where the precautionary principle was raised, concluding that fear of false positives has often stalled action even though (i) false positives are rare and (ii) there was enough evidence to suggest that a lack of regulation could lead to harm.375
Historical cases of excessive hype leading to (possibly) premature regulatory attention or intervention:
- Premature (and possibly counterproductive) legal focus on technologies that eventually took much longer to develop than anticipated: weather modification technology,376 deep seabed mining,377 self-driving cars,378 virtual and augmented reality,379 and other technologies charted under the Gartner Hype Cycle reports.380
Historical successes for pathways in shaping technology governance
Historical precedents for successful action towards understanding and responding to the risks of emerging technologies, influencing key actors to deploy regulatory levers:
- Relative success in long-range technology forecasting: some types of forecasts for military technology that achieved reasonable accuracy decades out;381
- Success in anticipatory governance: history of “prescient actions” in urging early action against risky new technologies, such as Leo Szilard’s warning of the dangers of nuclear weapons382 and Alexander Fleming’s 1945 warning of the risk of antibiotic resistance;383
- Successful early action to set policy for safe innovation in a new area of science:384 the 1967 Outer Space Treaty, the UK’s Warnock Committee and Human Fertilisation and Embryology Act 1990, and the Internet Corporation for Assigned Names and Numbers (ICANN);
- Governmental reactions and responses to new risks as they emerge: the 1973 Oil Crisis, the 1929–1933 Great Depression,385 the 2007–2009 financial crisis,386 the COVID-19 pandemic;387
- How effectively other global risks motivated action in response, and how cultural and intellectual orientations influence perceptions: biotechnology, nuclear weapons, global warming, and asteroid collision;388
- The impact of cultural media (film, etc.) on priming policymakers to risks:389 the role of The Day After in motivating Cold War efforts towards nuclear arms control,390 of the movies Deep Impact and Armageddon in shaping perceptions of the importance of asteroid defense,391 of the novel Ghost Fleet in shaping Pentagon perceptions of the importance of emerging technologies to war,392 of Contagion in priming early UK policy responses to COVID-19,393 and of Mission: Impossible – Dead Reckoning Part One in deepening President Biden’s concerns over AI prior to signing a landmark 2023 Executive Order;394
- The impact of different analogies or metaphors in framing technology policy:395 for example, the US military’s emphasis on framing the internet as “cyberspace” (i.e., just another “domain” of conflict) had strong consequences both institutionally (supporting the creation of US Cyber Command) and for how international law has subsequently been applied to cyber operations;396
- The role of “epistemic communities” of experts in advocating for international regulation or agreements,397 specifically their role in facilitating nonproliferation treaties and arms control agreements for nuclear weapons398 and anti-ballistic missile systems,399 as well as the history of the earlier era of arms control agreements;400
- Attempts at international control of new technology: the early momentum but ultimate failure of the Baruch Plan for world control of nuclear weapons401 and the failure of efforts at world control of aviation in the 1920s;402
- Policy responses to past scientific breakthroughs, and the role of geopolitics vs. expert engagement: the 1967 UN Outer Space Treaty, the UK’s Warnock Committee and the Human Fertilisation and Embryology Act 1990, the establishment of the Internet Corporation for Assigned Names and Numbers (ICANN), and the European ban on GMO crops;403
- The role of activism and protests in spurring nonproliferation agreements and moratoria, such as nuclear nonproliferation agreements and nuclear test bans;404 the role of activism (in response to “trigger events”) in achieving a de facto moratorium on genetically modified crops in Europe in the late 1990s;405 in addition, the likely role of protests and public pressure in contributing to the abandonment or slowing of various technologies, from geoengineering experiments to nuclear weapons, CFCs, and nuclear power;406
- The role of philanthropy and scientists in fostering Track-II diplomacy initiatives: the Pugwash conferences.407
Lessons from the historical efficacy of different governance levers
Historical cases that (potentially) provide insights into when different societal (legal, regulatory, and governance) levers have proven effective in shaping technology development and use in desired directions.
Historical failures of technology governance levers
Historical precedents for failed or unsuccessful use of various (domestic and/or international) governance levers for shaping technology:
- Mixed-success use of soft-law governance tools for shaping emerging technologies: National Telecommunications and Information Administration discussions on mobile app transparency, drone privacy, facial recognition, YourAdChoices, UNESCO declaration on genetics and bioethics, Environmental Management System (ISO 14001), Sustainable Forestry Practices by the Sustainable Forestry Initiative and Forest Stewardship Council, and Leadership in Energy and Environmental Design.408
- Failed use of soft-law governance tools for shaping emerging technologies: Children’s Online Privacy Protection Rule, Internet Content Rating Association, Platform for Internet Content Selection, Platform for Privacy Preferences, Do Not Track system, and nanotechnology voluntary data call-in by Australia, the US, and the UK;409
- Failures of narrowly technology-focused approaches to safety engineering: narrow technology-focused approaches to the design of safe cars and to the design and calibration of pulse oximeters during the COVID-19 pandemic were mismatched to—and therefore led to dangerous outcomes for—female drivers and darker-skinned patients, respectively, highlighting the importance of incorporating human, psychological, and other disciplines;410
- Failures of information control mechanisms at preventing proliferation: selling of nuclear secrets by A.Q. Khan network,411 limited efficacy of Cold War nuclear secrecy regimes at meaningfully constraining proliferation of nuclear weapons,412 track record of major leaks and hacks of digital information, 2005–present;413
- Failure to transfer (technological) safety techniques, even to allies: in the late 2000s, the US sought to provide security assistance to Pakistan to help safeguard the Pakistani nuclear arsenal but was unable to transfer permissive action link (PAL) technologies because of domestic legal barriers that forbade export to states that were not party to the Nuclear Non-Proliferation Treaty (NPT);414
- Degradation of previously established export control regimes: Cold War-era US high-performance computing export controls struggled to be updated quickly enough to keep pace with hardware advancements.415 Similarly, the US initially treated cryptography as a weapon under export control laws, meaning that encryption systems could not be exported for commercial purposes even to close allies and trading partners; however, by the late 1990s, several influences—including the rise of open-source software and European indignation at US spying on their communications—led to new regulations that allowed cryptography to be exported with minimal government interference;416
- “Missed opportunities” for early action against anticipated risks: the mid-2000s effort to put “killer robots” on the humanitarian disarmament issue agenda, which failed as these were seen as “too speculative”;417
- Mixed success of scientific and industry self-regulation: the Asilomar Conference, the Second International Conference on Synthetic Biology, and 2004–2007 failed efforts to develop guidelines for nanoparticles;418
- Sustained failure to establish treaty regimes: various examples, including the international community spending nearly 20 years since 2004 negotiating a new treaty for Biodiversity Beyond National Jurisdiction;419
- Unproductive locking-in of insufficient, “empty” institutions, “face-saving” institutions, or gridlocked mechanisms: history of states creating suboptimal, ill-designed institutions—such as the United Nations Forum on Forests, the Copenhagen Accord on Climate Change, the UN Commission on Sustainable Development, and the 1980 UN Convention on Certain Conventional Weapons—with mandates that may deprive them of much capacity for policy formulation or implementation;420
- Drawn-out contestation of hierarchical and unequal global technology governance regimes: the Nuclear Non-Proliferation Treaty regime has seen cycles of contestation and challenge by other states;421
- Failures of non-inclusive club governance approaches to nonproliferation: the Nuclear Security Summits (NSS) (2012, 2014, 2016) centered on high-level debates over the stocktaking and securing of nuclear materials. These events saw a constrained list of invited states; as a result, the NSS process was derailed because procedural questions over who was invited or excluded came to dominate discussions (especially at the 2016 Vienna summit), politicizing what had been a technical topic and hampering the extension and take-up of follow-on initiatives by other states.422
Historical successes of technology governance levers
Historical precedents for successful use of various governance levers at shaping technology:
- Effective scientific secrecy around the early development of powerful new technologies: the atomic bomb423 and early computers (Colossus and ENIAC);424
- Successes in the oversight of various safety-critical technologies: track record of “High Reliability Organisations”425 in addressing emerging risks after initial incidents to achieve very low rates of errors, such as in air traffic control systems, naval aircraft carrier operations,426 the aerospace sector, construction, and oil refineries;427
- Successful development of “defense in depth”428 interventions to lower the risks of accident in specific industries: safe operation of nuclear reactors, chemical plants, aviation, space vehicles, cybersecurity and information security, software development, laboratories studying dangerous pathogens, improvised explosive devices, homeland security, hospital security, port security, physical security in general, control system safety in general, mining safety, oil rig safety, surgical safety, fire management, and health care delivery,429 and lessons from defense-in-depth frameworks developed in cybersecurity for frontier AI risks;430
- Successful safety “races to the top” in selected industries: Improvements in aircraft safety in the aviation sector;431
- Successful use of risk assessment techniques in safety-critical industries: examination of popular risk identification techniques (scenario analysis, fishbone method, and risk typologies and taxonomies), risk analysis techniques (causal mapping, Delphi technique, cross-impact analysis, bow tie analysis, and system-theoretic process analysis), and risk evaluation techniques (checklists and risk matrices) used in established industries like finance, aviation, nuclear, and biolabs, and how these might be applied in advanced AI companies;432
- Susceptibility of different types of digital technologies to (global) regulation: relative successes and failures of global regulation of different digital technologies that are (1) centralized and clearly material (e.g., submarine cables), (2) decentralized and clearly material (e.g., smart speakers); (3) centralized and seemingly immaterial (e.g., search engines), and (4) decentralized and seemingly immaterial (e.g., Bitcoin protocol);433
- Use of confidence-building measures to stabilize relations and expectations: 1972 Incidents at Sea Agreement434 and the 12th–19th century development of Maritime Prize Law;435
- Successful transfer of developed safety techniques, even to adversaries: the US “leaked” PAL locks on nuclear weapons to the Soviet Union;436
- Effective nonproliferation regimes: for nuclear weapons, a mix of norms, treaties, US “strategies of inhibition,”437 supply-side export controls,438 and domestic political factors439 has produced an imperfect but remarkably robust track record of nonproliferation.440 Indeed, IAEA databases indicate that 74 states have historically decided to build or use nuclear reactors; of these, 69 have at some time been considered potentially able to pursue nuclear weapons, 10 went nuclear, 7 ran but abandoned a weapons program, and for 14–23 there is evidence of a considered decision not to use their infrastructure to pursue nuclear weapons;441
- General design lessons from existing treaty regimes: drawing insights from the design and efficacy of a range of treaties—including the Single Convention on Narcotic Drugs (SCND), the Vienna Convention on Psychotropic Substances (VCPS), the Convention Against Illicit Trafficking of Narcotic Drugs and Psychotropic Substances (CAIT), the Montreal Protocol on Substances that Deplete the Ozone Layer, the Cartagena Protocol on Biosafety to the Convention on Biological Diversity, the Biological Weapons Convention (BWC), the Treaty on the Non-Proliferation of Nuclear Weapons (NPT), the Convention on Nuclear Safety, the Convention on International Trade in Endangered Species (CITES), the Basel Convention on the Control of Transboundary Movements of Hazardous Wastes and their Disposal, and the Bern Convention on the Conservation of European Wildlife and Natural Habitats—to derive design lessons for a global regulatory system dedicated to the regulation of safety concerns from high-risk AI;442
- Effective use of international access and benefit distribution mechanisms in conjunction with proliferation control measures: the efficacy of the IAEA’s “dual mandate” to enable the transfer of peaceful nuclear technology whilst seeking to curtail its use for military purposes;443
- Effective monitoring and verification (M&V) mechanisms in arms control regimes: M&V implementation across three types of nuclear arms control treaties: nonproliferation treaties, US-USSR/Russia arms limitation treaties, and nuclear test bans;444
- Scientific community (temporary) moratoria on research: the Asilomar Conference445 and the H5N1 gain-of-function debate;446
- Instances where treaty commitments, institutional infighting, or bureaucratic politics contributed to technological restraint: a range of cases resulting in the cancellation of weapon systems development, including nuclear-ramjet-powered cruise missiles, “continent killer” nuclear warheads, nuclear-powered aircraft, “death dust” radiological weapons, various types of anti-ballistic-missile defense, and many others;447
- International institutional design lessons from successes and failures in other areas: global governance successes and failures in the regime complexes for environment, security, and/or trade;448
- Successful use of soft-law governance tools for shaping emerging technologies: Internet Corporation for Assigned Names and Numbers, Motion Picture Association of America, Federal Trade Commission consent decrees, Federal Communications Commission’s power over broadcaster licensing, Entertainment Software Rating Board, NIST Framework for Improving Critical Infrastructure Cybersecurity, Asilomar rDNA Guidelines, International Gene Synthesis Consortium, International Society for Stem Cell Research Guidelines, BASF Code of Conduct, Environmental Defense Fund, and DuPont Risk Framework;449
- Successful use of participatory mechanisms in improving risk assessment: use of scenario methods and risk assessments in climate impact research.450
4.4. Lessons derived from ethics and political theory
Mapping the space of principles or criteria for “ideal AI governance”:451
- Mapping broad normative desiderata for good governance regimes for advanced AI,452 either in terms of outputs or in terms of process;453
- Understanding how to weigh different good outcomes post-TAI-deployment;454
- Understanding the different functional goals and tradeoffs in good international institutional design.455
II. Option-identifying work: Mapping actors and affordances
Strategic clarity requires an understanding not just of the features of the advanced AI governance problem, but also of the options in response.
This entails mapping the range of possible levers that could be used in response to this problem. Critically, this is not just a matter of speculating about what governance tools we may want to put in place for future advanced AI systems mid-transition (after they have arrived). Rather, there might be actions we could take in the “pre-emergence” stage to adequately prepare ourselves.456
Within the field, there has been extensive work on options and areas of intervention. Yet there is no clear, integrated map of the advanced AI governance landscape and its gaps. Sam Clarke proposes that there are different ways of carving up the landscape, such as by different types of interventions, different geographic hubs, or “Theories of Victory.”457 Extending this, one might segment the advanced AI governance solution space into work which aims to identify and understand, in turn:458
- Key actors that will likely (be in a strong position to) shape advanced AI;
- Levers of influence (by which these actors might shape advanced AI);
- Pathways towards influencing these actors to deploy their levers well.459
1. Potential key actors shaping advanced AI
In other words, whose decisions might especially affect the development and deployment of advanced AI, directly or indirectly, such that these decisions should be shaped to be as beneficial as possible?
Key actors can be defined as “actors whose key decisions will have significant impact on shaping the outcomes from advanced AI, either directly (first-order), or by strongly affecting such decisions made by other actors (second-order).”460
Key decisions can be further defined as “a choice or series of choices by a key actor to use its levers of governance, in ways that directly affect beneficial advanced AI outcomes, and which are hard to reverse.”461
Some work in this space explores the relative importance of (the decisions of) different types of key actors:
- The roles of states vs. firms vs. AI researchers in shaping AI policy;462
- Role of “epistemic communities” of scientific experts,463 especially members of the AI research community;464
- The role of different potentially relevant stakeholders for responsible AI systems across the development chain, from individual stakeholders to organizational stakeholders to national/international stakeholders;465
- The relative role of expert advice vs. public pressure in shaping policymakers’ approach to AI;466
- Role of different actors in and around the corporation in shaping lab policy,467 including actors within the lab (e.g., senior management, shareholders, AI lab employees, and employee activists)468 and actors outside the lab (e.g., corporate partners and competitors, industry consortia, nonprofit organizations, the public, the media, and governments).469
Other work focuses more specifically on mapping particular key actors whose decisions may be particularly important in shaping advanced AI outcomes, depending on one’s view of strategic parameters.
The following list should be taken more as a “landscape” review than a literature review, since coverage of the different actors varies amongst papers. Moreover, while the list aims to be relatively inclusive, the (absolute and relative) importance of each of these actors obviously differs hugely between worldviews and approaches.
1.1. AI developer (lab & tech company) actors
Leading AI firms pursuing AGI:
- OpenAI,
- DeepMind,
- Anthropic,
- Aleph Alpha,
- Adept,
- Cohere,
- Inflection,
- Keen,
- xAI.470
Chinese labs and institutions researching “general AI”:
- Baidu Research,
- Alibaba DAMO Academy,
- Tencent AI Lab,
- Huawei,
- JD Research Institute,
- Beijing Institute for General Artificial Intelligence,
- Beijing Academy of Artificial Intelligence, etc.471
Large tech companies472 that may take an increasingly significant role in AGI research:
- Microsoft,
- Google,
- Facebook,
- Amazon.
Future frontier labs that are currently unknown but may yet be established or achieve prominence (e.g., “Magma”473).
1.2. AI services & compute hardware supply chains
AI services supply chain actors:474
- Cloud computing providers:475
- Globally: Amazon Web Services (32%), Microsoft Azure (21%), and Google Cloud (8%); IBM;
- Chinese market: Alibaba, Huawei, and Tencent.
Hardware supply chain industry actors:476
- Providers of optical components to photolithography machine manufacturers:
- Carl Zeiss AG [Germany], a key ASML supplier of optical lenses;477
- Producers of extreme ultraviolet (EUV) photolithography machines:
- ASML [The Netherlands].478
- Photoresist processing providers:
- Asahi Kasei and Tokyo Ohka Kogyo Co. [Japan].479
- Advanced chip manufacturing:
- TSMC [Taiwan];
- Intel [US];
- Samsung [South Korea].
- Semiconductor intellectual property owners and chip designers:
- Arm [UK];
- Graphcore [UK].
- DRAM integrated circuit chips:
- Samsung (market share 44%) [South Korea];
- SK hynix (27%) [South Korea];
- Micron (22%) [US].
- GPU providers:
- Intel (market share 62%) [US];
- AMD (18%) [US];
- NVIDIA (20%) [US].
1.3. AI industry and academic actors
Industry bodies:
- Partnership on AI;
- Frontier Model Forum;480
- MLCommons;481
- IEEE (Institute of Electrical and Electronics Engineers) + IEEE-SA (standards body);
- ISO (and IEC).
Standard-setting organizations:
- US standard-setting organizations (NIST);
- European Standards Organizations (ESOs), tasked with setting standards for the EU AI Act: the European Committee for Standardisation (CEN), European Committee for Electrotechnical Standardisation (CENELEC), and European Telecommunications Standards Institute (ETSI);482
- VDE (influential German standardization organization).483
Software tools & community service providers:
- arXiv;
- GitHub;
- Colab;
- Hugging Face.
Academic communities:
- Scientific ML community;484
- AI conferences: NeurIPS, AAAI/ACM, ICLR, IJCAI-ECAI, AIES, FAccT, etc.;
- AI ethics community and various subcommunities;
- Numerous national-level academic or research institutes.
Other active tech community actors:
- Open-source machine learning software community;485
- “Open”/diffusion-encouraging486 AI community (e.g., Stability.ai, Eleuther.ai);487
- Hacker communities;
- Cybersecurity and information security expert communities.488
1.4. State and governmental actors
Various states, and their constituent (government) agencies or bodies that are, plausibly will be, or potentially could be moved to be in powerful positions to shape the development of advanced AI.
The United States
Key actors in the US:489
- Executive Branch actors;490
- Legislative Branch;491
- Judiciary;492
- Federal agencies;493
- Intelligence community;494
- Independent federal agencies;495
- Relevant state and local governments, such as the State of California (potentially significant extraterritorial regulatory effects),496 and the States of Illinois and Texas (among the first states to place restrictions on biometrics), etc.
China
Key actors in China:497
- 20th Central Committee of the Chinese Communist Party;
- China’s State Council;
- Bureaucratic actors engaged in AI policy-setting;498
- Actors and institutions engaged in track-II diplomacy on AI.499
The EU
Key actors in the EU:500
- European Commission;
- European Parliament;
- Scientific research initiatives and directorates;501
- (Proposed) European Artificial Intelligence Board and notified bodies.502
The UK
Key actors in the UK:503
- The Cabinet Office;504
- Foreign Commonwealth and Development Office (FCDO);
- Ministry of Defence (MoD);505
- Department for Science, Innovation and Technology (DSIT);506
- UK Parliament;507
- The Digital Regulators Cooperation Forum;
- Advanced Research and Invention Agency (ARIA).
Other states with varying roles
Other states that may play key roles because of their general geopolitical influence, AI-relevant resources (e.g., compute supply chain and significant research talent), or track record as digital norm setters:
- Influential states: India, Russia, and Brazil;
- Significant AI research talent: France and Canada;
- Hosting nodes in the global hardware supply chain: US (NVIDIA), Taiwan (TSMC), South Korea (Samsung), the Netherlands (ASML), Japan (photoresist processing), UK (Arm), and Germany (Carl Zeiss AG);
- Potential (regional) neutral hubs: Singapore508 and Switzerland;509
- Global South coalitions: states from the Global South510 and coalitions of Small Island Developing States (SIDS);511
- Track record of (digital) norm-setters: Estonia and Norway.512
1.5. Standard-setting organizations
International standard-setting institutions:513
- ISO;
- IEC;
- IEEE;
- CEN/CENELEC;
- VDE (Association for Electrical, Electronic & Information Technologies) and its AI Quality & Testing Hub.514
1.6. International organizations
Various United Nations agencies:515
- ITU;516
- UNESCO;517
- Office of the UN Tech Envoy (conducting the process leading to the Global Digital Compact in 2024);
- UN Science, Technology, and Innovation (STI) Forum;
- UN Executive Office of the Secretary-General;
- UN General Assembly;
- UN Security Council (UNSC);
- UN Human Rights Council;518
- Office of the High Commissioner on Human Rights;519
- UN Chief Executives Board for Coordination;520
- Secretary-General’s High-Level Advisory Board on Effective Multilateralism (HLAB);
- Secretary-General’s High-Level Advisory Body on Artificial Intelligence (“AI Advisory Body”).521
Other international institutions already engaged on AI in some capacity522 (in no particular order):
- OECD;523
- Global Partnership on AI;
- G7;524
- G20;525
- Council of Europe (Ad Hoc Committee on Artificial Intelligence (CAHAI));526
- NATO;527
- AI Partnership for Defense;528
- Global Road Traffic Forum;529
- International Maritime Organisation;
- EU-US Trade and Technology Council (TTC);530
- EU-India Trade and Technology Council;
- Multi-stakeholder fora: World Summit on the Information Society (WSIS), Internet Governance Forum (IGF), Global Summit on AI for Good,531 and World Economic Forum (Centre for Trustworthy Technology).
Other international institutions not yet engaged on AI:
- International & regional courts: International Criminal Court (ICC), International Court of Justice (ICJ), and European Court of Justice.
1.7. Public, civil society, & media actors
Civil society organizations:532
- Gatekeepers engaged in AI-specific norm-setting and advocacy: Human Rights Watch, Campaign to Stop Killer Robots,533 and AlgorithmWatch;534
- Civilian open-source intelligence (OSINT) actors engaged in monitoring state violations of human rights and international humanitarian law:535 Bellingcat, NYT Visual Investigation Unit, CNS (Arms Control Wonk), Middlebury Institute, Forensic Architecture, BBC Africa Eye, Syrian Archive, etc.
- Military AI mediation: Centre for Humanitarian Dialogue and Geneva Centre for Security Policy.536
Media actors:
- Mass media;537
- Tech media;
- “Para-scientific media.”538
Cultural actors:
- Film industry (Hollywood, etc.);
- Influential and widely read authors.539
2. Levers of governance (for each key actor)
That is, how might each key actor shape the development of advanced AI?
A “lever (of governance)” can be defined as “a tool or intervention that can be used by key actors to shape or affect (1) the primary outcome of advanced AI development; (2) key strategic parameters of advanced AI governance; (3) other key actors’ choices or key decisions.”540
Research in this field includes analysis of different types of tools (key levers or interventions) available to different actors to shape advanced AI development and use.541
2.1. AI developer levers
Developer (intra-lab)-level levers:542
- Levers for adequate AI model evaluation and technical safety testing:543 decoding-limiting systems; adversarial training; throughout-lifecycle test, evaluation, validation, and verification (TEVV) policies;544 internal model safety evaluations;545 and risk assessments;546
- Levers for safe risk management in AI development process: Responsible Scaling Policies (RSPs),547 the Three Lines of Defense (3LoD) model,548 organizational and operational criteria for adequately safe development,549 and “defense in depth” risk management procedures;550
- Levers to ensure cautious overall decision-making: ethics and oversight boards;551 corporate governance policies that support and enable cautious decision-making,552 such as establishing an internal audit team;553 and/or incorporating as a Public Benefit Corporation to allow the board of directors to balance stockholders’ pecuniary interests against the corporation’s social mission;
- Levers to ensure operational security: information security best practices554 and structured access mechanisms555 at the level of cloud-based AI service interfaces;
- Policies for responsibly sharing safety-relevant information: information-providing policies that increase legibility and compliance, such as model cards;556
- Policies to ensure the organization can pace and/or pause capability research:557 board authority to pause research, and channels to invite external AI scientists to review the alignment of systems.558
Developer external (unilateral) levers:
- Use of contracts and licensing to attempt to limit uses of AI and its outputs (e.g., the Responsible AI Licenses (RAIL) initiative);559
- Voluntary safety commitments;560
- Norm entrepreneurship (i.e., lobbying, public statements, or initiatives that signal public concern and/or dissatisfaction with an existing state of affairs, potentially alerting others to the existence of a shared complaint and facilitating potential “norm cascades” towards new expectations or collective solutions).561
2.2. AI industry & academia levers
Industry-level (coordinated inter-lab) levers:
- Self-regulation;562
- Codes of conduct;
- AI ethics principles;563
- Professional norms;564
- AI ethics advisory committees;565
- Incident databases;566
- Institutional, software, and hardware mechanisms for enabling developers to make verifiable claims;567
- Bug bounties;568
- Evaluation-based coordinated pauses;569
- Other inter-lab cooperation mechanisms:570
- Assist Clause;571
- Windfall Clause;572
- Mutual monitoring agreements (red-teaming, incident-sharing, compute accounting, and seconding engineers);
- Communications and heads-up;
- Third-party auditing;
- Bias and safety bounties;
- Secure compute enclaves;
- Standard benchmarks & audit trails;
- Publication norms.573
Third-party industry actors levers:
- Publication reviews;574
- Certification schemes;575
- Auditing schemas.576
Scientific community levers:
- Institutional Review Boards (IRBs);577
- Conference or journal pre-publication impact assessment requirements;578 academic conference practices;579
- Publication and model sharing and release norms;580
- Benchmarks;581
- Differential technological development (innovation prizes);582
- (Temporary) moratoria.583
2.3. Compute supply chain industry levers
Global compute industry-level levers:584
- Stock-and-flow accounting;
- Operating licenses;
- Supply chain chokepoints;585
- Inspections;
- Passive architectural on-chip constraints (e.g., performance caps);
- Active architectural on-chip constraints (e.g., shutdown mechanisms).
2.4. Governmental levers
We can distinguish between general governmental levers and the specific levers available to particular key states.
General governmental levers586
Legislatures’ levers:587
- Create new AI-specific regimes, such as:
- Horizontal risk regulation;588
- Industry-specific risk regulatory regimes;
- Permitting, licensing, and market gatekeeping regimes;589
- Bans or moratoria;
- Know-Your-Customer schemes.590
- Amend laws to extend or apply existing regulations to AI:591
- Domain/industry-specific risk regulations;
- Competition/antitrust law,592 including doctrines around merger control, abuse of dominance, cartels, and collusion; agreements on hardware security; and state aid;
- Liability law;593
- Insurance law;594
- Contract law;595
- IP law;596
- Copyright law (amongst others through its impact on data scraping practices);597
- Criminal law;598
- Privacy and data protection law (amongst others through its impact on data scraping practices);
- Public procurement law and procurement processes.599
Executive levers:
- Executive orders;
- Foreign investment restrictions;
- AI R&D funding strategies;600
- Nationalization of firms;
- Certification schemes;
- Various tools of “differential technology development”:601 policies for preferential advancement of safer AI architectures (funding and direct development programs, government prizes, advanced market commitments, regulatory requirements, and tax incentives)602 and policies for slowing down research lines towards dangerous AI architectures (moratoria, bans, defunding, divestment, and/or “stage-gating” review processes);603
- Foreign policy decisions, such as initiating multilateral treaty negotiations.
Judiciaries’ levers:
- Judicial decisions handed down on cases involving AI that extend or apply existing doctrines to AI, shaping economic incentives and setting precedent for regulatory treatment of advanced AI, such as the US Supreme Court ruling on Gonzalez v. Google, which has implications for whether algorithmic recommendations will receive full Section 230 protections;604
- Judicial review, especially of drastic executive actions taken in response to AI risk scenarios;605
- Judicial policymaking, through discretion in evaluating proportionality or balancing tests.606
Expert agencies’ levers:
- A mix of features of other actors, from setting policies to adjudicating disputes to enforcing decisions;607
- Create or propose soft law.608
Ancillary institutions:
- Improved monitoring infrastructures;609
- Provide services in terms of training, insurance, procurement, identification, archiving, etc.610
Foreign Ministries/State Department:
- Set activities and issue agendas in global AI governance institutions;
- Bypass or challenge existing institutions by engaging in “competitive regime creation,”611 “forum shopping,”612 or the strategic creation of treaty conflicts;613
- Initiate multilateral treaty negotiations;
- Advise policymakers about the existence and meaning of international law and the obligations it imposes;614
- Conduct state behavior around AI issues (in terms of state policy, and through discussion of AI issues in national legislation, diplomatic correspondence, etc.) in such a way as to contribute to the establishment of binding customary international law (CIL).615
Specific key governments levers
Levers available to specific key governments:
US-specific levers:616
- AI-specific regulations, such as the AI Bill of Rights;617 Algorithmic Accountability Act;618 2023 Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence;619 and various currently pending federal legislative proposals for regulating generative and/or frontier AI;620
- General levers,621 such as federal R&D funding, foreign investment restrictions, export controls,622 visa vetting, expanded visa pathways, secrecy orders, voluntary screening procedures, use of the Defense Production Act,623 antitrust enforcement, the “Born Secret” Doctrine, nationalization of companies or compute hardware, various Presidential Emergency powers,624 etc.
EU-specific levers:
- AI-specific regulations, including:
- The AI Act, which will have direct regulatory effects625 but may also exert extraterritorial impact as part of a “Brussels Effect”;626
- Standard-setting by European Standards Organizations (ESOs);627
- AI Liability Directive.628
China-specific levers:
- AI-specific regulations;629
- Standards;630
- Activities in global AI governance institutions.631
UK-specific levers:632
- National Security and Investment Act 2021;
- Competition Law: 1998 Competition Act;
- Export Control legislation;
- Secrecy orders.
2.5. Public, civil society & media actor levers
Civil Society/activist movement levers:633
- Lab-level (internal) levers:
- Shareholder activism, voting out CEOs;
- Unions and intra-organizational advocacy, strikes, and walkouts;634
- Capacity-building of employee activism via recruitment, political education, training, and legal advice.
- Lab-level (external) levers:
- Stigmatization of irresponsible practices;635
- Investigative journalism, awareness-raising of scandals and incidents, hacking and leaks, and whistleblowing;
- Impact litigation636 and class-action lawsuits;637
- Public protest638 and direct action (e.g., sit-ins).
- Industry-level levers:
- Norm advocacy and lobbying;
- Open letters and statements;
- Mapping and highlighting (compliance) performance of companies; establishing metrics, indexes, and prizes; and certification schemes.639
- Public-focused levers:
- Media content creation;640
- Boycott and divestment;
- Shaming of state noncompliance with international law;641
- Emotional contagion—shaping and disseminating public emotional dynamics or responses to a crisis.642
- Creating alternatives:
- Public interest technology research;
- Creating alternative (types of) institutions643 and new AI labs.
- State-focused levers:
- Monitor compliance with international law.644
2.6. International organizations and regime levers
International standards bodies’ levers:
- Set technical safety and reliability standards;645
- Undertake “para-regulation,” setting pathways for future regulation not by imposing substantive rules but rather by establishing foundational concepts or terms.646
International regime levers:647
- Setting or shaping norms and expectations:
- Setting, affirming, and/or clarifying states’ obligations under existing international law principles;
- Setting fora and/or agendas for the negotiation of new treaties or regimes in various formats, such as:
- Broad framework conventions;648
- Nonproliferation and arms control agreements;649
- Export control regimes.650
- Creating (technical) benchmarks and focal points for decision-making by both states and non-state actors;651
- Organizing training and workshops with national officials.
- Coordinating behavior; reducing uncertainty, improving trust:
- Confidence-building measures;652
- Review conferences (e.g., BWC);
- Conferences of parties (e.g., UNFCCC);
- Establishing information and benefit-sharing mechanisms.
- Creating common knowledge or shared perceptions of problems; establish “fire alarms”:
- Intergovernmental scientific bodies (e.g., Intergovernmental Panel on Climate Change (IPCC) and Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES));
- International warning systems (e.g., WHO’s “public health emergency of international concern” mechanism).
- Adjudicating and arbitrating state disagreements over application of policies, resolving tensions or crises for regimes:
- Arbitral bodies (e.g., WTO Appellate Body);
- Adjudicatory tribunals (e.g., ICJ);
- Treaty bodies (e.g., Human Rights Committee);
- Other dispute resolution mechanisms (e.g., BWC or Resolution 1540 allowing complaints to be lodged at the UNSC).
- Establishing material constraints:
- Supply-side material proliferation controls (e.g., stock-and-flow accounting and trade barriers);
- Fair and equitable treatment standards in international investment law.
- Monitoring state compliance:
- Inspection regimes;
- Safeguards;
- National contributions;
- Network of national contact points.
- Sanctioning noncompliance:
- Inducing direct costs through sanctions;
- Inducing reputational costs,653 in particular through shaming.654
2.7. Future, new types of institutions and levers
Novel governance institutions and innovations:
- “Regulatory markets” and private regulatory authorities;655
- New monitoring institutions and information markets;656
- Quadratic voting and radical markets;657
- Blockchain smart contracts.658
3. Pathways to influence (on each key actor)
That is, how might concerned stakeholders ensure that key actors use their levers to shape advanced AI development in appropriate ways?
In this context, a “pathway (to influence)” can be defined as “a tool or intervention by which other actors (that may not themselves be key actors) can affect, persuade, induce, incentivize, or require key actors to make certain key decisions around the governance of AI. This can include interventions that ensure that certain levers of control are (not) used, or used in particular ways.”659
This includes research on the different pathways by which the use of these above levers might be enabled, advocated for, and implemented (i.e., the tools available to affect the decisions by key actors).
This can draw on mappings and taxonomies such as “A Map to Navigate AI Governance”660 and “The Longtermist AI Governance Landscape.”661
3.1. Pathways to directly shaping advanced AI systems’ actions through law
Directly shaping advanced AI actions through law (i.e., legal systems and norms as an anchor or lodestar for technical alignment approaches):
- “Law-following AI”;662
- Encode “incomplete contracting” as a framework for AI alignment;663
- Negative human rights as technical safety constraint for minimal alignment;664
- Human rights norms as a benchmark for maximal alignment;665
- Encode fiduciary duties towards users into AI systems;666
- Mandatory on-chip controls (monitoring and remote shutdown);
- Legal informatics approach to alignment.667
3.2. Pathways to shaping governmental decisions
Shaping governmental decisions around AI levers at the level of:
- Legislatures:
- Advocacy within the legislative AI policymaking process.668
- Executives:
- Serve as high-bandwidth policy advisor;669
- Provide actionable technical information;670
- Shape, provide, or spread narratives,671 ideas, “memes,”672 framings, or (legal) analogies673 for AI governance.
- Clarify or emphasize established principles within national law (e.g., precautionary principle and cost-benefit analysis674) and/or state obligations under international law (e.g., customary international law,675 IHRL,676 etc.).
3.3. Pathways to shaping court decisions
Shaping court decisions around AI systems that set critical precedent for the application of AI policy to advanced AI:
- Advance legal scholarship with new arguments, interpretations, or analogies and metaphors for AI technology;677
- Clarifying the “ordinary meaning” of key legal terms around AI;678
- Judge seminars and training courses;679
- Online information repositories.680
3.4. Pathways to shaping AI developers’ decisions
Shaping individual lab decisions around AI governance:
- Governmental regulations (e.g., industry risk, liability, criminal, etc.);
- Institutional design choices: establish rules in the charter that enable the board of directors to make more cautious or pro-social choices,681 and establish an internal AI ethics board682 or internal audit functions;683
- Campaigns or resources to educate researchers about AI risk, making AI safety research more concrete and legible, and/or creating common knowledge about researchers’ perceptions of and attitudes towards these risks;684
- Employee activism and pressure,685 and documented communications of risks by employees (which make companies more risk averse because they are more likely to be held liable in court);686
- Human rights norms generally applicable to business activities under the Ruggie Principles,687 which amongst others can directly influence decisions by tech company oversight bodies;688
- Develop and provide clear industry standards and resources for their implementation, such as AI risk management frameworks.689
Shaping industry-wide decisions around AI governance:
- Governmental regulations (as above);
- Ensure competition law frameworks enable cooperation on safety.690
3.5. Pathways to shaping AI research community decisions
Shaping AI research community decisions around AI governance:
- Develop and disseminate clear guidelines and toolsets to facilitate responsible practices, such as:
- Frameworks for pre-publication impact assessment of AI research;691
- “Model cards” for the transparent reporting of benchmarked evaluations of a model’s performance across conditions and for different groups;692
- General risk management frameworks for evaluating and anticipating AI risks.693
- Framing and stigmatization around decisions or practices;694
- Participatory technology assessment processes.695
Shaping civil society decisions around AI governance:
- Work with “gatekeeper” organizations to put issues on the advocacy agenda.696
3.6. Pathways to shaping international institutions’ decisions
Shaping international institutional decisions around AI governance:
- Clarify global administrative law obligations;697
- Influence domestic policy processes in order to indirectly shape transnational legal processes;698
- Scientific expert bodies’ role in informing multilateral treaty-making by preparing evidence for treaty-making bodies, scientifically advising these bodies, and directly exchanging with them at intergovernmental body sessions or dialogical events.699
Shaping standards bodies’ decisions around AI governance:
- Technical experts’ direct participation in standards development;700
- Advancing standardization of advanced AI-relevant safety best practices.701
3.7. Other pathways to shape various actors’ decisions
Shaping various actors’ decisions around AI governance:
- Work to shape broad narratives around advanced AI, such as through compelling narratives or depictions of good outcomes;702
- Work to shape analogies or metaphors used by the public, policymakers, or courts in thinking about (advanced) AI;703
- Pursue specific career paths with key actors to contribute to good policymaking.704
III. Prescriptive work: Identifying priorities and proposing policies
Finally, a third category of work aims to go beyond either analyzing the problem of AI governance (Part I) or surveying potential elements or options for governance solutions analytically (Part II). This category is rather prescriptive in that it aims to directly propose or advocate for specific policies or actions by key actors. This includes work focused on:
- Articulating broad theories of change to identify priorities for AI governance (given a certain view of the problem and of the options available);
- Articulating broad heuristics for crafting good AI regulation;
- Putting forward policy proposals as well as assets that aim to help in their implementation.
1. Prioritization: Articulating theories of change
Achieving an understanding of the AI governance problem and potential options in response is valuable. Yet this alone is not enough to deliver strategic clarity about which of these actors should be approached or which of these levers should be utilized in what ways. For that, it is necessary to develop more systematic accounts of different (currently held or possible) theories of change or impact.
The idea of exploring and comparing such theories of action is not new. There have been various accounts that aim to articulate the linkages between near-term actions and longer-term goals. Some of these have focused primarily on theories of change (or “impact”) from the perspective of technical AI alignment.705 Others have articulated more specific theories of impact for the advanced AI governance space.706 These include:
- Dafoe’s Asset-Decision model, which focuses on the direction of research activities to help (1) create assets which can eventually (2) inform impactful decisions;707
- Leung’s model for impactful AI strategy research that can shape key decisions by (1) those developing and deploying AI and (2) those actors shaping the environments in which it is developed and deployed (i.e., research lab environment, legislative environment, and market environment).708
- Garfinkel’s “AI Strategy: Pathways for Impact,”709 which highlights three distinct pathways for positively influencing the development of advanced AI: (1) become a decision-maker (or close enough to influence one), (2) spread good memes that are picked up by decision-makers, and (3) think of good memes to spread and make them credible;
- Baum’s framework for “affecting the future of AI governance,” which distinguishes several avenues by which AI policy could shape the long-term:710 (1) improve current AI governance, (2) support AI governance communities, (3) advance research on future AI governance, (4) advance CS design of AI safety and ethics to create solutions, and (5) improve underlying governance conditions.
In addition, some have articulated specific scenarios for what successful policy action on advanced AI might look like,711 especially in the relatively near-term future (“AI strategy nearcasting”).712 However, much further work is needed.
2. General heuristics for crafting advanced AI policy
General heuristics for making policies relevant and actionable for advanced AI.
2.1. General heuristics for good regulation
Heuristics for crafting good AI regulation:
- Articulating and utilizing suitable terminology for drafting and scoping AI regulations, especially risk-focused terms;713
- Understanding the implications of different regulatory approaches (ex ante, ex post; risk regulation) for AI regulations;714
- Grounding AI policy within an “all-hazards” approach to managing various other global catastrophic risks simultaneously;715
- Meeting the requirements for an advanced AI regime to avoid “perpetual risk”: exclusivity, benevolence, stability, and successful alignment;716
- Establishing monitoring infrastructures to provide governments with actionable information.717
2.2. Heuristics for good institutional design
Heuristics for good institutional design:
- General desiderata and tradeoffs for international institutional design, especially regarding regime centralization versus decentralization;718
- Procedural heuristics for organizing international negotiation processes, such as ensuring that international AI governance fora are inclusive of Global South actors;719
- Ideal characteristics of global governance systems for high-risk AI, such as systems that (1) govern dual-use technology; (2) take a risk-based approach; (3) provide safety measures; (4) incorporate technically informed, expert-driven, multi-stakeholder processes that enable rapid iteration; (5) produce effects consistent with the treaty’s intent; and (6) possess enforcement mechanisms.720
2.3. Heuristics for future-proofing governance
Heuristics and desiderata for future-proofing governance regimes, and mechanisms for making existing regulations more adaptive, scalable, or resilient:721
- Traditional (treaty) reform or implementation mechanisms:
- The formal treaty amendment process;722
- Unilateral state actions (explanatory memoranda and treaty reservations) or multilateral responses (Working Party Resolution) to adapt multilateral treaties;723
- The development of lex scripta treaties through the lex posterior of customary international law, spurred by new state behavior.724
- Adaptive treaty interpretation methods:
- Evolutionary interpretation of treaties;725
- Treaty interpretation under the principle of systemic integration.726
- Instrument choices that promote flexibility:
- Use of framework conventions;727
- Use of informal governance institutions;728
- The subsequent layering of soft law on earlier hard-law regimes;729
- Use of uncorrelated governance instruments to enable legal resilience.730
- Regime design choices that promote flexibility:
- Scope: include key systems (“general-purpose AI systems,” “highly capable foundation models,” “frontier AI systems,” etc.) within the material scope of the regulation;731
- Phrasing: in-text technological neutrality or deliberate ambiguity;732
- Flexibility provisions: textual flexibility provisions733 such as exceptions or flexibility clauses.
- Flexibility approaches beyond the legal regime:
- Pragmatic and informal “emergent flexibility” about the meaning of norms and rules during crises.734
3. Policy proposals, assets, and products
What, specifically, are the policies that key actors could be asked to implement? And how can these proposals serve as products or assets in persuading those actors to act upon them?
In this context, a “(decision-relevant) asset” can be defined as: “resources that can be used by other actors in pursuing pathways to influence key actors with the aim to induce how these key actors make key decisions (e.g., about whether or how to use their levers). This includes new technical research insights, worked-out policy products, networks of direct advocacy, memes, or narratives.”
A “(policy) product” can be defined as “a subclass of assets; specific legible proposals that can be presented to key actors.”
The following are specific proposals for advanced AI-relevant policies, presented without comparison or prioritization. The list is non-exhaustive; moreover, many proposals combine several ideas and so fall into different categories.
3.1. Overviews and collections of policies
- Previous collections of older proposals, such as Dewey’s list of “long-term strategies for ending existential risk”735 as well as Sotala and Yampolskiy’s survey of high-level “responses” to AI risk.736
- More recent lists and collections of proposed policies to improve the governance, security, and safety of AI development,737 in domains such as compute security and governance, software export controls, and licenses;738 policies to establish improved standards, system evaluations, and licensing regimes, as well as procurement rules and funding for AI safety;739 and proposals to establish a multinational AGI consortium to enable oversight of advanced AI, a global compute cap, and affirmative safety evaluations.740
3.2. Proposals to regulate AI using existing authorities, laws, or institutions
These proposals draw in particular on evaluations of the default landscape of regulations applied to AI (see Section I.3.3) and of the levers of governance available to particular governments (see Section II.2.4).
Regulate AI using existing laws or policies
- Strengthen or reformulate existing laws and policies, such as EU competition law,741 contract and tort law,742 etc.;
- Strengthen or reorganize existing international institutions743 rather than establishing new institutions;744
- Extend or apply existing principles and regimes in international law,745 including, amongst others:
- Norms of international peace and security law:
- Prohibitions on the use of force and intervention in the domestic affairs of other states;
- Existing export control and nonproliferation agreements.
- Principles of international humanitarian law, such as:
- Distinction and proportionality in wartime;
- Prohibition on weapons that are by nature indiscriminate or cause unnecessary suffering;
- The requirements of humanity;
- The obligation to conduct legal reviews of new weapons or means of war (Article 36 under Additional Protocol I to the Geneva Conventions).
- Norms of international human rights law746 and human rights and freedoms, including, among others, the right to life; freedom from cruel, inhuman, and degrading treatment; the rights to freedom of expression, association, and security of the person; and the principle of human dignity;747
- Norms of international environmental law, including the no-harm principle and the principle of prevention and precaution;
- International criminal law, with regard to war crimes and crimes against humanity and with regard to case law of international criminal courts regarding questions of effective control;748
- Rules on state responsibility,749 including state liability for harm;
- Peremptory norms of jus cogens, outlawing, for example, genocide, maritime piracy, slavery, wars of aggression, and torture;
- International economic law:750 security exception measures under international trade law and non-precluded measures provisions under international investment law, amongst others;751
- International disaster law: obligations regarding disaster preparedness, including forecasting and pre-disaster risk assessment, multi-sectoral forecasting and early warning systems, disaster risk and emergency communication mechanisms, etc. (Sendai Framework);
- Legal protections for the rights of future generations, including existing national constitutional protections752 and a potential future UN Declaration on Future Generations.753
Proposals to set soft-law policy through existing international processes
- Proposals for engagement in existing international processes on AI: support the campaign to ban lethal autonomous weapons systems,754 orchestrate soft-law policy through the G20,755 engage in the debate about digital technology governance at the UN Summit of the Future,756 etc.
3.3. Proposals for new policies, laws, or institutions
A range of proposals for novel policies.
Impose (temporary) pauses on AI development
- Coordinated pauses amongst AI developers whenever they identify hazardous capabilities;757
- Temporary pause on large-scale system training beyond a key threshold,758 giving time for near-term policy-setting in domains such as robust third-party auditing and certification, regulation of access to computational power, establishment of capable national AI agencies, and establishment of liability for AI-caused harms;759
- (Permanent) moratoria on developing (certain forms of) advanced AI.760
Establish licensing regimes
- Evaluation and licensing regimes: establishment of an AI regulation regime for frontier AI systems, comprising “(1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models.”761
Establish lab-level safety practices
- Proposals for establishing corporate governance and soft law: establish Responsible Scaling Policies (RSPs)762 and establish corporate governance and AI certification schemes.763
Establish governance regimes on AI inputs (compute, data)
- Compute governance regimes: establish on-chip firmware mechanisms, inspection regimes, and supply chain monitoring and custody mechanisms to ensure no actor can use large quantities of specialized chips to execute ML training runs in violation of established rules (see the illustrative sketch following this list);764
- Data governance: establish public data trusts to assert control over public training data for foundation models.765
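As a purely illustrative complement to the compute governance proposal above, the sketch below shows how a hypothetical reporting or licensing threshold defined in training compute might be operationalized, using the common rough approximation that dense-transformer training compute is about 6 × (parameters) × (training tokens) FLOP. The threshold value, names, and reporting rule are all assumptions for illustration, not a description of any existing or proposed regime.

```python
# Illustrative sketch only: checking a (hypothetical) training-compute
# reporting threshold. The 6 * N * D figure is a standard rough estimate
# of dense-transformer training compute; the threshold is an arbitrary
# example value, not an actual regulatory number.
TRAINING_COMPUTE_THRESHOLD_FLOP = 1e26  # hypothetical reporting trigger


def estimated_training_flop(n_parameters: float, n_training_tokens: float) -> float:
    """Rough estimate of total training compute, in FLOP."""
    return 6.0 * n_parameters * n_training_tokens


def requires_reporting(n_parameters: float, n_training_tokens: float) -> bool:
    """Would a planned training run cross the hypothetical threshold?"""
    return (
        estimated_training_flop(n_parameters, n_training_tokens)
        >= TRAINING_COMPUTE_THRESHOLD_FLOP
    )


# Example: a 70-billion-parameter model trained on 2 trillion tokens uses
# roughly 6 * 7e10 * 2e12 ≈ 8.4e23 FLOP, well below the example threshold.
print(requires_reporting(n_parameters=7e10, n_training_tokens=2e12))  # False
```

Any real compute governance regime would of course hinge on verification (chip custody, firmware attestation, inspections) rather than self-reported arithmetic; the sketch only illustrates how a quantitative trigger of this kind could be expressed.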
Establish domestic institutions for AI governance
- Proposals for new domestic institutions: US “AI Control Council”766 or National Algorithms Safety Board,767 and European AI Agency768 or European AI Office.769
Establish international AI research consortia
Proposals to establish new international hubs or organizations aimed at AI research.770
- A diverse range of proposals for international institutions, including: a “CERN for AI,”771 “European Artificial Intelligence megaproject,”772 “Multilateral AI Research Institute (MAIRI),”773 “international large-scale AI R&D projects,”774 a collaborative UN superintelligence research project,775 “international organization that could serve as clearing-house for research into AI,”776 “joint international AI project with a monopoly on hazardous AI development,”777 “UN AI Research Organization,”778 a “good-faith joint US-China AGI project,”779 “AI for shared prosperity,”780 and a proposal for a new “Multinational AGI Consortium.”781
Establish bilateral agreements and dialogues
- Establish confidence-building measures782 and pursue international AI safety dialogues.783
Establish multilateral international agreements
Proposals to establish a new multilateral treaty on AI:784
- “Treaty on Artificial Intelligence Safety and Cooperation (TAISC),”785 global compute cap treaty,786 “AI development convention,”787 “Emerging Technologies Treaty,”788 “Benevolent AGI Treaty,”789 “pre-deployment agreements,”790 and many other proposals.
Establish international governance institutions
Proposals to establish a new international organization, along one or several models:791
- A diverse range of proposals for international institutions, including a Commission on Frontier AI, an Advanced AI Governance Organization, a Frontier AI Collaborative, and an AI Safety Project;792 an International AI Organization (IAIO) to certify state jurisdictions for compliance with international AI oversight standards to enable states to prohibit the imports of goods “whose supply chains embody AI from non-IAIO-certified jurisdictions”;793 a proposal for an “international consortium” for evaluations of societal-scale risks from advanced AI;794 a “Global Organization for High-Risk Artificial Intelligence (GOHAI)”;795 and many other proposals.796
Conclusion
The recent advances in AI have turned global public attention to this technology’s capabilities, impacts, and risks. AI’s significant present-day impacts, and the prospect that these will only spread and scale further as systems become increasingly advanced, have firmly established this technology as a preeminent challenge for law and global governance this century.
In response, the disparate community of researchers that has explored aspects of these questions over the past years may increasingly be called upon to translate that research into rigorous, actionable, legitimate, and effective policies. This community has developed, and continues to produce, a remarkably far-flung body of research, drawing on a diverse range of disciplines and methodologies. The urgency of action around advanced AI accordingly creates a need for the field to increase the clarity of its work and assumptions, to identify gaps in its approaches and methodologies where it can learn from yet more disciplines and communities, to improve coordination amongst lines of research, and to improve the legibility of its arguments so as to enable constructive scrutiny and evaluation of key claims and proposed policies.
No single document or review can by itself achieve these goals. Yet by attempting to distill and disentangle key areas of scholarship, analysis, and policy advocacy, this review aims to contribute to greater analytical and strategic clarity, more focused and productive research, and better-informed public debate and policymaking on the critical global challenges of advanced AI.
Also in this series
- Maas, Matthijs, and José Jaime Villalobos. ‘International AI institutions: A literature review of models, examples, and proposals.’ Institute for Law & AI, AI Foundations Report 1 (September 2023). https://www.law-ai.org/international-ai-institutions
- Maas, Matthijs. ‘AI is like… A literature review of AI metaphors and why they matter for policy.’ Institute for Law & AI, AI Foundations Report 2 (October 2023). https://www.law-ai.org/ai-policy-metaphors
- Maas, Matthijs. ‘Concepts in advanced AI governance: A literature review of key terms and definitions.’ Institute for Law & AI, AI Foundations Report 3 (October 2023). https://www.law-ai.org/advanced-ai-gov-concepts