Why give AI agents actual legal duties?
The core proposition of Law-Following AI (LFAI) is that AI agents should be designed to refuse to take illegal actions in the service of their principals. However, as Ketan and I explain in our writeup of LFAI for Lawfare, this raises a significant legal problem:
[A]s the law stands, it is unclear how an AI could violate the law. The law, as it exists today, imposes duties on persons. AI agents are not persons, and we do not argue that they should be. So to say “AIs should follow the law” is, at present, a bit like saying “cows should follow the law” or “rocks should follow the law”: It’s an empty statement because there are at present no applicable laws for them to follow.
Let’s call this the Law-Grounding Problem for LFAI. LFAI requires defining AI actions as either legal or illegal. The problem arises because courts generally cannot reason about the legality of actions taken by an actor without some sort of legally recognized status, and AI systems currently lack any such status.[ref 1]
In the LFAI article, we propose solving the Law-Grounding Problem by making AI agents “legal actors”: entities on which the law actually imposes legal duties, even if they have no legal rights. This is explained and defended more fully in Part II of the article. Let’s call this the Actual Approach to the Law-Grounding Problem.[ref 2] Under the Actual Approach, claims like “that AI violated the Sherman Act” are just as true within our legal system as claims like “Jane Doe violated the Sherman Act.”
There is, however, another possible approach that we did not address fully in the article: saying that an AI agent has violated the law if it took an action that, if taken by a human, would have violated the law.[ref 3] Let’s call this the Fictive Approach to the Law-Grounding Problem. Under the Fictive Approach, claims like “that AI violated the Sherman Act” would not be true in the same way that claims like “Jane Doe violated the Sherman Act” are. Instead, statements like “that AI violated the Sherman Act” would be, at best, a convenient shorthand for statements like “that AI took an action that, if taken by a human, would have violated the Sherman Act.”
I will argue that the Actual Approach is preferable to the Fictive Approach in some cases.[ref 4] Before that, however, I will explain why someone might be attracted to the Fictive Approach in the first place.
Motivating the Fictive Approach
To say that something is fictive is not to say that it is useless; legal fictions are common and useful. The Fictive Approach to the Law-Grounding Problem has several attractive features.
The first is its ease of implementation: the Fictive Approach does not require any fundamental rethinking of legal ontology. We do not need to either grant AI agents legal personhood or create a new legal category for them.
The Fictive Approach might also track common language use: when people make statements like “Claude committed copyright infringement,” they probably mean it in the fictive sense.
Finally, the Fictive Approach also mirrors how we think about similar problems, like immunity doctrines. The King of England may be immune from prosecution, but we can nevertheless speak intelligibly of his actions as lawful or unlawful by analyzing what the legal consequences would be if he were not immune.
Why prefer the Actual Approach?
Nevertheless, I think there are good reasons to prefer the Actual Approach over the Fictive Approach.
Analogizing to Humans Might Be Difficult
The strongest reason, in my opinion, is that AI agents may “think” and “act” very differently from humans. The Fictive Approach requires us to take a string of actions that an AI did and ask whether a human who performed the same actions would have acted illegally. The problem is that AI agents can take actions that could be very hard for humans to take, and so judges and jurors might struggle to analyze the legal consequences of a human doing the same thing.
Today’s proto-agents are somewhat humanlike in that they receive instructions in natural language, use computer tools designed for humans, reason in natural language, and generally take actions serially at approximately human pace and scale. But we should not expect this paradigm to last. For example, AI agents might soon:
- Consume the equivalent of dozens of books per day with perfect recall
- Have memories that do not decay over time
- Create copies of themselves and delegate tasks to those copies
- Reason near-perfectly about what other copies of themselves are thinking
- Interact simultaneously with hundreds of people
- Erase their own “memory”
- Allow other models to see their neural architecture or activations
- Use tools made specifically for use by AI agents (that cannot be used by humans)
- Communicate in artificial languages
- Reason in latent space
And these are just some of the most foreseeable; over time, AI agents will likely become increasingly alien in their modes of reasoning and action. If so, then the Fictive Approach will become increasingly strained: judges and jurors will find themselves trying to determine whether actions that no human could have taken would have violated the law if performed by a human. At a minimum, this would require unusually good analogical reasoning skills; more likely, the coherence of the reasoning task would break down entirely.
Developing Tailored Laws and Doctrines for AIs
LFAI is motivated in large part by the belief that AI agents that are aligned to “a broad suite of existing laws”[ref 5] would be much safer than AI agents unbound by existing laws. But new laws specifically governing the behavior of AI agents will likely be necessary as AI agents transform society.[ref 6] The Fictive Approach, however, would not work for such AI-specific laws. Recall that the Fictive Approach says that an action by an AI agent violates a law just in case a human who took that action would have violated that law. But if the law in question applies only to AI agents, the Fictive Approach cannot be applied: no human could violate that law.
Relatedly, we may wish to develop new AI-specific legal doctrines, even for laws that apply to both humans and AIs. For example, we might wish to develop new doctrines for applying existing laws with a mental state component to AI agents.[ref 7] Alternatively, we may need to develop doctrines for determining when multiple instances of the same (or similar) AI models should be treated as identical actors. But the Fictive Approach is in tension with the development of AI-specific doctrines, since the whole point of the Fictive Approach is precisely to avoid reasoning about AI systems in their own right.
These conceptual tensions may be surmountable. But as a practical matter, a legal ontology that enables courts and legislatures to actually reason about AI systems in their own right seems more likely to lead to nuanced doctrines and laws that are responsive to the actual nature of AI systems. The Fictive Approach, by contrast, encourages courts and legislatures to attempt to map AI actions onto human actions, which may lead them to overlook or minimize the significant differences between humans and AI systems.
Grounding Respondeat Superior Liability
Some scholars propose using respondeat superior to impose liability on the human principals of AI agents for any “torts” committed by the latter.[ref 8] However, “[r]espondeat superior liability applies only when the employee has committed a tort. Accordingly, to apply respondeat superior to the principals of an AI agent, we need to be able to say that the behavior of the agent was tortious.”[ref 9] And we can say that an AI agent’s behavior was truly tortious only if the agent had a legal duty that its behavior breached. The Actual Approach allows for this; the Fictive Approach does not.
Of course, another option is simply to use the Fictive Approach for the application of respondeat superior liability as well. However, the Actual Approach seems preferable insofar as it doesn’t require this additional change. More generally, precisely because the Actual Approach integrates AI systems into the legal system more fully, it can be leveraged to parsimoniously solve problems in areas of law beyond LFAI.
Optionality for Eventual Legal Personhood
In the LFAI article, we take no position as to whether AI agents should be given legal personhood: a bundle of duties and rights.[ref 10] However, there may be good reasons to grant AI agents some set of legal rights.[ref 11]
Treating AI agents as legal actors under the Actual Approach creates optionality with respect to legal personhood: if the law recognizes an entity’s existence and imposes duties on it, it is easier for the law to subsequently grant that entity rights (and therefore personhood). But, we argue, the Actual Approach creates no obligation to do so:[ref 12] the law can coherently say that an entity has duties but no rights. Since it is unclear whether it is desirable to give AIs rights, this optionality is desirable.
* * *
AI companies[ref 13] and policymakers[ref 14] are already tempted to impose legal duties on AI systems. To make serious policy progress towards this, they will need to decide whether to actually do so, or merely use “lawbreaking AIs” as shorthand for some strained analogy to lawbreaking humans. Choosing the former path—the Actual Approach—is simpler and more adaptable, and therefore preferable.
Protecting AI whistleblowers
In May 2024, OpenAI found itself at the center of a national controversy when news broke that the AI lab was pressuring departing employees to sign contracts with extremely broad nondisparagement and nondisclosure provisions—or else lose their vested equity in the company. This would essentially have required former employees to avoid criticizing OpenAI for the indefinite future, even on the basis of publicly known facts and nonconfidential information.
Although OpenAI quickly apologized and promised not to enforce the provisions in question, the damage had already been done—a few weeks later, a number of current and former OpenAI and Google DeepMind employees signed an open letter calling for a “right to warn” about serious risks posed by AI systems, noting that “[o]rdinary whistleblower protections are insufficient because they focus on illegal activity, whereas many of the risks we are concerned about are not yet regulated.”
The controversy over OpenAI’s restrictive exit paperwork helped convince a number of industry employees, commentators, and lawmakers of the need for new legislation to fill in gaps in existing law and protect AI industry whistleblowers from retaliation. This culminated recently in the AI Whistleblower Protection Act (AI WPA), a bipartisan bill introduced by Sen. Chuck Grassley (R-Iowa) along with a group of three Republican and three Democratic senators. Companion legislation was introduced in the House by Reps. Ted Lieu (D-Calif.) and Jay Obernolte (R-Calif.).
Whistleblower protections such as the AI WPA are minimally burdensome, easy to implement and enforce, and plausibly useful for facilitating government access to the information needed to mitigate AI risks. They also have genuine bipartisan appeal, meaning there is actually some possibility of enacting them. As increasingly capable AI systems continue to be developed and adopted, it is essential that those most knowledgeable about any dangers posed by these systems be allowed to speak freely.
Why Whistleblower Protections?
The normative case for whistleblower protections is simple: Employers shouldn’t be allowed to retaliate against employees for disclosing information about corporate wrongdoing. The policy argument is equally straightforward—company employees often witness wrongdoing well before the public or government becomes aware but can be discouraged from coming forward by fear of retaliation. Prohibiting retaliation is an efficient way of incentivizing whistleblowers to come forward and a strong social signal that whistleblowing is valued by governments (and thus worth the personal cost to whistleblowers).
There is also reason to believe that whistleblower protections could be particularly valuable in the AI governance context. Information is the lifeblood of good governance, and it’s unrealistic to expect government agencies and the legal system to keep up with the rapid pace of progress in AI development. Often, the only people with the information and expertise necessary to identify the risks that a given model poses will be the people who helped create it.
Of course, there are other ways for governments to gather information on emerging risks. Prerelease safety evaluations, third-party audits, basic registration and information-sharing requirements, and adverse event reporting are all tools that help governments develop a sharper picture of emerging risks. But these tools have mostly not been implemented in the U.S. on a mandatory basis, and there is little chance they will be in the near future.
Furthermore, whistleblower disclosures are a valuable source of information even in thoroughly regulated and relatively well-understood contexts like securities trading. In fact, the Securities and Exchange Commission has awarded more than $2.2 billion to 444 whistleblowers since its highly successful whistleblower program began issuing awards in 2012. We therefore expect AI whistleblowers to be a key source of information no matter how sophisticated the government’s other information-gathering authorities (which, currently, are almost nonexistent) become.
Whistleblower protections are also minimally burdensome. A bill like the AI WPA imposes no affirmative obligations on affected companies. It doesn’t prevent them from going to market or integrating models into useful products. It doesn’t require them to jump through procedural hoops or prescribe rigid safety practices. The only thing necessary for compliance is to refrain from retaliating against employees or former employees who lawfully disclose important information about wrongdoing to the government. It seems highly unlikely that this kind of common-sense restriction could ever significantly hinder innovation in the AI industry. This may explain why even innovation-focused, libertarian-minded commentators like Martin Casado of Andreessen Horowitz and Dean Ball have reacted favorably to AI whistleblower bills like California SB 53, which would prohibit retaliation against whistleblowers who disclose information about “critical risks” from frontier AI systems. It’s worth noting that one sponsor of the AI WPA’s House companion bill is Rep. Obernolte, who has been the driving force behind the controversial AI preemption provision in the GOP reconciliation bill.
The AI Whistleblower Protection Act
Beyond the virtues of whistleblower protections generally, how does the actual whistleblower bill currently making its way through Congress stack up?
In our opinion, favorably. A few weeks ago, we published a piece on how to design AI whistleblower legislation. The AI WPA checks almost all of the boxes we identified, as discussed below.
Dangers to Public Safety
First, and most important, the AI WPA fills a significant gap in existing law by protecting disclosures about “dangers” to public safety even if the whistleblower can’t point to any law violation by their employer. Specifically, the law protects disclosures related to a company’s failure to appropriately respond to “substantial and specific danger[s]” to “public safety, public health, or national security” posed by AI, or about “security vulnerabilit[ies]” that could allow foreign countries or other bad actors to steal model weights or algorithmic secrets from an AI company. This is significant because the most important existing protection for whistleblowers at frontier AI companies—California’s state whistleblower statute—only protects disclosures about law violations.
It’s important to protect disclosures about serious dangers even when no law has been violated because the law, with respect to emerging technologies like AI, often lags far behind technological progress. When the peer-to-peer file sharing service Napster was founded in 1999, it wasn’t immediately clear whether its practices were illegal. By the time court decisions resolved the ambiguity, a host of new sites using slightly different technology had sprung up and were initially determined to be legal before the Supreme Court stepped in and reversed the relevant lower court decisions in 2005. In a poorly understood, rapidly changing, and almost totally unregulated area like AI development, the prospect of risks arising from behavior that isn’t clearly prohibited by any existing law is all too plausible.
Consider a hypothetical: An AI company trains a new cutting-edge model that beats out its competitors’ latest offerings on a wide variety of benchmarks, redefining the state of the art for the nth time in as many months. But this time, a routine internal safety evaluation reveals that the new model can, with a bit of jailbreaking, be convinced to plan and execute a variety of cyberattacks that the evaluators believe would be devastatingly effective if carried out, causing tens of millions of dollars in damage and crippling critical infrastructure. The company, under intense pressure to release a model that can compete with the newest releases from other major labs, implements safeguards that employees believe can be easily circumvented but otherwise ignores the danger and misrepresents the results of its safety testing in public statements.
In the above hypothetical, is the company’s behavior unlawful? An enterprising prosecutor might be able to make charges stick in the aftermath of a disaster, because the U.S. has some very broad criminal laws that can be creatively interpreted to prohibit a wide variety of behaviors. But the illegality of the company’s behavior is at the very least highly uncertain.
Now, suppose that an employee with knowledge of the safety testing results reported those results in confidence to an appropriate government agency. Common sense dictates that the company shouldn’t be allowed to fire or otherwise punish the employee for such a public-spirited act, but under currently existing law it is doubtful whether the whistleblower would have any legal recourse if terminated. Knowing this, they might well be discouraged from coming forward in the first place. This is why establishing strong, clear protections for AI employees who disclose information about serious threats to public safety is important. This kind of protection is also far from unprecedented—currently, federal employees enjoy a similar protection for disclosures about “substantial and specific” dangers, and there are also sector-specific protections for certain categories of private-sector employees such as (for example) railroad workers who report “hazardous safety or security conditions.”
Importantly, the need to protect whistleblowers has to be weighed against the legitimate interest that AI companies have in safeguarding valuable trade secrets and other confidential business information. A whistleblower law that is too broad in scope might allow disgruntled employees to steal from their former employers with impunity and hand over important technical secrets to competitors. The AI WPA, however, sensibly limits its danger-reporting protection to disclosures made to appropriate government officials or internally at a company regarding “substantial and specific danger[s]” to “public safety, public health, or national security.” This means that, for better or worse, reporting about fears of highly speculative future harms will probably not be protected, nor will disclosures to the media or watchdog groups.
Preventing Contractual Waivers of Whistleblower Rights
Another key provision states that contractual waivers of the whistleblower rights created by the AI WPA are unenforceable. This is important because nondisclosure and nondisparagement agreements are common in the tech industry, and are often so broadly worded that they purport to prohibit an employee or former employee from making the kinds of disclosures that the AI WPA is intended to protect. It was this sort of broad nondisclosure agreement (NDA) that first sparked widespread public interest in AI whistleblower protections during the 2024 controversy over OpenAI’s exit paperwork.
OpenAI’s promise to avoid enforcing the most controversial parts of its NDAs did not change the underlying legal reality that allowed OpenAI to propose the NDAs in the first place, and that would allow any other frontier AI company to propose similarly broad contractual restrictions in the future. As we noted in a previous piece on this subject, there is some chance that attempts to enforce such restrictions against genuine whistleblowers would be unsuccessful, because of either state common law or existing state whistleblower protections. Even so, the threat of being sued for violating an NDA could discourage potential whistleblowers even if such a lawsuit might not eventually succeed. A clear federal statutory indication that such contracts are unenforceable would therefore be a welcome development. The AI WPA, which clearly resolves the NDA issue by providing that “[t]he rights and remedies provided for in this section may not be waived or altered by any contract, agreement, policy form, or condition of employment,” would provide exactly this.
Looking Forward
It’s not clear what will happen to the AI Whistleblower Protection Act. It appears as likely to pass as any AI measure we’ve seen, given the bipartisan enthusiasm behind it and the lack of any substantial pushback from industry to date. But federal legislation is difficult to pass in general, and the absence of vocal opposition so far doesn’t mean that dissenting voices won’t make themselves heard in the coming weeks.
Regardless of what happens to this specific bill, those who care about governing AI well should continue to support efforts to pass something like the AI WPA. However concerned or unconcerned one may be about the dangers posed by AI, the bill as a whole serves a socially valuable purpose: establishing a uniform whistleblower protection regime for reports about security vulnerabilities and lawbreaking in a critically important industry.
Christoph Winter’s remarks to the European Parliament on AI Agents and Democracy
Summary
On July 17th, LawAI’s Director and Founder, Christoph Winter, was invited to speak before the European Parliament’s Special Committee on the European Democracy Shield, with the participation of IMCO and LIBE Committee members. Professor Winter was asked to present on AI governance, regulation, and democratic safeguards. He spoke about the democratic challenges that AI agents may present and how democracies could approach them.
Two recommendations were made to the Committee:
- Introduce Law-Following AI: AI systems should be built to follow the law. Law-following AI would require AI systems to be architecturally constrained to refuse actions that would be illegal if performed by humans in the same position. Just as AIs are currently trained to decline to help build bombs, they would reject orders to violate constitutional rights or election laws.
- Strengthen the AI Office: The AI Office needs many more skilled people to rigorously analyze what companies submit under the Code of Practice and AI Act—to scrutinize their risk assessments, verify their mitigation measures, and spot gaps in their safety evaluations.
Transcript
Distinguished Members of Parliament, fellow speakers and experts,
Manipulating public opinion at scale used to require vast resources. This situation is changing quickly. During Slovakia’s 2023 election, a simple deepfake audio recording of a candidate discussing vote-buying schemes circulated just 48 hours before polls opened, which was too late for fact-checking but not too late to reach thousands of voters. And deepfakes are really just the beginning.
AI agents, which are autonomous systems that can act on the internet like skilled human workers, are being developed by all major AI companies. And soon they could be able to simultaneously orchestrate large-scale manipulation campaigns, hack electoral systems, and coordinate cyber-attacks on fact-checkers—all while operating 24/7 at unprecedented scale.
Today, I want to propose two solutions to these democratic challenges. First, requiring AI agents to be Law-following by design. And second, strengthening the AI Office’s capacity to understand and address AI risks. Let me explain each.
Law-following AI requires AI systems to be architecturally constrained to refuse actions that would be illegal if performed by humans in the same position. Just as AIs are currently trained to decline to help build bombs, they would reject orders to violate constitutional rights or election laws.
Law-following AI is democratically compelling for three reasons: First, it is democratically legitimate. Laws represent our collective will, refined through democratic deliberation, rather than unilaterally determined corporate values. Second, it enables democratic adaptability. Laws can be changed through democratic processes, and AI agents designed to follow law can automatically adjust their behavior. Third, it offers a democratic shield—because without these constraints, we risk creating AI agents that blindly follow orders, and history has shown where blind obedience leads.
In practice, this would mean that AI agents bound by law would refuse orders to suppress political speech, manipulate elections, blackmail officials, or harass dissidents. This way, law-following AI could prevent authoritarian actors from using obedient AI agents to entrench their power. Of course, it can’t prevent all forms of manipulation—much harmful persuasion operates within legal bounds. But blocking AI agents from illegal attacks on democracy is a critical first step.
The EU’s Code of Practice on General-Purpose AI already recognizes this danger and identifies “lawlessness” as a model propensity that contributes to systemic risk. But just as we currently lack reliable methods to assess how persuasive AI systems are, we currently lack a way to reliably measure AI lawlessness.
And perhaps most concerningly—and this brings me to my second proposal—the AI Office currently lacks the institutional capacity to develop these crucial capabilities.
The AI Office needs sufficient technical, policy, and legal staff to rigorously analyze what companies submit under the Code of Practice and AI Act—to scrutinize their risk assessments, verify their mitigation measures, and spot gaps in their safety evaluations. In other words: When a company claims their AI agent is law-following, the AI Office must have the expertise and resources to independently test that claim. When developers report on persuasion capabilities—capabilities that even they may not fully understand—the AI Office needs experts who can identify what’s missing from those reports.
Rigorous evaluation isn’t just about compliance—it’s about how we learn: each assessment and each gap we identify builds our understanding of these systems. This is why adequate AI Office capacity matters: not just for evaluating persuasion capabilities or Law-following AI today, but for understanding and preparing for risks to democracy that grow with each model release.
To illustrate what the current resource gap looks like: Recent reports suggest Meta offered one AI researcher a salary package of €190 million. The AI Office—tasked with overseeing the entire industry—operates on less.
This gap between private power and public capacity is unsustainable for our democracy. If we’re serious about democracy, we must fund our institutions accordingly.
So to protect democracy, we can start with two things: AI agents bound by human laws, and an AI Office with the capacity to understand and evaluate the risks.
Thank you.
The full video can be watched here (starts 12:01:02).
Future frontiers for research in law and AI
LawAI’s Legal Frontiers team aims to incubate new law and policy proposals that are simultaneously:
- Anticipatory, in that they respond to a reasonable forecast of the legal and policy challenges that further advances in AI will produce
- Actionable, in that we can make progress within these workstreams even under significant uncertainty
- Accommodating to a wide variety of worldviews and technological trajectories, given the shared challenges that AI will create and the uncertainties we have about likely developments
- Ambitious, in that they both significantly reduce some of the largest risks from AI while also enabling society to reap its benefits
Currently, the Legal Frontiers team owns two workstreams.
However, the general vision behind Legal Frontiers is to continuously spin out mature workstreams, freeing us to identify and incubate new ones. To that end, we recently updated LawAI’s Workstreams and Research Directions document to list some “Future Frontiers” on which we might work in the future.
We don’t want people to wait for us to start working on these questions, though: they are already ripe for scholarly attention. Accordingly, we have reproduced those Future Frontiers here.
Regulating Government-Developed Frontier AI
Today, governments act primarily as consumers of frontier AI technologies. Frontier AI systems are developed mainly by private companies with little or no initial government involvement; those companies may then tailor their general frontier AI offerings to meet the particular needs of governmental customers.[ref 1] In other words, the private sector is generally responsible for the primary development of frontier AI models and systems, with governmental steering entering, if at all, later in the commercialization lifecycle.
However, as governments increasingly realize the significant strategic implications of frontier AI technologies, they may wish to become more directly involved in the development of frontier AI systems at earlier stages of the development cycle.[ref 2] This could range from frontier AI systems initially developed under government contract, to a fully governmental effort to develop next-generation frontier AI systems.[ref 3] Indeed, a 2024 report from the U.S.-China Economic and Security Review Commission called for Congress to “establish and fund a Manhattan Project-like program dedicated to racing to and acquiring an Artificial General Intelligence (AGI) capability.”[ref 4]
Existing proposals for the regulation of the development and deployment of frontier AI systems envision the imposition of such regulations on private businesses, under the implicit assumption that frontier AI development and deployment will remain private-led. If and when governments do take a larger role in the development of frontier AI systems, new regulatory paradigms will be needed. Such proposals need to identify and address unique challenges and opportunities that government-led AI development will pose, as compared to today’s private-led efforts.
Examples of possible questions in this workstream could include:
- How are safety and security risks in high-stakes governmental research projects (e.g., the Manhattan Project) usually regulated?
- How might the government steer development of frontier AI technologies if it wished to do so?
- What existing checks and balances would apply to a government program to develop frontier AI technologies?
- How would ideal regulation of government-directed frontier AI development vary depending on the mechanism used for such direction (e.g., contract versus government-run development)?
- How might ideal regulation of government-directed frontier AI development vary depending on whether the development is led by military or civilian parts of the government?
- If necessary to procure key inputs for the program (e.g., compute), how could the U.S. government collaborate with select allies on such programs?[ref 5]
Accelerating Technologies that Defend against Risks from AI
It is likely infeasible[ref 6] and/or undesirable[ref 7] to fully prevent the wide proliferation of many high-risk AI systems. There is therefore increasing interest in developing technologies[ref 8] to defend against possible harms from diffuse AI systems, and remedy those harms where defensive measures fail.[ref 9] Collectively, we call these “defensive technologies.”[ref 10]
Many of the most valuable contributions to the development and deployment of defensive technologies will not come from legal scholars, but rather from some combination of entrepreneurship, technological research and development, and funding. But legal change may also play a role in more directly accelerating the development and deployment of defensive technologies, such as by removing barriers to their adoption that raise the costs of research or reduce its rewards.[ref 11]
Examples of general questions that might be valuable to explore include:
- What are examples of existing policies that unnecessarily hinder research and development into defense-enhancing technologies, such as by (a) raising the costs of conducting that research, or (b) reducing the expected profits of deployment of defense-enhancing technologies?[ref 12]
- What are existing legal or policy barriers that inhibit effective diffusion of defensive technologies across society?[ref 13]
- How can the law preferentially[ref 14] accelerate defensive technologies?
Regulating Internal Deployment
Many existing AI policy proposals regulate AI systems at the point when they are first “deployed”: that is, made available for use by persons external to the developer. However, pre-deployment use of AI models by the developing company—“internal deployment”—may also pose substantial risks,[ref 15] and most policy proposals aimed at reducing large-scale risks from AI regulate AI only at or after the point of external deployment. Policy proposals for regulating internal deployment would therefore be valuable.
Example questions in the workstream might include:
- What existing modes of regulation in other AI industries are most analogous to regulation of internal deployment?[ref 16]
- How can the state identify which AI developers are appropriate targets for regulation of internal deployment?
- How can regulation of internal deployment simultaneously reduce risk and allow for appropriate exploration of model capabilities and risks?
- What are the constitutional (e.g., Fourth Amendment) limitations on regulation of internal deployment?
- How can regulation of internal deployment be designed to reduce risks of espionage and information leakage?
Fostering Legal Resilience by Rapidly Patching Legal Loopholes
AI technologies performing legal tasks will likely surface loopholes or gaps in the law: that is, actions permitted by the law but which policymakers would likely prefer to be prohibited. There are several reasons to expect this:
- AI itself constitutes a significant technological change, and technological changes often surface loopholes or gaps in the law.[ref 17]
- AI might accelerate technological change and economic growth,[ref 18] which will similarly often surface gaps or loopholes in the law.
- AI might be more efficient at finding gaps or loopholes in the law, and quickly exploiting them.
Given that lawmaking is a slow and deliberative process, actors can often exploit gaps or loopholes before policymakers can “patch” them. While this dynamic is not new, AI systems may be able to cause more harm or instability by finding or exploiting gaps and loopholes than humans have in the past, due to their greater speed of action, ability to coordinate, dangerous capabilities, and (possibly) lack of internal moral constraints.
This suggests that it may be very valuable for policymakers to “patch” legal gaps and loopholes by quickly enacting new laws. However, constitutional governance is often intentionally slow, deliberative, and decentralized, which suggests that accelerating lawmaking in certain ways may be unwise or even illegal.
This tension suggests that it would be valuable to research how new legislative and administrative procedures could quickly “patch” legal gaps and loopholes through new law while also complying with the letter and spirit of constitutional limitations on lawmaking.
Responsibly Advancing AI-Enabled Governance
Recent years have seen robust governmental interest in the use of AI technologies for administration and governance.[ref 19] As systems advance in capability, this may create significant risks of misuse,[ref 20] as well as potential safety risks from the deployment of advanced systems in high-stakes governmental infrastructure.
A recent report[ref 21] identifies the dual imperative for governments to:
- Quickly adopt AI technology to enhance state capacity, but
- Take care when doing so.
The report lays out three types of interventions worth considering:
- “‘Win-win’ opportunities” that help with both adoption and safety;[ref 22]
- “Risk-reducing interventions”; and
- “Adoption-accelerating interventions.”
Designing concrete policies in each of these categories is very valuable, especially policies in the first category, or policies in the second or third categories that do not come at the expense of the other goal.
Responsibly Automating Legal Processes
As AI systems are able to complete more of the tasks typically associated with traditional legal functions—drafting legislation and regulation, adjudicating,[ref 23] litigating, drafting contracts, counseling clients, negotiating, investigating possible violations of law, generating legal research—it will be natural to consider whether and how these tasks should be automated.
We can call AI systems performing such functions “AI lawyers.” If implemented well, AI lawyers could help with many of the challenges that AI could bring. AI lawyers could write new laws to regulate governmental development or use of frontier AI, monitor governmental uses of AI, and craft remedies for violations. AI lawyers could also identify gaps and loopholes in the law, accelerate negotiations between lawmakers, and draft legislative “patches” that reflect lawmakers’ consensus.
However, entrusting ever more power to AI lawyers entails significant risks. If AI lawyers are not themselves law-following, they may abuse their governmental station to the detriment of citizens. If such systems are not intent-aligned,[ref 24] entrusting AI systems with significant governmental power may make it easier for those systems to erode humanity’s control over human affairs. Regardless of whether AI lawyers are aligned, delegating too many legal functions to AI lawyers may frustrate important rule-of-law values, such as democratic responsiveness, intelligibility, and predictability. Furthermore, there are likely certain legal functions that it is important for natural persons to perform, such as serving as a judge on the court of last resort.
Research into the following questions may help humanity navigate the promises and perils of AI lawyers:
- Which legal functions should never be automated?
- Which legal functions, if entrusted to an AI lawyer, would significantly threaten democratic and rule-of-law values?
- How can AI lawyers enhance human autonomy and rule-of-law values?
- How can AI lawyers enhance the ability of human governments to respond to challenges from AI?
- What substantive safety standards should AI lawyers have to satisfy before being deployed in the human legal system?
- Which new legal checks and balances should be introduced if AI lawyers accelerate the speed of legal processes?
Accelerating Legal Technologies that Empower Citizens
Related to the above, there is also a question of how we can accelerate potential technologies that would defend against general risks to the rule of law and/or democratic accountability. For instance, as lawyers, we may also be particularly well-placed to advance legal reforms that make it easier for citizens to leverage “AI lawyers” to help them defend against vexatious litigation and governmental oppression, or pursue meritorious claims.[ref 25] For example, existing laws regulating the practice of law may impose barriers on citizens’ ability to leverage AI for their legal needs.[ref 26] This suggests further questions, such as:
- Who will benefit by default from the widespread availability of cheap AI lawyers?
- Will laws regulating the practice of law form a significant barrier to defensive (and other beneficial) applications of AI lawyers?
- How should laws regulating the practice of law accommodate the possibility of AI lawyers, especially those that are “defensive” in some sense?
- How might access to cheap AI lawyers affect the volume of litigation and pursuit of claims? If there is a significant increase, would this result in a counterproductive effect by slowing down court processing times or prompting the judicial system to embrace technological shortcuts?
Approval Regulation in a Decentralized World
After the release of GPT-4, a number of authors and policymakers proposed compute-indexed approval regulation, under which frontier AI systems trained with large amounts of compute would be subjected to heightened pre-deployment scrutiny.[ref 27] Such regulation was perceived as attractive in large part because, under the scaling paradigm that produced GPT-4, development of frontier AI systems depended on the use of a small number of large data centers, which could (in theory) be easily monitored.
However, subsequent technological developments that reduce the amount of centralized compute needed to achieve frontier AI capabilities (namely improvements in decentralized training[ref 28] and the rise of reasoning models)[ref 29] have cast serious doubts on the long-term viability of compute-indexed approval regulation as a method for preventing unapproved development of highly capable AI models.[ref 30]
It is not clear, however, that these developments mean that other forms of approval regulation for frontier AI development and deployment would be totally ineffective. Many activities are subject to reasonably effective approval regulation notwithstanding their highly distributed nature. For example, people generally respect laws requiring a license to drive a car, hunt, or practice law, even though these activities are very difficult for the government to reliably prevent ex ante. Further research into approval regulation for more decentralized activities could therefore help illuminate whether approval regulation for frontier AI development could remain viable, at an acceptable cost to other values (e.g., privacy, liberty), notwithstanding these developments in the computational landscape.
Examples of possible questions in this workstream could include:
- How effective are existing approval regulation regimes for decentralized activities?
- Which decentralized activities most resemble frontier AI development under the current computing paradigm?
- How do governments create effective approval regulation regimes for decentralized activities, and how might those mechanisms be applied to decentralized frontier AI development?
- How can approval regulation of decentralized frontier AI development be implemented at acceptable costs to other values (e.g., privacy, liberty, administrative efficiency)?
The case for AI liability
The debate over AI governance has intensified following recent federal proposals for a ten-year moratorium on state AI regulations. This preemptive approach threatens to replace emerging accountability mechanisms with a regulatory vacuum.
In his recent AI Frontiers article, Kevin Frazier argues in favor of a federal moratorium, seeing it as necessary to prevent fragmented state-level liability rules that would stifle innovation and disadvantage smaller developers. Frazier (an AI Innovation and Law Fellow at the University of Texas at Austin School of Law) also contends that, because the norms of AI are still nascent, it would be premature to rely on existing tort law for AI liability. Frazier cautions that judges and state governments lack the technical expertise and capacity to enforce liability consistently.
But while Frazier raises important concerns about allowing state laws to assign AI liability, he understates both the limits of federal regulation and the unique advantages of liability. Liability represents the most suitable policy tool for addressing many of the most pressing risks posed by AI systems. Its superiority stems from three basic advantages. Specifically, liability can:
- Function effectively despite widespread disagreement about the likelihood and severity of risks
- Incentivize optimal rather than merely reasonable precautions
- Address third-party harms where market mechanisms fail to do so
Frazier correctly observes that “societal norms around AI are still forming, and the technology itself is not yet fully understood.” However, I believe he draws the wrong conclusion from this observation. The profound disagreement among experts, policymakers, and the public about AI risks and their severity does not argue against using liability frameworks to curb potential abuses. On the contrary, it renders their use indispensable.
Disagreement and Uncertainty
The disagreement about AI risks reflects more than differences in technical assessment. It also encompasses fundamental questions about the pace of AI development, the likelihood of catastrophic outcomes, and the appropriate balance between innovation and precaution. Some researchers argue that advanced AI systems pose high-probability and imminent existential threats, warranting immediate regulatory intervention. Others contend that such concerns are overblown, arguing that premature regulation could stifle beneficial innovation.
Such disagreement creates paralysis in traditional regulatory approaches. Prescriptive regulation designed to address risks before they become reality — known in legal contexts as “ex ante,” meaning “before the fact” — generally entails substantial up-front costs that increase as rules become stricter. Passing such rules requires social consensus about the underlying risks and the costs we’re willing to bear to mitigate them.
When expert opinions vary dramatically about foundational questions, as they do in the case of AI, regulations may emerge that are either ineffectively permissive or counterproductively restrictive. The political process, which tends to amplify rather than resolve such disagreements, provides little guidance for threading this needle effectively.
Approval-based systems face similar challenges. In an approval-based system (for example, Food and Drug Administration regulations of prescription drugs), regulators must formally approve new products and technologies before they can be used. Thus, they depend on regulators’ ability to distinguish between acceptable and unacceptable risks — a difficult task when the underlying assessments remain contested.
Liability systems, by contrast, operate effectively even amid substantial disagreements. They do not require ex ante consensus about appropriate risk levels; rather, they assign “ex post” accountability. Liability scales automatically with risk, as revealed in cases where individual plaintiffs suffer real injuries. This obviates the need for ex ante resolution of wide social disagreement about the magnitude of AI risks.
Thus, while Frazier and I agree that governments have limited expertise in AI risk management, this actually strengthens rather than undermines the case for liability, which harnesses private-sector expertise through market incentives rather than displacing it through prescriptive rules.
Reasonable Care and Strict Liability
Frazier and I also share some common ground regarding the limits of negligence-based liability. Traditional negligence doctrine imposes a duty to exercise “reasonable care,” typically defined as the level of care that a reasonable person would exercise under similar circumstances. While this standard has served tort law well across many domains, AI systems present unique challenges that may render conventional reasonable care analysis inadequate for managing the most significant risks.
In practice, courts tend to engage in a fairly narrow inquiry when assessing whether a defendant exercised reasonable care. If an SUV driver runs over a pedestrian, courts generally do not inquire as to whether the net social benefits of this particular car trip justified the injury risk it generated for other road users. Nor would a court ask whether the extra benefits of driving an SUV (rather than a lighter-weight sedan) justified the extra risks the heavier vehicle posed to third parties. Those questions are treated as outside the scope of the reasonable care inquiry. Instead, courts focus on questions like whether the driver was drunk, or texting, or speeding.
In the AI context, I expect a similarly narrow negligence analysis that asks whether AI companies implemented well-established alignment techniques and safety practices. I do not anticipate questions about whether it was reasonable to develop an AI system with certain high-level features, given the current state of AI alignment and safety knowledge.
However, while negligence is limited in its ability to address broader upstream culpability, other forms of liability can still reach it. Under strict liability, defendants internalize the full social costs of their activities. This structure incentivizes investment in precaution up to the point where marginal costs equal marginal benefits. Such an alignment between private and social incentives proves especially valuable when reasonable care standards may systematically underestimate the optimal level of precaution.
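To see the economic logic, here is a minimal sketch of the standard law-and-economics model of precaution that this argument draws on (the notation is mine, for illustration only; it does not appear in Frazier’s piece or in any bill). Let $c(x)$ denote the cost of taking precaution level $x$, $p(x)$ the resulting probability of an accident, and $H$ the harm to third parties if the accident occurs. Under strict liability, the developer pays $H$ whenever harm materializes, so its private objective coincides with the social one:

$$\min_{x}\; c(x) + p(x)\,H \quad\Longrightarrow\quad c'(x^{*}) = -\,p'(x^{*})\,H.$$

That is, precaution is increased until its marginal cost equals the marginal reduction in expected harm. Under a narrowly applied negligence standard, by contrast, liability is cut off once precaution reaches the legally required level of care, so the defendant has little incentive to invest beyond that level even when the social optimum $x^{*}$ lies above it.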
Accounting for Third-Party Harms
Another key feature of liability systems is their capacity to address third-party harms: situations where AI systems cause damage to parties who have no contractual or other market relationship with the system’s operator. These scenarios present classic market failure problems where private incentives diverge sharply from social welfare — warranting some sort of policy intervention.
When AI systems harm their direct users, market mechanisms provide some corrective pressure. Users who experience harms from AI systems can take their business to competitors, demand compensation, or avoid such systems altogether. While these market responses may be imperfect — particularly when harms are difficult to detect or when users face switching costs — they do provide an organic feedback mechanism, incentivizing AI system operators to invest in safety.
Third-party harms present an entirely different dynamic. In such cases, the parties bearing the costs of system failures have no market leverage to demand safer design or operation. AI developers, deployers, and users internalize the benefits of their activities — revenue from users, cost savings from automation, competitive advantages from AI capabilities — while externalizing many of the costs onto third parties. Without policy intervention, this leads to systematic underinvestment in safety measures that protect third parties.
Liability systems directly address this externality problem by compelling AI system operators to internalize the costs they impose on third parties. When AI systems harm people, liability rules require AI companies to compensate victims. This induces AI companies to invest in safety measures that protect third parties. AI companies themselves are best positioned to identify such measures, with the range of potential mitigations including high-level system architecture changes, investing more in alignment and interpretability research, and testing and red-teaming new models before deployment, potentially including broad internal deployment.
The power of this mechanism is clear when compared with alternative approaches to the problem of mitigating third-party harms. Prescriptive regulation might require regulators to identify appropriate risk-mitigation measures ex ante, a challenging task given the rapid evolution of AI technology. Approval-based systems might prevent the deployment of particularly risky systems, but they provide limited ongoing incentives for safety investment once systems are approved. Only liability systems create continuous incentives for operators to identify and implement cost-effective safety measures throughout the lifecycle of their systems.
Moreover, liability systems create incentives for companies to develop safety expertise that extends beyond compliance with specific regulatory requirements. Under prescriptive regulation, companies have incentives to meet specified requirements but little reason to exceed them. Under liability systems, companies have incentives to identify and address risks even when those risks are not explicitly anticipated by regulators. This creates a more robust and adaptive approach to safety management.
State-Level Liability
Frazier’s concerns about a patchwork of state-level AI regulation deserve serious examination, but his analysis overstates both the likelihood and the problematic consequences of such inconsistency. His critique conflates different types of regulatory requirements, while ignoring the inherent harmonizing features of liability systems.
First, liability rules exhibit greater natural consistency across jurisdictions than other forms of regulation do. Frazier worries about “ambiguous liability requirements” and companies needing to “navigate dozens of state-level laws.” However, the common-law tradition underlying tort law creates pressures toward harmonization that prescriptive regulations lack. Basic negligence principles — duty, breach, causation, and damages — remain remarkably consistent across states, despite the absence of a federal mandate.
More importantly, strict liability regimes avoid patchwork problems entirely. Under strict liability, companies bear responsibility for harm they cause, regardless of their precautionary efforts or the specific requirements they meet. This approach creates no compliance component that could vary across states. A company developing AI systems under a strict liability regime faces the same fundamental incentive everywhere: Make your systems safe enough to justify the liability exposure they create.
Frazier’s critique of Rhode Island Senate Bill 358, which I helped design, reflects some mischaracterization of its provisions. The bill is designed to close a gap in current law where AI systems may engage in wrongful conduct, yet no one may be liable.
Consider an agentic AI system that a user instructs to start a profitable internet business. The AI system determines that the easiest way to do this is to send out phishing emails and steal innocent people’s identities. It also covers its tracks, so reasonable care on the part of the user would neither prevent nor detect this activity. In such a case, current Rhode Island law would require the innocent third-party plaintiffs to prove that the developers failed to adopt some specific precautionary measure that would have prevented the injury, which may not be possible.
Under SB 358, it would be sufficient for the plaintiff to prove that the AI system’s conduct would be a tort if a human engaged in it, and that neither the user nor an intermediary that fine-tuned or scaffolded the model had intended or could have reasonably foreseen the system’s tortious conduct. That is, the bill holds that when AI systems wrongfully harm innocent people, someone should be liable. If the user and any intermediaries that modified the system are innocent, the buck should stop with the model developer.
One concern with this approach is that the elements of some torts implicate the mental states of the defendant, and many people doubt that AI systems can be understood as having any mental states at all. For this reason, SB 358 creates a rebuttable presumption: where a judge or jury would infer that a human engaging in conduct similar to the AI system’s possessed the relevant mental state, the same inference should be drawn about the AI system.
AI Federalism
While state-level AI liability represents a significant improvement over the current regulatory vacuum, I do think there is an argument for federalizing AI liability rules. Alternatively, more states could adopt narrow, strict liability legislation (like Rhode Island SB 358) that would help close the current AI accountability gap.
A federal approach could provide greater consistency and reflect the national scope of AI system deployment. Federal legislation could also more easily coordinate liability rules with other aspects of AI governance, such as liability insurance requirements, safety testing requirements, disclosure obligations, and government procurement standards.
However, the case for federalization is not an argument against liability as a policy tool. Whether implemented at the state level or the federal level, liability systems offer unique advantages for managing AI risks that other regulatory approaches cannot match. The key insight is not that liability must be federal to be effective, but rather that liability — at whatever level — represents a superior approach to AI governance than either prescriptive regulation or approval-based systems.
Frazier’s analysis culminates in support for federal preemption of state-level AI liability, noting that the US House reconciliation bill includes “a 10-year moratorium on a wide range of state AI regulations.” But this moratorium would replace emerging state-level accountability mechanisms with no accountability at all.
The proposed 10-year moratorium would leave two paths for responding to AI risks. One path would be for Congress to pass federal legislation. Confidence in such a development would be misplaced given Congress’s track record on technology regulation.
The second path would be to accept a regulatory vacuum where AI risks remain entirely unaddressed through legal accountability mechanisms. Some commentators (I’m not sure if Frazier is among them) actively prefer this laissez-faire scenario to a liability-based governance framework, claiming that it best promotes innovation to unlock the benefits of AI. This view is deeply mistaken. Concerns that liability will chill innovation are overstated. If AI holds the promise that Frazier and I think it does, there will still be very strong incentives to invest in it, even after developers fully internalize the technology’s risks.
What we want to promote is socially beneficial innovation that does more good than harm. Making AI developers pay when their systems cause harm balances their incentives and advances this larger goal. (Similarly, requiring companies to pay for the harms of pollution makes sense, even when that pollution is a byproduct of producing useful goods or services like electricity, steel, or transportation.)
In a world of deep disagreement about AI’s risks and benefits, abandoning emerging liability mechanisms risks creating a dangerous regulatory vacuum. Liability’s unique strengths — adapting dynamically, incentivizing optimal safety investments, and addressing third-party harms — make it indispensable. Whether at the state level or the federal level, liability frameworks should form the backbone of any effective AI governance strategy.
How to design AI whistleblower legislation
Key takeaways
- The most important existing whistleblower protection law for employees at frontier AI companies is California Labor Code § 1102.5, which protects California workers from being fired or otherwise retaliated against for reporting violations of any law or regulation to the government or internally within their company.
- However, there are gaps in that statute that should be addressed by future federal and/or state AI whistleblower legislation.
- Most importantly, the California statute doesn’t protect whistleblowers who disclose information about serious risks to public safety that don’t involve a violation of any existing law.
- Additionally, frontier AI companies can neutralize whistleblower statutes by requiring employees to sign broad nondisclosure agreements—unless the statute in question includes a provision stating that such agreements are unenforceable.
- Lawmakers should strike a balance between companies’ legitimate interest in protecting their valuable trade secrets and the public’s interest in safety and effective law enforcement.
- Key decision points for lawmakers designing AI whistleblower legislation include:
  - Whether to establish a reporting process for disclosures (e.g. a government office charged with securely handling whistleblower disclosures or a designated hotline)
  - How broad the scope of a statute’s protections should be—who should be covered, and for what kinds of disclosures
  - Whether to prohibit contractual waivers of whistleblower protections, e.g. in nondisclosure agreements
If you follow the public discourse around AI governance at all (and, since you’re reading this, the odds of that are pretty good), you may have noticed that people tend to gravitate towards abstract debates about whether “AI regulation,” generally, is a good or a bad idea. The two camps were at each other’s throats in 2024 over California SB 1047, and before that bill was vetoed it wasn’t uncommon to see long arguments, ostensibly about the bill, that contained almost zero discussion of any of the actual things that the bill did.
That’s to be expected, of course. Reading statutes cover-to-cover can be a boring and confusing chore, especially if you’re not a lawyer, and it’s often reasonable to have a strong opinion on the big-picture question (“is frontier AI regulation good?”) without having similarly confident takes about the fine details of any specific proposal. But zooming in and evaluating specific proposals on their own merits has its advantages—not the least of which is that it sometimes reveals a surprising amount of consensus around certain individual policy ideas that seem obviously sensible.
One such idea is strengthening whistleblower protections for employees at frontier AI companies. Even among typically anti-regulation industry figures, whistleblower legislation has proven less controversial than one might have expected. For example, SB 53, a recent state bill that would expand the scope of the protection offered to AI whistleblowers in California, has met with approval from some prominent opponents of its vetoed predecessor, SB 1047. The Working Group on frontier AI that Governor Newsom appointed after he vetoed SB 1047 also included a section on the importance of protecting whistleblowers in its draft report.
There also seems to be some level of potential bipartisan support for whistleblower protection legislation at the federal level. Federal AI legislation has been slow in coming; hundreds of bills have been proposed, but so far nothing significant has actually been enacted. Whistleblower laws, which are plausibly useful for mitigating a wide variety of risks, minimally burdensome to industry, and easy to implement and enforce, seem like a promising place to start. And while whistleblower laws have sometimes been viewed in the past as Democrat-coded pro-labor measures, the increase in conservative skepticism of big tech companies in recent years and the highly public controversy regarding the restrictive contracts that OpenAI pressured departing employees to sign in 2024 seem to have given rise to some interest in protecting AI whistleblowers from the other side of the aisle as well.
Okay, so now you’re sold on the value of AI whistleblower legislation. Naturally, the next step is to join the growing chorus of voices desperately crying out for a medium-dive LawAI blog post explaining the scope of the protections that AI whistleblowers currently enjoy, the gaps that need to be addressed by future legislation, and the key decision points that state and federal lawmakers designing whistleblower statutes will confront. Don’t worry, we’re all over it.
1. What do whistleblower laws do?
The basic idea behind whistleblower protection laws is that employers shouldn’t be allowed to retaliate against employees who disclose important information about corporate wrongdoing through the proper channels. The core example of the kind of behavior that whistleblower laws are meant to protect is that of an employee who notices that his employer is breaking the law and reports the crime to the authorities. In that situation, it’s generally accepted that allowing the employer to fire (or otherwise retaliate against) the employee for blowing the whistle would discourage people from coming forward in the future. In other words, the public’s interest in enforcing laws justifies a bit of interference with freedom of contract in order to prevent retaliation against whistleblowers. Typically, the remedy available to a whistleblower who has been retaliated against is that they can sue the employer, or file an administrative complaint with a government agency, seeking compensation for whatever harm they’ve suffered—often in the form of a monetary payment, or being given back the job from which they were fired.
Whistleblowing can take many forms that don’t perfectly conform to that core example of an employee reporting some law violation by their employer to the government. For instance, the person reporting the violation might be an independent contractor rather than an employee, or might report some bad or dangerous action that didn’t technically violate the law, or might report their information internally within the company or to a media outlet rather than to the government. Whether these disclosures are protected by law depends on a number of factors.
2. What protections do AI whistleblowers in the U.S. currently have?
Currently, whistleblowers in the U.S. are protected (or, as the case may be, unprotected) by a patchwork of overlapping state and federal statutes, judicially created doctrines, and internal company policies. By default, private sector whistleblowers[ref 1] are not protected from retaliation by any federal statute, although they may be covered by state whistleblower protections and/or judicially created anti-retaliation doctrines. However, there are a number of industry- and subject-matter-specific federal statutes that protect certain whistleblowers from retaliation. For example, the Federal Railroad Safety Act protects railroad employees from being retaliated against for reporting violations of federal law relating to railroad safety or gross misuse of railroad-related federal funds; the Food Safety Modernization Act affords comparable protections to employees of food packing, processing, manufacturing, and transporting companies; and the Occupational Safety and Health Act prohibits employers generally from retaliating against employees for filing OSHA complaints.
The scope of the protections afforded by these statutes varies, as do the remedies that each statute provides to employees who have been retaliated against. Some only cover employees who report violations of federal laws or regulations to the proper authorities; others cover a broader range of whistleblowing activity, such as reporting dangerous conditions even when they don’t arise from any violation of a law or rule. Most allow employees who have been retaliated against either to file a complaint with OSHA or to sue the offending employer for damages in federal court, and a few even provide substantial financial incentives for whistleblowers who provide valuable information to the government.[ref 2]
Employees who aren’t covered by any federal statute may still be protected by their state’s whistleblower laws. In the context of the AI industry, the most important state is California, where most of the companies that develop frontier models are headquartered. California’s whistleblower protection statute is quite strong—it protects both public and private employees from retaliation for reporting violations of any state, federal, or local law or regulation to a government agency or internally within their company. It also prohibits employers from adopting any internal policies to prevent employees from whistleblowing. The recently introduced SB 53 would, if enacted, additionally protect employees and contractors working at frontier AI companies from retaliation for reporting information about “critical risk” from AI models.
Even when there are no applicable state or federal statutes, whistleblowers may still be protected by the “common law,” i.e., law created by judicial decisions rather than by legislation. These common law protections vary widely by state, but typically at a minimum prohibit employers from firing employees for a reason that contravenes a clearly established “public policy.”[ref 3] What exactly constitutes a clearly established public policy in a given state depends heavily on the circumstances, but whistleblowing often qualifies when it provides a public benefit, such as increasing public safety or facilitating effective law enforcement. However, it’s often difficult for a whistleblower (even with the assistance of a lawyer) to predict ex ante whether common law protections will apply because so much depends on how a particular court might apply existing law to a particular set of facts. Statutory protections are generally preferable because they provide greater certainty and can cover a broader range of socially desirable whistleblowing behavior.
3. Restrictions on whistleblowing: nondisclosure agreements and trade secrets
a. Nondisclosure and non-disparagement agreements
The existing protections discussed above are counterbalanced by two legal doctrines that can limit the applicability of anti-retaliation measures: the law of contracts and the law of trade secrets. Employers (especially in the tech industry) often require their employees to sign broad nondisclosure agreements that prohibit the employees from sharing certain confidential information outside of the company. It was this phenomenon—the use of NDAs to silence would-be whistleblowers—that first drew significant legislative and media attention to the issue of AI whistleblowing, when news broke that OpenAI had required departing employees to choose between signing contracts with broad nondisclosure and non-disparagement provisions or giving up their vested equity in the company. Essentially, the provisions would have required former employees to avoid criticizing OpenAI for the rest of their lives, even on the basis of publicly known facts, and even if they did not disclose any confidential information in doing so. In response to these provisions, a number of OpenAI employees and former employees wrote an open letter calling for a “right to warn about artificial intelligence” and had their lawyers write to the SEC arguing that OpenAI’s NDAs violated various securities laws and SEC regulations.
After news of the NDAs’ existence went public, OpenAI quickly apologized for including the problematic provisions in its exit paperwork and promised to remove the provisions from future contracts. But the underlying legal reality that allowed OpenAI to pressure employees into signing away their right to blow the whistle hasn’t changed. Typically, U.S. law assigns a great deal of value to “freedom of contract,” which means that mentally competent adults are usually allowed to sign away any rights they choose to give up unless the contract in question would violate some important public policy. Courts sometimes hold that NDAs are unenforceable against legitimate whistleblowers because of public policy considerations, but the existence of an NDA can be a powerful deterrent to a potential whistleblower even when there’s some chance that a court would refuse to enforce the contract.
By default, AI companies still have the power to prevent most kinds of whistleblowing in most jurisdictions by requiring employees to sign restrictive NDAs. And even companies that don’t specifically intend to prevent whistleblowing might take a “better safe than sorry” approach and adopt NDAs so broad and restrictive that they effectively deter whistleblowers. Of course, employees have the option of quitting rather than agreeing to sign, but very few people in the real world seriously consider doing that when they’re filling out hiring paperwork (or when they’re filling out departure paperwork and their employer is threatening to withhold their vested equity, as the case may be).
b. Trade secret law
Historically, frontier AI developers have often recognized that their work has immense public significance and that the public therefore has a strong interest in access to information about models. However, this interest is sometimes in tension with both the commercial interests of developers and the public’s interest in public safety. This tension is at the heart of the debate over open source vs. closed models, and it gave rise to the ironic closing-off of “OpenAI.”
The same tension also exists between the public’s interest in protecting whistleblowers and the interests of both companies and the public in protecting trade secrets. An overly broad whistleblower law that protected all employee disclosures related to frontier models would allow companies to steal model weights and algorithmic secrets from their competitors by simply poaching individual employees with access to the relevant information. In addition to being unfair, this would harm innovation in the long run, because a developer has less of an incentive to invest in research if any breakthroughs will shortly become available to its competitors. Furthermore, an overbroad whistleblower law might also actually create risks to public safety if it protected the public disclosure of information about dangerous capabilities that made it easier for bad actors or foreign powers to replicate those capabilities.
A “trade secret” is a piece of information, belonging to a company that makes reasonable efforts to keep it secret, that derives economic value from being kept secret. Wrongfully disclosing trade secrets is illegal under both state and federal law, and employees who disclose trade secrets can be sued or even criminally charged. Since 2016, however, the Defend Trade Secrets Act has provided immunity from both civil and criminal liability for disclosing a trade secret if the disclosure is made “(i) in confidence to a Federal, State, or local government official, either directly or indirectly, or to an attorney; and (ii) solely for the purpose of reporting or investigating a suspected violation of law.” In other words, the status quo for AI whistleblowers is essentially that they can disclose trade secret information only if the information concerns a violation of the law and only if they disclose it confidentially to the government, perhaps via a lawyer.
4. Why is it important to pass new AI whistleblower legislation?
Most of the employees working on the frontier models that are expected to generate many of the most worrying AI risks are located in California and entitled to the protection of California’s robust whistleblower statute. There are also existing common law and federal statutory protections that might prove relevant in a pinch; the OpenAI whistleblowers, for example, wrote to the SEC arguing that OpenAI’s NDAs violated the SEC’s rule against NDAs that fail to exempt reporting to the SEC about securities violations. However, there are important gaps in existing whistleblower protections that should be addressed by new federal and state legislation.
Most importantly, the existing California whistleblower statute only protects whistleblowers who report a violation of some law or regulation. But, as a number of existing federal and state laws recognize, there are times when information about significant risks to public safety or national security should be disclosed to the proper authorities even if no law has been broken. Suppose, for example, that internal safety testing demonstrates that a given model can, with a little jailbreaking, be coaxed into providing extremely effective help to a bad actor attempting to manufacture bioweapons. If an AI company chooses to deploy the model anyway, and an employee who worked on safety testing the model wants to bring the risk to the government’s attention through the proper channels, it seems obvious that they should be protected from retaliation for doing so. Unless the company’s actions violated some law or regulation, however, California’s existing whistleblower statute would not apply. To fill this gap, any federal AI whistleblower statute should protect whistleblowers who report information about significant risks from AI systems through the proper channels even if no law has been violated. California’s SB 53 would help to address this issue, but the scope of that statute is so narrow that additional protections would still be useful even if SB 53 is enacted.
Additionally, readers who followed the debate over SB 1047 may recall a number of reasons for preferring a uniform federal policy to a policy that applies only in one state, no matter how important that state is. Not every relevant company is located in California, and there’s no way of knowing for certain where all of the companies that will be important to the development of advanced AI systems in the future will be located. Federal AI whistleblower legislation, if properly scoped, would provide consistency and eliminate the need for an inconsistent patchwork of state protections.
New whistleblower legislation specifically for AI would also provide clarity to potential whistleblowers and raise the salience of AI whistleblowing. By default, many people who could come forward with potentially valuable information will not do so. Anything that reduces the level of uncertainty potential whistleblowers face and eliminates some of the friction involved in the disclosure process is likely to increase the number of whistleblowers who decide to come forward. Even an employee who would have been covered by existing California law or by common-law protections might be more likely to come forward if they saw, for example, a news item about a new statute that more clearly and precisely established protections for the kind of disclosure being contemplated. In other words, “whistleblowing systems should be universally known and psychologically easy to use – not just technically available.”
5. Key decision points for whistleblower legislation
There are also a number of other gaps in existing law that new state or federal whistleblower legislation could fill. This section discusses three of the most important decision points that lawmakers crafting state or federal AI whistleblower legislation will encounter: whether and how to include a formal reporting process, what the scope of the included protections should be, and whether to prohibit contracts that waive whistleblower protections.[ref 4]
a. Reporting process
Any federal AI whistleblower bill should include a formal reporting process for AI risks. This could take the form of a hotline or a designated government office charged with receiving, processing, and perhaps responding to AI whistleblower disclosures. Existing federal statutes that protect whistleblowers who report on hazardous conditions, such as the Federal Railroad Safety Act and the Surface Transportation Assistance Act, often direct an appropriate agency to promulgate regulations[ref 5] establishing a process by which whistleblowers can report “security problems, deficiencies, or vulnerabilities.”
The main benefit of this approach would be the creation of a convenient default avenue for reporting, but there would also be incidental benefits. For example, the existence of a formal government channel for reporting might partially address industry concerns about trade secret protection and the secure processing of sensitive information, especially if the established channel was the only legally protected avenue for reporting. Establishing a reporting process also provides some assurance to whistleblowers that the information they disclose will come to the attention of the government body best equipped to process and respond appropriately to it.[ref 6] Ideally, the agency charged with receiving reports would have preexisting experience with the secure processing of information related to AI security; if the Trump administration elects to allow the Biden administration’s reporting requirements for frontier AI developers to continue in some form, the natural choice would be whatever agency is charged with gathering and processing that information (currently the Department of Commerce’s Bureau of Industry and Security).
b. Scope of protection
Another key decision point for policymakers is the determination of the scope of the protection offered to whistleblowers—in other words, the actions and the actors that should be protected. California’s SB 53, which was clearly drafted to minimize controversy rather than to provide the most robust protection possible, only protects a whistleblower if either:
(a) the whistleblower had “reasonable cause to believe” that they were disclosing information regarding “critical risk,” defined as—
  - a “foreseeable and material risk” of
  - killing or seriously injuring more than 100 people or causing at least one billion dollars’ worth of damage, via
  - one of four specified harm vectors—creating CBRN[ref 7] weapons, a cyberattack, loss of control, or AI model conduct with “limited human intervention” that would be a crime if committed by a human, or
(b) the whistleblower had reasonable cause to believe that their employer had “made false or misleading statements about its management of critical risk”
This is a hard standard to meet. It’s plausible that an AI company employee could be aware of some very serious risk that didn’t threaten a full billion dollars in damage—or even a risk that did threaten hundreds of lives and billions of dollars in damages, but not through one of the four specified threat vectors—and yet not be protected under the statute. Imagine, for example, that internal safety testing at an AI lab showed that a given frontier model could, with a little jailbreaking, provide extremely effective guidance on how to build conventional explosives and use them to execute terrorist attacks. Even if the lab withheld those test results and issued false public statements about its model’s evaluation results, a potential whistleblower would likely not be protected under SB 53 for reporting them: conventional explosives are not among the four specified harm vectors, so the risk is not a “critical risk” as defined, and the lab’s false statements therefore arguably do not concern its “management of critical risk” either.
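For readers who find it easier to see the gap in schematic form, here is a rough, purely illustrative sketch of the statute’s protection test as summarized above. The names, the numbers, and the set of harm vectors are paraphrases of the bill’s definitions rather than statutory text, and the reading of prong (b) is the one argued for in the preceding paragraph.

```python
# A rough, purely illustrative sketch of SB 53's protection test as summarized
# above. Names, numbers, and the set of harm vectors are paraphrases of the
# bill's definitions, not statutory text.
QUALIFYING_VECTORS = {"cbrn_weapon", "cyberattack", "loss_of_control", "autonomous_crime"}


def is_critical_risk(deaths_or_serious_injuries: int, damage_usd: int, vector: str) -> bool:
    """A foreseeable, material risk above the severity thresholds, arising
    through one of the four listed harm vectors."""
    severe_enough = deaths_or_serious_injuries > 100 or damage_usd >= 1_000_000_000
    return severe_enough and vector in QUALIFYING_VECTORS


def protected(discloses_critical_risk: bool, misled_about_critical_risk: bool) -> bool:
    # Prong (a): the disclosure concerns a "critical risk".
    # Prong (b): the employer made false or misleading statements about its
    # management of critical risk.
    return discloses_critical_risk or misled_about_critical_risk


# The conventional-explosives hypothetical: severe harm, but not via a listed vector.
has_critical_risk = is_critical_risk(
    deaths_or_serious_injuries=150,
    damage_usd=50_000_000,
    vector="conventional_explosives",
)  # False

# On this reading, the lab's false statements about its evaluation results do not
# concern "critical risk" management (there is no critical risk), so prong (b)
# also fails and the would-be whistleblower is unprotected.
print(protected(has_critical_risk, misled_about_critical_risk=False))  # False
```

Under the broader “substantial and specific danger” standard discussed next, the same disclosure would plausibly be protected.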
Compare that standard to the one in Illinois’ whistleblower protection statute, which instead protects any employee who discloses information while having a “good faith belief” that the information relates to an activity of their employer that “poses a substantial and specific danger to employees, public health, or safety.”[ref 8] This protection applies to all employees in Illinois,[ref 9] not just employees at frontier AI companies. The federal Whistleblower Protection Act, which applies to federal employees, uses a similar standard—the whistleblower must “reasonably believe” that their disclosure is evidence of a “substantial and specific danger to public health or safety.”
Both of those laws apply to a far broader category of workers than an industry-specific frontier AI whistleblower statute would, and they both allow the disclosure to be made to a relatively wide range of actors. It doesn’t seem at all unreasonable to suggest that AI whistleblower legislation, whether state or federal, should similarly protect disclosures when the whistleblower believes in good faith that they’re reporting on a “substantial and specific” potential danger to public health, public safety, or national security. If labs are worried that this might allow for the disclosure of valuable trade secrets, the protection could be limited to employees who make their reports to a designated government office or hotline that can be trusted to securely handle the information it receives.
In addition to specifying the kinds of disclosures that are protected, a whistleblower law needs to provide clarity on precisely who is entitled to receive protection for blowing the whistle. Some whistleblower laws cover only “employees,” and define that term to exclude, e.g., independent contractors and volunteers. This kind of restriction would be inadvisable in the AI governance context. Numerous proposals have been made for various kinds of independent, and perhaps voluntary, third party testing and auditing of frontier AI systems. The companies and individuals conducting those tests and audits would be well-placed to become aware of new risks from frontier models. Protecting the ability of those individuals to securely and confidentially report risk-related information to the government should be a priority. Here, the scope of California’s SB 53 seems close to ideal—it covers contractors, subcontractors, and unpaid advisors who work for a business as well as ordinary employees.
c. Prohibiting contractual waivers of whistleblower protections
The ideal AI whistleblower law would provide that its protections could not be waived by an NDA or any similar contract or policy. Without such a provision, the effectiveness of any whistleblower law could be blunted by companies requiring employees to sign a relatively standard broad NDA, even if the company didn’t specifically intend to restrict whistleblowing. While a court might hold that such an NDA was unenforceable under common law principles, the uncertainty surrounding how a given court might view a given set of circumstances means that even an unenforceable NDA might have a significant impact on the likelihood of whistleblowers coming forward.
It is possible to pass laws directly prohibiting contracts that discourage whistleblowing—the SEC, for example, often brings charges under the Securities Exchange Act against companies that require employees to sign broad nondisclosure agreements if those agreements don’t include an exception allowing whistleblowers to report information to the SEC. A less controversial approach might be to declare such agreements unenforceable; this, for example, is what 18 U.S.C. § 1514A (another federal law relating to whistleblowing in the securities context) does. California’s SB 53 and some other state whistleblower laws do something similar, but with one critical difference—they prohibit employers from adopting “any rule, regulation, or policy” preventing whistleblowing, without specifically mentioning contracts. The language in SB 53, while helpful, likely wouldn’t cover individualized nondisclosure agreements that aren’t the result of a broader company policy.[ref 10] In future state or federal legislation, it would be better to use language more like the language in 18 U.S.C. § 1514A, which states that “The rights and remedies provided for in this section may not be waived by any agreement, policy form, or condition of employment, including by a predispute arbitration agreement.”
Conclusion
Whistleblower protections for employees at frontier AI companies are a fairly hot topic these days. Numerous state bills have been introduced, and there’s a good chance that federal legislation will follow. The idea seems to have almost as much currency with libertarian-minded private governance advocates as it does with European regulators: California SB 813, the recent proposal for establishing a system of “semiprivate standards organizations” to privately regulate AI systems, would require would-be regulators to attest to their plan for “implementation and enforcement of whistleblower protections.”
There’s reasonably widespread agreement, in other words, that it’s time to enact protections for AI whistleblowers. This being the case, it makes sense for policymakers and commentators who take an interest in this sort of thing to develop some informed opinions about what whistleblower laws are supposed to do and how best to design a law that does those things.
Our view is that AI whistleblower laws are essentially an information-gathering authority—a low-cost, innovation-friendly way to tweak the incentives of people with access to important information so that they’re more likely to make disclosures that benefit the public interest. It’s plausible that, from time to time, individual workers at the companies developing transformative AI systems will become aware of important nonpublic information about risks posed by those systems. Removing obstacles to disclosing that information will, on the margin, encourage additional disclosures and benefit the public. But passing “an AI whistleblower law” isn’t enough. Anyone trying to design such a law will face a number of important decisions about how to structure the offered protections and how to balance companies’ legitimate interest in safeguarding confidential information against the public’s interest in transparency. There are better and worse ways of proceeding, in other words; the idea behind this post was to shed a bit of light on which are which.
The National Security Memo on AI: what to expect in Trump 2.0
Any opinions expressed in this post are those of the author and do not reflect the views of the Institute for Law & AI or the U.S. Department of Defense.
On October 24, 2024, President Biden’s National Security Advisor Jake Sullivan laid out the U.S. government’s “first-ever strategy for harnessing the power and managing the risks of AI to advance [U.S.] national security.”[ref 1] The National Security Memorandum on AI (NSM) was initially seen as a major development in U.S. national security policy, but, following former President Donald Trump’s victory in the 2024 election, it is unclear what significance the NSM retains. If he is so inclined, President Trump can rescind the NSM on his first day in office, as he has promised to do with President Biden’s 2023 AI Executive Order (EO). But national security has traditionally been a policy space with a significant degree of continuity between administrations, and at least some of the policy stances embodied in the NSM seem consistent with the first Trump administration’s approach to issues at the intersection of AI and national security.
So, does the NSM still matter? Will the incoming administration repeal it completely, or merely amend certain provisions while leaving others in place? And what, if anything, might any repealed provisions be replaced with? While other authors have already provided comprehensive analyses of the NSM’s provisions and its accompanying framework, none have focused their assessments on how the documents will fare under the incoming administration. This blog post attempts to fill that gap by analyzing how President Trump and his key advisors may change or continue some of the NSM’s most significant provisions. In summary:
- Bias and Discrimination: The Trump Administration is likely to repeal NSM provisions that focus on issues of bias and discrimination, including its recognition of bias and discrimination as one of nine core AI risk categories.
- “Safe, Secure, and Trustworthy” AI: The NSM contains a number of provisions directing agencies to establish risk management practices, create benchmarks and standards, conduct testing, and release guidance for evaluating the safety, security, and trustworthiness of advanced AI systems. These provisions do not impose mandatory obligations on private companies, and may survive for that reason. However, conservatives may object to the NSM’s focus on mitigating risks rather than encouraging adoption through deregulation.
- Responding to Foreign Threats, Particularly from China: The incoming administration appears likely to continue the NSM’s initiatives aimed at slowing AI progress by China and other U.S. adversaries—including directing the Intelligence Community to focus on threats to the U.S. AI ecosystem and strengthening inbound investment screening—though potentially with a revised approach more consistent with President Trump’s explicit strategic focus on China during his first term.
- Infrastructure: The Trump administration seems poised to expand upon or at least continue the NSM’s efforts to develop the energy production capabilities necessary for expected future AI development and deployment needs, to strengthen domestic chip production, and to make AI resources accessible to researchers who lack private-sector funding, while increasing the government’s efficiency in using its own AI resources.
- Talent and Immigration: President Trump appears likely to continue the NSM’s initiatives aimed at better recruiting and retaining AI talent across the national security enterprise, but whether he will accept the NSM’s provisions relating to high-skilled immigration is uncertain.
Background
Created in response to a directive in President Biden’s AI EO,[ref 2] the National Security Memorandum on AI (NSM) was a major national security policy priority for the Biden administration. Few technologies over the last 75 years have received similar top-level, interagency attention; the Biden administration officials who designed the NSM have said that they took inspiration from historical efforts to compete against the Soviets in nuclear and space technologies. The NSM is detailed, specific, and lengthy, coming in at more than twice the length of any other national security memorandum issued by the Biden administration other than the National Security Memorandum on Critical Infrastructure Security and Resilience (NSM-22).
Relative to some of the Biden administration’s other AI policy documents, the NSM more narrowly focuses on the strategic consequences of AI for U.S. national security. It identifies AI as an “era-defining technology”[ref 3] and paints a picture of the United States in a great power competition that, at its core, is a struggle for technological supremacy.[ref 4] The NSM argues that, if the United States does not act now using a coordinated, responsible, and whole-of-society approach to take advantage of AI advances, it “risks losing ground to strategic competitors”[ref 5] and that this lost technological edge “could threaten U.S. national security, bolster authoritarianism worldwide, undermine democratic institutions and processes, facilitate human rights abuses, and weaken the rules-based international order.”[ref 6]
Where previous Biden administration AI documents either took a non-sector-specific approach,[ref 7] excluded non-national security systems,[ref 8] focused guidance narrowly on autonomous and semi-autonomous weapon systems,[ref 9] or provided high-level principles rather than concrete direction,[ref 10] the NSM requires follow-through by all government agencies across the national security enterprise and helps enable that follow-through with concrete implementation guidance. Specifically, the NSM includes more than 80 compulsory assignments[ref 11] to relevant agencies in support of efforts to promote and secure U.S. leadership in AI (focusing particularly on frontier AI models[ref 12]), harness AI to achieve U.S. national security goals, and engage with other countries and multilateral organizations to influence the course of AI development efforts around the world in a direction consistent with U.S. values and interests.[ref 13] Those assignments to agencies seek to accelerate domestic AI development while slowing the development of U.S. adversaries’ capabilities and managing technological risks, including “AI safety, security, and trustworthiness.”[ref 14] Inside the national security enterprise, the NSM seeks to enable effective and responsible AI use while ensuring agencies can manage the technology’s risks.
To the same ends, the NSM provides and requires agencies to follow a separate[ref 15] governance and risk management framework for “AI used as a component of a National Security System.”[ref 16] The framework sets concrete boundaries for national security agencies’ responsible adoption of AI systems in several ways.[ref 17] First, it delineates AI use restrictions and minimum risk management safeguards for specific use cases, ensuring agencies know what they can and cannot legally use AI for and when they must take more thorough risk reduction measures before a given stage of the AI lifecycle. The framework also requires agencies to catalog and monitor their AI use, facilitating awareness and accountability for all AI uses up the chain of command. Lastly, the framework requires agencies to establish standardized training and accountability requirements and guidelines to ensure their personnel’s responsible use and development of AI.
Logistics of a Repeal
Presidents are generally free to revoke, replace, or modify the presidential memoranda issued by their predecessors as they choose, without permission from Congress. To repeal the NSM, President Trump could issue a new memorandum rescinding the entire NSM (and the accompanying framework) or repealing certain provisions while retaining others. A new Executive Order, potentially with the broader purpose of repealing the Biden AI EO, could serve the same function. Both of these options would typically include a policy review led by the National Security Council to assess the status quo and recommend updates, though each presidential administration has revised the exact process to fit their needs. If the NSM does not end up being a top priority, President Trump could also informally direct[ref 18] national security agency heads to stop or change their implementation of some of the NSM’s provisions before he issues a formal policy document.
Bias and Discrimination
The first NSM provisions on the chopping block will, in all likelihood, be those that focus on bias and discrimination. President Trump and conservatives across the board have vowed to “stop woke and weaponized government” and generally view many of the Biden administration’s policies in this arena as harming U.S. competitiveness and growth, stifling free speech, and negatively impacting U.S. homeland and national security. In the AI context, the 2024 GOP Platform promised to repeal the Biden EO, stating that it “hinders AI Innovation,… imposes Radical Leftwing ideas on the development of this technology,” and restricts freedom of speech.
While not as focused on potentially controversial social issues as some other Biden administration AI policy documents,[ref 19] the NSM does contain several provisions to which the Trump administration will likely object. Specifically, the incoming administration seems poised to cut the NSM’s recognition of “discrimination and bias” as one of nine core AI risk categories[ref 20] that agency heads must “monitor, assess, and mitigate” in their agencies’ development and use of AI.[ref 21] Additionally, the incoming administration may repeal or revise M-24-10—a counterpart to the NSM’s framework that addresses AI risks outside of the national security context—effectively preventing the NSM’s framework from incorporating M-24-10’s various “rights-impacting” use cases.[ref 22] These provisions are easily severable from the current NSM and its framework.
“Safe, Secure, and Trustworthy” AI
One primary focus of the NSM is facilitating the “responsible” adoption of AI by promoting the “safety, security, and trustworthiness” of AI systems through risk management practices, standard-setting, and safety evaluations. In many ways, the NSM’s approach in these sections is consistent with aspects of the first Trump administration’s AI policy, but there is growing conservative and industry support for a deregulatory approach to speed up AI adoption.
The first Trump administration kickstarted federal government efforts to accelerate AI development and adoption with two AI-related executive orders issued in 2019 and 2020. At the time, the administration saw trust and safety as important factors for facilitating adoption of AI technology; the 2019 EO noted that “safety and security concerns” were “barriers to, or requirements associated with” widespread AI adoption, emphasized the need to “foster public trust and confidence in AI technologies and protect civil liberties, privacy, and American values,” and required the National Institute of Standards and Technology (NIST) to develop a plan for the federal government to assist in the development of technical standards “in support of reliable, robust, and trustworthy” AI systems.[ref 23] The 2020 EO sought to “promot[e] the use of trustworthy AI in the federal government” in non-national security contexts and, in service of this goal, articulated principles for the use of AI by government agencies. According to the 2020 EO, agency use of AI systems should be “safe, secure, and resilient,” “transparent,” “accountable,” and “responsible and traceable.”[ref 24]
The Biden administration continued along a similar course,[ref 25] focusing on the development of soft law mechanisms to mitigate AI risks, including voluntary technical standards, frameworks, and agreements. Echoing the 2019 Trump EO, the Biden NSM argues that standards for safety, security, and trustworthiness will speed up adoption “thanks to [the] increased certainty, confidence, and compatibility” they bring.
But 2020 was a lifetime ago in terms of AI policy. Ultimately, the real question is not whether the second Trump administration thinks that safety, security, and trustworthiness are relevant, but rather whether the NSM provisions relating to trustworthy AI are viewed as, at the margins, facilitating adoption or hindering innovation. While there is certainly overlap between the two administrations’ views, some conservatives have objected to the Biden administration’s AI policy outside of the national security context on the grounds that it focused on safety, security, and trustworthiness primarily for the sake of preventing various harms instead of as a means to encourage and facilitate AI adoption.[ref 26] Others have expressed skepticism regarding discussions of “trust and safety” on the grounds that large tech companies might use safety concerns to stymie competition, ultimately leading to reduced innovation and harm to consumers. In particular, the mandatory reporting requirements placed on AI companies by President Biden’s 2023 EO faced conservative opposition; the 2024 GOP platform asserts that the EO will “hinder[] AI innovation” and promises to overturn it.
Concretely, the NSM requires agencies in the national security enterprise to use its accompanying risk management framework as they implement AI systems; to conduct certain evaluations and testing of AI systems; to monitor, assess, and mitigate AI-related risks; to issue and regularly update agency-specific AI governance and risk management guidance; and to appoint Chief AI Officers and establish AI Governance Boards.[ref 27] The NSM intends these Officers and Boards to ensure accountability, oversight, and transparency in the implementation of the NSM’s framework.[ref 28] The NSM also designates NIST’s AI Safety Institute (AISI) to “serve as the primary United States government point of contact with private sector AI developers to facilitate voluntary pre- and post-public-deployment testing for safety, security, and trustworthiness,” conduct voluntary preliminary pre-deployment testing on at least two frontier AI models, create benchmarks for assessing AI system capabilities, and issue guidance on testing, evaluation, and risk management.[ref 29] Various other agencies with specific expertise are required to provide “classified sector-specific evaluations” of advanced AI models for cyber, nuclear, radiological, biological, and chemical risks.[ref 30]
Unlike the 2023 Biden EO, which invoked the Defense Production Act to impose its mandatory reporting requirements on private companies, the NSM’s provisions on safe, secure, and trustworthy AI impose no mandatory obligations on private companies. This, in addition to the NSM’s national security focus, might induce the Trump administration to leave most of these provisions in effect.[ref 31] However, not all members of the incoming administration may view such a focus on risk management, standards, testing, and evaluation as the best path to AI adoption across the national security enterprise. If arguments for a deregulatory approach toward AI adoption win the day, these NSM provisions and possibly the entire memorandum could face a full repeal. Regardless of the exact approach, the incoming administration seems likely to keep AI innovation as its north star, taking an affirmative approach focused on the benefits enabled by AI and treating safety, security, and trustworthiness as instrumental, pursued to the degree the administration judges necessary to enable U.S. AI leadership.
Responding to Foreign Threats, Particularly from China
President Trump seems likely to maintain or expand the NSM directives aimed at impeding the AI development efforts of China and other U.S. adversaries. Over the last decade, Washington has seen bipartisan consensus behind efforts to respond to economic gray zone tactics[ref 32] used by U.S. adversaries, particularly China, to compete with the United States. These tactics have included university research partnerships, cyber espionage, insider threats, and both foreign investments in U.S. companies and aggressive headhunting of those companies’ employees to facilitate technology transfer. The NSM builds upon efforts to combat these gray zone tactics by reassessing U.S. intelligence priorities with an eye toward focusing on the U.S. AI ecosystem, strengthening inbound investment screening, and directing the Intelligence Community (IC) to focus on risks to the AI supply chain. If President Trump does not elect to repeal the NSM in its entirety, it seems likely that he will build upon each of these NSM provisions, although potentially applying an approach that more explicitly targets China.[ref 33]
The NSM requires a review and recommendations for revision of the Biden administration’s intelligence priorities, incorporating into those priorities risks to the U.S. AI ecosystem and enabling sectors.[ref 34] The recommendations, which the White House will likely complete before the inauguration,[ref 35] will help inform the incoming administration’s intelligence priorities. Though this implementation will not make headlines, it could significantly strengthen the incoming administration’s enforcement efforts by enabling better strategies and targeting decisions for export controls, tariffs (including possible component tariffs), outbound investment restrictions, and other measures.
Additionally, the NSM strengthens inbound investment screening by requiring the Committee on Foreign Investment in the United States (CFIUS) to consider, as part of its screenings, whether a given transaction involves foreign actor access to proprietary information related to any part of the AI lifecycle.[ref 36] This provision is consistent with President Trump’s strengthening of CFIUS during his first term—both by championing the Foreign Investment Risk Review Modernization Act (FIRRMA), which expanded CFIUS’s jurisdiction and review process, and by increasing scrutiny on foreign acquisitions of U.S. tech companies, including semiconductor companies. By specifically requiring an analysis of risks related to access of proprietary AI information, this provision seems likely to increase scrutiny of AI-relevant foreign investments covered by CFIUS[ref 37] and to make it more likely that AI-related transactions will be blocked.
The NSM also requires the Intelligence Community to identify critical AI supply chain nodes, determine methods to disrupt or compromise those nodes, and act to mitigate related risks.[ref 38] This directive is consistent with the first Trump administration’s aggressive approach to the use of export controls—including through authorities from the Export Control Review Act (ECRA), which President Trump signed into law—and diplomacy to disrupt China’s ability to manufacture or acquire critical AI supply chain components. It also parallels Trump-era efforts to secure telecommunications supply chains. Increased IC scrutiny of AI supply chain nodes may provide intelligence allowing the United States and its allies and partners to better leverage their supply chain advantages, just as the Biden administration has attempted to do through multiple new export controls.
Based on these consistencies across the last two administrations and public statements from President Trump’s incoming U.S. Trade Representative Jamieson Greer, the new administration seems poised to double down on the NSM’s combined efforts to protect against Chinese and other adversarial threats to the U.S. AI ecosystem. Congress also appears amenable to further strengthening the President’s ECRA authorities to cover AI systems and the cloud compute providers that enable the training of AI models, in support of possible Trump administration efforts.[ref 39] However, President Biden’s recent export controls have met with opposition from major players in the semiconductor supply chain and conservative open-source advocates. President Trump could also potentially use such restrictions as bargaining chips, easing restrictions in order to secure concessions from foreign competitors in other policy areas.
Because the NSM’s provisions related to foreign threats do not significantly affect open-source models, they seem unlikely to provoke many objections from the incoming administration, except to the extent that they do not go far enough or avoid explicitly identifying China.[ref 40] This does not necessarily mean that President Trump will avoid repealing them, however, as it remains possible that the incoming administration will find it more convenient to repeal the entire document and replace provisions as necessary than to pick and choose its targets.
Infrastructure
President Trump seems likely to expand upon or at least continue the NSM’s provisions that focus on developing the energy infrastructure necessary to meet expected future AI power needs (without the Biden Administration’s focus on clean power), strengthening domestic chip production, and making AI resources accessible to diverse actors while increasing the government’s efficiency in using its own AI resources.
Bipartisan consensus exists around the need to build the infrastructure required to facilitate the development of next-generation AI systems. President Trump has already signaled that this issue will be one of his top priorities, and President Biden recently issued an Executive Order on Advancing United States Leadership in Artificial Intelligence Infrastructure.[ref 41] Although both parties recognize the importance of AI infrastructure, President Trump’s team has indicated that they intend to adopt a modified “energy dominance” version of this priority. Where the Biden administration sees the United States as being at risk of falling behind without additional clean power, the Trump administration views the nation as already behind the curve and needing to address its energy deficit with all types of power, including fossil fuels and nuclear energy. The incoming administration also sees power generation and the resulting lower energy prices as a potential asymmetric advantage for the United States in the “A.I. arms race.” Therefore, President Trump seems likely to significantly expand on the Biden administration’s efforts to provide power for the U.S. AI ecosystem. With respect to the NSM, this likely means that the provision requiring the White House Chief of Staff to coordinate the streamlining of permits, approvals, and incentives for the construction of AI-enabling infrastructure and supporting assets[ref 42] will survive, unless the NSM is repealed in its entirety.
Though the NSM does not focus on U.S. chip production infrastructure, National Security Advisor Sullivan pointed to the progress already made through the CHIPS and Science Act’s “generational investment in [U.S.] semiconductor manufacturing” in his speech announcing the memorandum. Under the incoming administration, however, the survival of that bipartisan effort is somewhat uncertain. While the legislation has received significant support from Republican members of Congress, President Trump has criticized the bill as being less efficient than tariffs, and he could delay or block the distribution of promised funds to chip companies. However, the concept of incentivizing foreign chip firms to build fabs in the United States was originally devised during the first Trump administration, and the first investment of Taiwan Semiconductor Manufacturing Company (TSMC) in Arizona came during Trump’s first term in office. Some commentators have argued that, at least for many segments of the chip industry, tariffs alone will not solve the United States’ chip problem. Given strong Republican support for CHIPS Act-funded U.S. factories and the national security case for such investments, it seems most likely that the incoming administration will continue to advance many of the bill’s infrastructure goals. Instead of attempting a broad reversal, President Trump might remove requirements from application guidelines that mandate that funding recipients provide child care, encourage union labor, and demonstrate environmental responsibility, including using renewable energy to operate their facilities.
The NSM also requires agencies to consider AI needs in their construction and renovation of federal compute facilities;[ref 43] begin a federated AI and data sources pilot project;[ref 44] and distribute compute, data, and other AI assets to actors who would otherwise lack access.[ref 45] As these assignments seem largely consistent with prior Trump efforts, they appear more likely than not to continue. The AI construction assessment requirement and the federated AI pilot appear to align with the incoming administration’s focus on efficiency. Additionally, President Trump signed into law the legislation that began the National AI Research Resource (NAIRR) during his first term and may continue to support its mission of democratizing access to AI assets, although potentially not at the levels requested by the Biden Administration’s Director of the Office of Science and Technology Policy.
Talent and Immigration
Whether the NSM provisions relating to high-skilled immigration survive under the new administration is uncertain, but non-immigration initiatives focused on AI talent seem likely to survive.
The NSM aims to better recruit and retain AI talent at national security agencies by revising federal hiring and retention policies to accelerate responsible AI adoption,[ref 46] identifying education and training opportunities to increase AI fluency across the national security workforce,[ref 47] establishing a National Security AI Executive Talent Committee,[ref 48] and conducting “an analysis of the AI talent market in the United States and overseas” to inform future AI talent policy choices.[ref 49] These initiatives seem consistent with actions taken in the previous Trump administration and, therefore, likely to survive in some form. For example, President Trump’s signature AI for the American Worker initiative focused on training and upskilling workers with AI-relevant skills. President Trump also signed into law the bill that established the National Security Commission on AI, which completed the most significant government analysis of the AI national security challenge and whose final report emphasized the importance of recruiting and retaining AI talent within the government’s national security enterprise.
The NSM also seeks to better compete for AI talent by directing relevant agencies both to “use all available legal authorities to assist in attracting and rapidly bringing to the United States” individuals who would increase U.S. competitiveness in “AI and related fields”[ref 50] and to convene agencies to “explore actions for prioritizing and streamlining administrative processing operations for all visa applicants working with sensitive technologies.”[ref 51] Specifically, this effort would likely involve continued work to expand and improve the H-1B visa process, as well as other potential skilled immigration pathways and policies like O-1A and J-1 visas, Optional Practical Training, the International Entrepreneur Rule, and the Schedule A list.
President Trump’s position on high-skilled immigration and specifically the H-1B program appears to have softened since his first term, but it is unclear to what degree and how that will affect his policy decisions. On the campaign trail this year, President Trump stated his support for providing foreign graduates of U.S. universities and even “junior colleges” with green cards to stay in the United States, although his campaign later walked back the statement and clarified the need for “the most aggressive vetting process in U.S. history” before permitting graduates to stay. President Trump’s Senior Policy Advisor for AI, Sriram Krishnan, is a strong supporter of H-1B visa expansion. Most significantly, after the Krishnan announcement sparked a fiery online debate between President Trump’s pro-H-1B advisors Elon Musk and Vivek Ramaswamy and prominent H-1B critics like Steve Bannon and Laura Loomer, President Trump reaffirmed his support for H-1B visas, saying, “we need smart people coming into our country.”
However, a significant portion of President Trump’s political base would prefer to shrink the H-1B program, as he did during his first term to protect American jobs. During that first term, President Trump repeatedly cut H-1B visas for skilled immigrants, including through his “Buy American, Hire American” Executive Order and interim H-1B program revision.[ref 52] His former Senior Advisor Stephen Miller and former acting U.S. Immigration and Customs Enforcement Director Tom Homan, who were major proponents of these H-1B cuts, will serve in the new administration as Homeland Security Advisor and “border czar,” respectively. These posts will likely allow them to exert significant influence on the President’s immigration decisions and, potentially, to prevail over Trump’s supporters in Silicon Valley and other proponents of high-skilled immigration like Jared Kushner and UN Ambassador nominee Elise Stefanik. Additionally, cracking down on immigration in order to “put American workers first” was a core element of the 2024 Republican Party platform.
One key early indicator of which direction President Trump leans will be whether he continues, or attempts to roll back, the Biden administration’s long-awaited revision of the H-1B program, which took effect on the last business day before his inauguration and aims to streamline the approvals process, increase flexibility, and strengthen oversight.
Conclusion
It is clear that AI will be a key part of the incoming administration’s national security policy. Throughout his campaign, President Trump prioritized developing domestic AI infrastructure, particularly energy production, and, since winning the election, he has appointed multiple high-level AI advisors.
However, while some of the incoming administration’s responses to the NSM seem locked in—notably, removing provisions relating to discrimination and bias, building on the NSM’s shift toward increasing U.S. power production to support AI energy needs, and continuing efforts to slow China’s development of advanced AI systems—there are also key areas where the administration’s responses remain uncertain. Regardless of how the Trump administration’s policies at the intersection of AI and national security shake out, its response to the NSM will serve as a useful early indicator of what direction those policies will take.
Commerce just proposed the most significant federal AI regulation to date – and no one noticed
A little more than a month ago, the Bureau of Industry and Security (“BIS”) proposed a rule that, if implemented, might just be the most significant U.S. AI regulation to date. To no one’s surprise, the proposed rule has received relatively scant media attention compared to more ambitious AI governance measures like California’s SB 1047 or the EU AI Act—regulation is rarely an especially sexy topic, and the proposed rule is a dry, common-sense, procedural measure that doesn’t require labs to do much of anything besides send an e-mail or two to a government agency once every few months. But the proposed rule would allow BIS to collect a lot of important information about the most advanced AI models, and it’s a familiar fact of modern life that complex systems like companies and governments and large language models thrive on a diet of information.
This being the case, anyone who’s interested in what the U.S. government’s approach to frontier AI regulation is likely to look like would probably be well-served by a bit of context about the rule and its significance. If that’s you, then read on.
What does the proposed rule do?
Essentially, the proposed rule would allow BIS to collect information on a regular basis about the most advanced AI models, which the proposed rule calls “dual-use foundation models.”[ref 1] The rule provides that any U.S. company that plans to conduct a sufficiently large AI model training run[ref 2] within the next six months must report that fact to BIS on a quarterly basis (i.e., once every three months, by specified reporting deadlines). Companies that plan to build or acquire sufficiently large computing clusters for AI training are similarly required to notify BIS.
Once a company has notified BIS of qualifying plans or activities, the proposed rule states that BIS will send the company a set of questions, which must be answered within 30 days. BIS can also send companies additional “clarification questions” after receiving the initial answers, and these clarification questions must be answered within 7 days.
The proposed rule includes a few broad categories of information that BIS will certainly collect. For instance, BIS is required under the rule to ask companies to report the results of any red-teaming safety exercises conducted and the physical and cybersecurity measures taken to protect model weights. Importantly, however, the proposed rule would not limit BIS to asking these questions—instead, it provides that BIS questions “may not be limited to” the listed topics. In other words, the proposed rule would provide BIS with extremely broad and flexible information-gathering capabilities.
Why does the proposed rule matter?
The notice of proposed rulemaking (NPRM) doesn’t come as a surprise—observers have been expecting something like it for a while, because President Biden ordered the Department of Commerce to implement reporting requirements for “dual-use foundation models” in § 4.2(a) of Executive Order 14110. Also, BIS previously sent out an initial one-off survey to selected AI companies in January 2024, collecting information similar to what would be collected on a more regular basis under the new rule.
But while the new proposed rule isn’t unexpected, it is significant. AI governance researchers have emphasized the importance of reporting requirements, writing of a “growing consensus among experts in AI safety and governance that reporting safety information to trusted actors in government and industry is key” for responding to “emerging risks presented by frontier AI systems.” And most of the more ambitious regulatory frameworks for frontier AI systems that have been proposed or theorized would require the government to collect and process safety-relevant information. Doing this effectively—figuring out what information needs to be collected and what the collected information means—will require institutional knowledge and experience, and collecting safety information under the proposed rule will allow BIS to cultivate that knowledge and experience internally. In short, the proposed rule is an important first step in the regulation of frontier models.
Labs already voluntarily share some safety information with the government, but these voluntary commitments have been criticized as “vague, sensible-sounding pledge[s] with lots of wiggle room,” and are not enforceable. In practice, voluntary commitments obligate companies only to share whatever information they want to share, whenever they want to share it. The proposed rule, on the other hand, would be legally enforceable, with potential civil and criminal penalties for noncompliance, and would allow BIS to choose what information to collect.
Pushback and controversy
Like other recent attempts to regulate frontier AI developers, the proposed rule has attracted some amount of controversy. However, the recently published public comments on the rule seem to indicate that the rule is unlikely to be challenged in court—and that, unless the next presidential administration decides to change course and scrap the proposed rule, reporting requirements for dual-use foundation models are here to stay.
The proposed rule and the Defense Production Act
As an executive-branch agency, BIS typically only has the legal authority to issue regulations if some law passed by Congress authorizes the kind of regulation contemplated. According to BIS, congressional authority for the proposed rule comes from § 705 of the Defense Production Act (“DPA”).
The DPA is a law that authorizes the President to take a broad range of actions in service of “the national defense.” The DPA was enacted during the Korean War and was initially used solely for purposes related to defense industry production. Since then, Congress has renewed the DPA a number of times and has significantly expanded the statute’s definition of “national defense” to include topics such as “critical infrastructure protection and restoration,” “homeland security,” “energy production,” and “space.”
Section 705 of the DPA authorizes the President to issue regulations and conduct industry surveys to “obtain such information… as may be necessary or appropriate, in his discretion, to the enforcement or administration of [the DPA].” While § 705 is very broadly worded, and on its face appears to give the President a great deal of discretionary authority to collect all kinds of information, it has historically been used primarily to authorize one-off “industrial base assessment” surveys of defense-relevant industries. These assessments have typically been time-bounded efforts to analyze the state of a specified industry that result in long “assessment” documents. Interestingly enough, BIS has actually conducted an assessment of the artificial intelligence industry once before—in 1994.[ref 3]
Unlike past industrial base assessments, the proposed rule would allow the federal government to collect information from industry actors on an ongoing basis, indefinitely. This means that the kind of information BIS requests and the purposes it uses that information for may change over time in response to advances in AI capabilities and in efforts to understand and evaluate AI systems. And unlike past assessment surveys, the rule’s purpose is not simply to aid in the preparation of a single snapshot assessment of the industry. Instead, BIS intends to use the information it collects to “ensure that the U.S. Government has the most accurate, up-to-date information when making policy decisions” about AI and the national defense.
Legal and policy objections to reporting requirements under Executive Order 14110
After Executive Order 14110 was issued in October 2023, one of the most common criticisms of the more-than-100-page order was that its reliance on the DPA to justify reporting requirements was unlawful. This criticism was repeated by a number of prominent Republican elected officials in the months following the executive order’s publication, and the prospect of a lawsuit challenging the legality of reporting requirements under the executive order was widely discussed. But while these criticisms were based on legitimate and understandable concerns about the separation of powers and the scope of executive-branch authority, they were not legally sound. Ultimately, any lawsuit challenging the proposed rule would likely need to be filed by the leading AI labs that are subject to the rule’s requirements, and none of those labs seem inclined to raise the kind of fundamental objections to the rule’s legality that early reactions to the executive order contemplated.
The basic idea behind the criticisms of the executive order was that it used the DPA in a novel way, to do something not obviously related to the industrial production of military materiel. To some skeptics of the Biden administration, or observers generally concerned about the concentration of political power in the executive branch, the executive order looked like an attempt to use emergency wartime powers in peacetime to increase the government’s control over private industry. The public comment[ref 4] on BIS’s proposed rule by the Americans for Prosperity Foundation (“AFP”), a libertarian advocacy group, is a representative articulation of this perspective. AFP argues that the DPA is an “emergency” statute that should not be used in non-emergencies for purposes not directly related to defense industry production.
This kind of concern about peacetime abuses of DPA authority is not new. President George H.W. Bush, after signing a bill reauthorizing the DPA in 1992, remarked that using § 705 during peacetime to collect industrial base data from American companies would “intrude inappropriately into the lives of Americans who own and work in the Nation’s businesses.” And former federal judge Jamie Baker, in an excellent paper from 2021 on the DPA’s potential as an AI governance tool, predicted that the use of § 705 to collect information from “private companies engaged in AI research” would meet with “challenge and controversy.”
Still, to quote from Judge Baker’s piece again, “Section 705 is clearly written and the authority it presents is strong.” Nothing in the DPA indicates that industrial base surveys under § 705 cannot be continuously ongoing, or that the DPA generally can only be used for encouraging increased defense industry production. It’s true that § 705, and related regulations, both focus on gathering information about the capacity of the U.S. industrial base to support “the national defense”—but recall that the DPA defines the term “national defense” very broadly, to include a wide variety of non-military considerations such as critical infrastructure protection. Moreover, the DPA generally has been used for purposes not directly related to defense industry production by Presidents from both parties for decades. For example, DPA authorities have been used to supply California with natural gas during the 2000-2001 energy crisis and to block corporate acquisitions that would have given Chinese companies ownership interests in U.S. semiconductor companies. In short, while critics of the proposed rule can reasonably argue that using the DPA in novel ways to collect information from private AI companies is bad policy, and politically undesirable, it’s much harder to make a reasonable argument against the legality of the proposed rule.
Also, government access to up-to-date information about frontier models may be more important to national security, and even to military preparedness specifically, than the rule’s critics anticipate. A significant portion of the Notice of Proposed Rulemaking in which BIS introduced the proposed rule is devoted to justifying the importance of the rule to “the national defense” and “the defense industrial base.” According to BIS, integrating dual-use foundation models into “military equipment, signal intelligence devices, and cybersecurity software” could soon become important to the national defense. Therefore, BIS claims, the government needs access to information from developers both to determine whether government action to stimulate further dual-use foundation model development is needed and “to ensure that dual-use foundation models operate in a safe and reliable manner.”
In any event, any lawsuit challenging the proposed rule would probably have to be brought by one of the labs subject to the reporting requirements.[ref 5] A few leading AI labs have submitted public comments on the rule, but none expressed any objection to the basic concept of an ongoing system of mandatory reporting requirements for dual-use foundation model developers. Anthropic’s comment requests only that the reporting requirements be semiannual rather than quarterly, that labs have more time to respond to questions, and that BIS tweak some of the definitions in the proposed rule and take steps to ensure that the sensitive information contained in labs’ responses is handled securely. OpenAI’s comment goes a bit further, asking (among other requests) that BIS limit itself to collecting only “standardized” information relevant to national security concerns and to using information collected “for the sole purpose to ensure [sic] and verify the continuous availability of safe, reliable, and effective AI.” But neither those labs nor any of their competitors has voiced any fundamental objection to the basic idea of mandatory reporting requirements that allow the government to collect safety information about dual-use foundation models. This is unsurprising given that these and other leading AI companies have already committed to voluntarily sharing similar information with the U.S. and other governments. In other words, while it’s too soon to be certain, it looks like the reporting requirements are unlikely to be challenged in court for the time being.
Conclusion
“Information,” according to LawAI affiliate Noam Kolt and his distinguished co-authors, “is the lifeblood of good governance.” The field of AI governance is still in its infancy, and at times it seems like there’s near-universal agreement on the need for the federal government to do something and near-universal disagreement about what exactly that something should be. Establishing a flexible system for gathering information about the most capable models, and building up the government’s capacity for collecting and processing that information in a secure and intelligent way, seems like a good first step. The regulated parties, who have voluntarily committed to sharing certain information with the government and have largely chosen not to object to the idea of ongoing information-gathering by BIS, seem to agree. In an ideal world, Congress would pass a law explicitly authorizing such a system; maybe someday it will. In the meantime, it seems likely that BIS will implement some amended version of its proposed rule in the near future, and that the result will, for better or worse, be the most significant federal AI regulation to date.
The limits of liability
I’m probably as optimistic as anyone about the role that liability can play in AI governance. Indeed, as I’ll argue in a forthcoming article, I think it should be the centerpiece of our AI governance regime. But it’s important to recognize its limits.
First and foremost, liability alone is not an effective tool for solving public good problems. This means it is poorly positioned to address at least some challenges presented by advanced AI. Liability is principally a tool for addressing risk externalities generated by training and deploying advanced AI systems. That is, AI developers and their customers largely capture the benefits of increasing AI capabilities, but most of the risk is borne by third parties who have no choice in the matter. This is the primary market failure associated with AI risk, but it’s not the only one. There is also a public good problem with AI alignment and safety research. Like most information goods, advances in alignment and safety research are non-rival (you and I can both use the same idea, without leaving less for the other) and non-excludable (once an idea is out, it’s hard to keep others from using it). Markets generally underprovide public goods, and AI safety research is no exception. Plausible policy interventions to address this problem include prizes and other forms of public subsidies. Private philanthropy can also continue to play an important role in supporting alignment and safety research. There may also be winner-take-all race dynamics that generate market distortions not fully captured by the risk externality and public goods problems.
Second, there are some plausible AI risk externalities that liability cannot realistically address, especially those involving structural harms or highly attenuated causal chains. For instance, if AI systems are used to spread misinformation or interfere with elections, this is unlikely to give rise to a liability claim. To the extent that AI raises novel issues in those domains, other policy ideas may be needed. Similarly, some ways of contributing to the risk of harm are too attenuated to trigger liability claims. For example, if the developer of a frontier or near-frontier model releases information about the model and its training data/process that enables lagging labs to move closer to the frontier, this could induce leading labs to move faster and exercise less caution. But it would not be appropriate or feasible to use liability tools to hold the first lab responsible for the downstream harms from this race dynamic.
Liability also has trouble handling uninsurable risks—those that might cause harms so large that a compensatory damages award would not be practically enforceable—if warning shots are unlikely. In my recent paper laying out a tort liability framework for mitigating catastrophic AI risk, I argue that uninsurable risks more broadly can be addressed using liability by applying punitive damages in “near miss” cases of practically compensable harm that are associated with the uninsurable risk. But if some uninsurable risks are unlikely to produce warning shots, then this indirect liability mechanism would not work to mitigate them. And if the uninsurable risk is realized, the harm would be too large to make a compensatory damages judgment practically enforceable. That means AI developers and deployers would have inadequate incentives to mitigate those risks.
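As a stylized illustration of that mechanism (my own simplification, not a formula drawn from the paper): suppose an activity creates an uninsurable harm of magnitude H with probability p, and a related, practically compensable “near miss” harm h with probability q. Calibrating punitive damages so that expected liability roughly matches the expected uninsurable harm requires
\[
q \cdot (h + D_{\text{punitive}}) \approx p \cdot H
\quad\Longrightarrow\quad
D_{\text{punitive}} \approx \frac{p \cdot H}{q} - h.
\]
As q approaches zero, that is, as warning shots become unlikely, the required punitive award grows without bound, which is exactly why the indirect mechanism fails in that setting.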
Like most forms of domestic AI regulation, unilateral imposition of a strong liability framework is also subject to regulatory arbitrage. If the liability framework is sufficiently binding, AI development may shift to jurisdictions that don’t impose strong liability policies or comparably onerous regulations. While foreign AI developers would still be subject to liability if they harm people in countries with strong liability regimes, it may prove difficult to enforce those judgments if the developer lacks substantial assets in the country where the injuries occur. One potential solution to this problem is international treaties under which countries agree to enforce liability judgments reached by one another’s courts.
Finally, liability is a weak tool for influencing the conduct of governmental actors. By default, many governments will be shielded from liability, and many legislative proposals will continue to exempt government entities. Even if governments waive sovereign immunity for AI harms they are responsible for, the prospect of liability is unlikely to sway the decisions of government officials, who are more responsive to political than economic incentives. This means liability is a weak tool in scenarios where the major AI labs get nationalized as the technology gets more powerful. But even if AI research and development remains largely in the private sector, the use of AI by government officials will be poorly constrained by liability. Ideas like law-following AI are likely to be needed to constrain governmental AI deployment.
International law and advanced AI: exploring the levers for ‘hard’ control
The question of how artificial intelligence (AI) is to be governed has risen rapidly up the global agenda – and in July 2023, United Nations Secretary-General António Guterres raised the possibility of the “creation of a new global body to mitigate the peace and security risks of AI.” While the past year has seen the emergence of multiple initiatives for AI’s international governance – by states, international organizations and within the UN system – most of these remain in the realm of non-binding ‘soft law.’ However, many influential voices in the debate increasingly argue that the challenges posed by future AI systems mean that international AI governance will eventually need to include legally binding elements.
If and when states choose to take up this challenge and institute binding international rules on advanced AI – either under a comprehensive global agreement, or between a small group of allied states – there are three principal areas where such controls might usefully bite. First, states might agree to controls on particular end uses of AI that are considered most risky or harmful, drawing on the European Union’s new AI Act as a general model. Second, controls might be introduced on the technology itself, structured around the development of certain types of AI systems, irrespective of use – taking inspiration from arms control regimes and other international attempts to control or set rules around certain forms of scientific research. Third, states might seek to control the production and dissemination of the industrial inputs that power AI systems – principally the computing power that drives AI development – harmonizing export controls and other tools of economic statecraft.
Ahead of the upcoming United Nations Summit of the Future and the French-hosted international AI summit in 2025, this post explores these three possible control points and the relative benefits of each in addressing the challenges posed by advanced AI. It also considers the structural questions that any binding regime would need to address – including its breadth in terms of state participation, how participation might be incentivized, the role that private sector AI labs might play, and the means by which equitable distribution of AI’s benefits could be enabled. This post is informed by ongoing research projects into the future of AI international governance undertaken by the Institute for Law & AI, Lawfare’s Legal AI Safety Initiative, and others.
Hard law approaches to AI governance
The capabilities of AI systems have advanced rapidly over the past decade. While these systems present significant opportunities for societal benefit, they also engender new risks and challenges. Possible risks from the next wave of general-purpose foundation models, deemed “frontier” or “advanced AI,” include increases in inequality, misuse by harmful actors, and dangerous malfunctions. Moreover, AI agents that are able to make and execute long-term plans may soon proliferate, and would pose particular challenges.
As a result of these developments, states are beginning to take concrete steps to regulate AI at the domestic level. This includes the United States’ Executive Order on the Safe, Secure, and Trustworthy Development and Use of AI, the European Union’s AI Act, the UK’s AI White Paper and subsequent public consultation, and Chinese laws covering both the development and use of various AI systems. At the same time, given the rapid pace of change and cross-border nature of AI development and potential harms, it is increasingly recognized that domestic regulation alone will likely not be adequate to address the full spread of challenges that advanced AI systems pose.
As a result, recent years have also witnessed the emergence of a growing number of initiatives for international coordination of AI policy. In the twenty months since the launch of OpenAI’s ChatGPT propelled AI to the top of the policy agenda, we have seen two international summits on AI safety; the Council of Europe conclude its Framework Convention on Artificial Intelligence and Human Rights, Democracy and the Rule of Law; the G7 launch its Hiroshima Process on responsible AI governance; and the UN launch an Advisory Body on international AI governance.
These ongoing initiatives are unlikely to represent the limits of states’ ambitions for AI coordination on the international plane. Indeed, should the pace of AI capability development continue as it has over the last decade, it seems likely that in the coming years states may choose to pursue some form of binding ‘hard law’ international governance for AI – moving beyond the mostly soft law commitments that have characterized today’s diplomatic efforts. Geopolitical developments, a rapid jump in AI capabilities, or a significant AI security incident or crisis might also lead states to support a hard law approach. Over the course of 2023, several influential participants in the debate, most notably AI lab OpenAI, began to raise the possibility that binding international governance may be necessary once AI systems reach a certain capability level. A number of political and moral authorities have gone beyond this and called for the immediate institution of binding international controls on AI – including the influential group of former politicians The Elders, who have called for an “international treaty establishing a new international AI safety agency,” and Pope Francis, who has urged the global community to adopt a “binding international treaty that regulates the development and use of artificial intelligence in its many forms.”
To date, these calls for binding international governance have only been made at a high level of abstraction, without detailed proposals for how a binding international AI governance regime might be structured or what activities should be controlled. Moreover, the advanced state of the different soft law approaches currently in progress means that the design and legal form of any hard law regime that is eventually instituted would be heavily conditioned by other AI governance initiatives or institutions that precede it. Nevertheless, given the significant possibility of states beginning discussion of binding AI governance in the coming years, there is value in surveying the areas where controls could be implemented, assessing the contribution these controls might make in addressing the challenges of AI, and identifying the relevant institutional antecedents.
Three control points
There are three main areas where binding international controls on AI might bite: on particular ‘downstream’ uses of AI, on the upstream ‘development’ of AI systems, and on the industrial inputs that underpin the development of AI systems.
Downstream uses of AI
If the primary motivation behind states introducing international controls is a desire to mitigate the perceived risks from advanced AI, then the most natural approach would be to structure those controls around the particular AI uses that are considered to pose the greatest level of risk. The most prominent domestic AI regulation – the European Union’s AI Act – follows this approach, introducing different tiers of control for uses of AI systems based around the perceived risk of those use cases. Those that are deemed most harmful – for example the use of AI for social-scoring or in biometric systems put in place to predict criminality – are prohibited outright.
This form of control could be replicated at an international level. Existing international law imposes significant constraints on certain uses of AI – such as the protections provided by international human rights law and international humanitarian law. However, explicitly identifying and controlling particular harmful AI uses would add an additional layer of granularity to these constraints. Should states wish to do so, arms control agreements offer one model for how this could be done.
The principal benefit of a use-based approach to international control of AI is its simplicity: where particular AI uses are most harmful, they can be controlled or prohibited. States should in theory also be able to update any new treaty regime, adding additional harmful uses of AI to a controlled list should they wish to do so – and if they are able to agree on these. Nevertheless, structuring international controls solely around identified harmful uses of AI also has certain limitations. Most importantly, while such a use-based governance regime would have a significant impact in addressing the risks posed by the deliberate misuse of AI, its impact in reducing other forms of AI risk is less clear.
As reported by the 2024 International Scientific Report on the Safety of Advanced AI, advanced AI systems may also pose risks stemming from the potential malfunction of those systems – regardless of their particular application or form of use. The “hallucinations” generated by the most advanced chatbots, in spite of their developers’ best intentions, are an early example of this. At the extreme, certain researchers have posited that developers might lose the ability to control the most advanced systems. The malfunction or loss of control of more advanced systems could have severe implications as these systems are increasingly incorporated into critical infrastructure, such as energy, financial, or cybersecurity networks. For example, a malfunction of an AI system incorporated into military systems, such as nuclear command, control and communication infrastructure, might lead to catastrophic consequences. Use-based governance may be able to address this issue in part, by regulating the extent to which AI technology is permitted to be integrated into critical infrastructure at all – but such a form of control would not address the possibility of unexpected malfunction or loss of control of an AI system used in a permitted application.
Upstream development of AI
Given the possibility of dangerous malfunctions in advanced AI systems, a complementary approach would be to focus on the technology itself. Such an approach would entail structuring an international regime around controls on the upstream development of AI systems, rather than particularly harmful applications or uses.
International controls on upstream AI development could be structured in a number of ways. Controls could focus on security measures, such as mandatory information security or other protective requirements to ensure that key components of advanced AI systems, such as model weights, cannot leak or be stolen by harmful actors or geopolitical rivals. The regime might also require the testing of AI systems against agreed safety metrics prior to release, with systems that fail prohibited from release until they can be demonstrated to be safe. Alternatively, international rules might focus on whether each participating jurisdiction complies with agreed safety and oversight standards, rather than on the safety of individual AI systems or training runs.
Controls could focus on increasing transparency or other confidence-building measures. States could introduce a mandatory warning system should AI models reach certain capability thresholds, or should there be an AI security incident. A regime might also include a requirement to notify other state parties – or the treaty body, if one were created – before beginning training of an advanced AI system, allowing states to convene and discuss precautionary measures or mitigations. Alternatively, the regime could require that other state parties or the treaty body give approval before advanced systems are trained.
If robustly enforced, structuring controls around AI development would contribute significantly towards addressing the security risks posed by advanced AI systems. However, this approach to international governance also has its challenges. In particular, given that smaller AI systems are unlikely to pose significant risks, participants in any regime would likely also need to agree on thresholds for the introduction of controls – with these only applying to AI systems of a certain size or anticipated capability level. Provision may be needed to periodically update this threshold, in line with technological advances. In addition, given the benefits that advanced AI is expected to bring, an international regime controlling AI development would also need to include provision for the continued safe development of advanced AI systems above any capability threshold.
Industrial inputs: AI compute
Finally, a third approach to international governance would be for states to move another step back and focus on the AI supply chain. Supply-side controls of basic inputs have been successful in the past in addressing the challenges posed by advanced technology. An equivalent approach would involve structuring international controls around the industrial inputs necessary for the development of advanced AI systems, with a view to shaping the development of those systems.
The three principal inputs used to train AI systems are computing power, data and algorithms. Of these, computing power (“compute”) is the most viable node for control by states, and hence the focus of this section. This is because AI models are trained on physical semiconductor chips, which are by their nature quantifiable (they can be counted), detectable (they can be identified and physically tracked), and excludable (they can be restricted). The supply chain for AI chips is also exceptionally concentrated. These properties mean that controlling the distribution of AI compute would likely be technologically feasible – should states be able to agree on how to do so.
International agreements on the flow and usage of AI chips could assist in reducing the risks from advanced AI in a number of different ways. Binding rules around the flow of AI chips could be used to augment or enforce a wider international regime covering AI uses or development – for example by denying these chips to states who violate the regime or to non-participating states. Alternatively, international controls around AI industrial inputs might be used to directly shape the trajectory of AI development, through directing the flow of chips towards certain actors, potentially mitigating the need to control downstream uses or upstream development of AI systems at all. Future technological advances may also make it possible to monitor the use of individual semiconductor chips – which would be useful in verifying compliance with any binding international rules around the development of AI systems.
Export control law can provide the conceptual basis for international control of AI’s industrial inputs. The United States has already introduced a sweeping set of domestic controls on the export of semiconductors, with a view to restricting China’s ability to acquire the chips needed to develop advanced AI and to maintaining the U.S. technological advantage in this space. These U.S. controls could be used as the basis for an expanded international semiconductor export control regime between the U.S. and its allies. Existing or historic multilateral export control regimes could also serve as a model for a future international agreement on AI compute exports. These include the Cold War-era Coordinating Committee for Multilateral Export Controls (COCOM), under which Western states coordinated an arms embargo on Eastern Bloc countries, and its successor, the Wassenaar Arrangement, through which Western states harmonize controls on exports of conventional arms and dual-use items.
In order to be effective, controls on the export of physical AI chips would likely need to be augmented by restrictions on the proliferation both of AI systems themselves and of the technology necessary for the development of semiconductor manufacturing capability outside of participating states. Precedent for such a provision can be found in a number of international arms control agreements. For example, Article 1 of the Nuclear Non-Proliferation Treaty prohibits designated nuclear weapon states from transferring nuclear weapons or control over such weapons to any recipient, and from assisting, encouraging or inducing non-nuclear weapon states to manufacture or acquire the technology to do so. A similar provision controlling the exports of semiconductor design and manufacturing technology – perhaps again based on existing U.S. export controls – could be included in an international AI regime.
Structural challenges
A binding regime for governing advanced AI that incorporates any of the above controls would face a number of structural challenges, whichever states agree to it.
Private sector actors
The first of these stems from the nature of the current wave of AI development. Unlike many of the twentieth century’s most significant AI advances, which were developed by governments or academia, the most powerful AI models today are almost exclusively designed in corporate labs, trained using private sector-produced chips, and run on commercial cloud data centers. While certain AI companies have experimented with corporate structures such as long-term benefit trusts or capped-profit provisions, commercial concerns are the major driver behind most of today’s AI advances – a situation that is likely to continue in the near future, absent significant government investment in AI capabilities.
As a result, a binding international regime aiming to control AI use or development would require a means of legally ensuring the compliance of private sector AI labs. This could be achieved through the imposition of obligations on participating state parties to implement the regime through domestic law. Alternatively, the treaty instituting the regime could impose direct obligations on corporations – a less common approach in international law. However, even in that situation the primary responsibility for enforcing the regime and remedying breaches would likely still fall on states.
Breadth of state participation
A further issue relates to the breadth of state participation in any binding international regime: should this be targeted or comprehensive? At present, the frontier of the AI industry is concentrated in a small number of countries. A minilateral agreement concluded between a limited group of states (such as between the U.S. and its allies) would almost certainly be easier to reach consensus on than a comprehensive global agreement. Given the pace of AI development, and concerns regarding the capabilities of the forthcoming generation of advanced models, there is significant reason to favor the establishment of a minimally viable international agreement, concluded as quickly as possible.
Nevertheless, a major drawback of a minilateral agreement concluded between a small group of states – in contrast to a comprehensive global agreement – would be the issue of legitimacy. Although AI development is currently concentrated in a small number of states, any harms that result from the misuse or malfunction of AI systems are unlikely to remain confined within the borders of those states. In addition, citizens of the Global South may be least likely to realize the economic benefits that result from AI technological advances. As such, there is a strong normative argument for giving a voice to a broad group of states in the design of any international regime intended to govern AI’s development – not simply those that are currently most advanced in terms of AI capabilities. In the absence of this, any regime would likely suffer from a critical lack of global legitimacy, potentially threatening both its longevity and the likelihood of other states later agreeing to join.
A minilateral agreement aiming to institute binding international rules to govern AI would therefore need to include a number of provisions to address these legitimacy issues. First, while it may prove more practicable to initially establish governance amongst a small group of states, it would greatly aid legitimacy if participants were to explicitly commit to working towards the establishment of a global regime and to open the regime, at least in principle, for all states to join, provided they agreed to the controls and any enforcement mechanisms. Precedent for such a provision can be found in other international agreements – for example the 1990 Chemical Weapons Accord between the U.S. and the USSR, which included a pledge to work towards a global prohibition on chemical weapons and eventually led to the establishment of the 1993 Chemical Weapons Convention, which is open to all states to join.
Incentives and distribution
This brings us to incentives. In order to encourage broad participation in the regime, states with less developed artificial intelligence sectors may need to be offered inducements to join – particularly given that doing so might curtail their freedom to develop their own domestic AI capabilities. One way to do so would be to include a commitment from leading AI states to distribute the benefits of AI advances to less developed states, conditional on those participants committing to not violating the restrictive provisions of the agreement – a so-called ‘dual mandate.’
Inspiration for such an approach could be drawn from the Nuclear Non-Proliferation Treaty, under which non-nuclear weapon participants agree to forgo the right to develop nuclear weapons in exchange for the sharing of “equipment, materials and scientific and technological information for the peaceful uses of nuclear energy.” An equivalent provision under an AI governance regime might for example grant participating states the right to access the most advanced systems, for public sector or economic development purposes, and promise assistance in incorporating these systems into beneficial use cases.
The international governance of AI remains a nascent project. Whether binding international controls of any form come to be implemented in the near future will depend upon a range of variables and political conditions, including the direction of AI technological developments and the evolution of relations between leading AI states. As such, the feasibility of a binding international governance regime for AI remains to be seen. In light of 2024’s geopolitical tensions, and the traditional reluctance of the U.S. and China to agree to international law restrictions that infringe on sovereignty or national security, binding international AI governance appears unlikely to be established immediately.
However, this position could rapidly change. Technological or geopolitical developments – such as a rapid and unexpected jump in AI capabilities, a shift in global politics, or an AI-related security incident or crisis with global impact – could act as forcing mechanisms that lead states to support the introduction of international controls. In such a scenario, states will likely wish to implement these controls quickly, and will require guidance on both the form they should take and how they might be enacted.
Historical analogy suggests that international negotiations of a magnitude equivalent to the challenges AI will pose typically take many years to conclude. It took over ten years from the initial UN discussions around international supervision of nuclear material for the statute of the International Atomic Energy Agency to be negotiated. In the case of AI, states will likely not have this long. Given the stakes at hand, lawyers and policymakers should therefore begin considering, as a matter of urgency, both the form that future international AI governance should take and how it might be implemented.